A.I, Data and Software Engineering

Tag: NLP

Latent Dirichlet Allocation (LDA) and Topic Modelling in Python

petamind

Topic modelling is a type of statistical modelling for discovering the abstract “topics” that occur in a collection of documents. Latent Dirichlet Allocation (LDA) is an example of a topic model and is used to assign the text in a document to a particular topic. It builds a topic-per-document model and a words-per-topic model, both modelled as Dirichlet distributions. Here, we are...

Understanding Latent Dirichlet Allocation (LDA)

petamind

Imagine a large law firm takes over a smaller law firm and tries to identify the documents corresponding to different types of cases, such as civil or criminal cases, which the smaller firm has dealt with or is currently dealing with. The presumption is that the documents are not already classified by the smaller law firm. An intuitive way of identifying the documents in such situations is to look for...

Word2vec with gensim – a simple word embedding example

In this short article, we show a simple example of how to use Gensim and word2vec for word embedding. Word2vec is a well-known natural language processing (NLP) technique created by a team led by Tomas Mikolov. It is a group of related models, namely CBOW and skip-gram, that are used to produce word embeddings. The models are considered shallow. They consist of two-layer neural...

PetaMinds focuses on the coolest topics in data science, A.I, and programming, and makes them digestible so that everyone can learn them and create amazing applications in a short time.
