Topic modelling is a type of statistical modelling for discovering the abstract “topics” that occur in a collection of documents. Latent Dirichlet Allocation (LDA) is an example of a topic model and is used to classify text in a document to a particular topic. It builds a topic per document model and words per topic model, modelled as Dirichlet distributions. Here, we are...
Understanding Latent Dirichlet Allocation (LDA)
Imagine a large law firm takes over a smaller law firm and tries to identify the documents corresponding to different types of cases such as civil or criminal cases which the smaller firm has dealt or is currently dealing with. The presumption is that the documents are not already classified by the smaller law firm. An intuitive way of identifying the documents in such situations is to look for...