In this article, we review some common metrics and their uses for two main ML problems, i.e. regression and classification. Regression Metrics Most of the blogs have focussed on classification metrics like precision, recall, AUC etc. For a change, I wanted to explore all kinds of metrics including those used in regression as well. MAE and RMSE are the two most popular metrics for continuous...

## Latent Dirichlet Allocation (LDA) and Topic ModelLing in Python

Topic modelling is a type of statistical modelling for discovering the abstract “topics” that occur in a collection of documents. Latent Dirichlet Allocation (LDA) is an example of a topic model and is used to classify text in a document to a particular topic. It builds a topic per document model and words per topic model, modelled as Dirichlet distributions. Here, we are...

## K-Means vs K-Nearest neighbours quick note

These are completely different methods in machine learning. The fact that they both have the letter K in their name is a coincidence. K-means is a clustering algorithm that tries to partition a set of points into K sets (clusters) such that the points in each cluster tend to be near each other. It is unsupervised because the points have no external classification. The typical k-means...

## Lasso vs Ridge vs Elastic Net – Machine learning

Lasso, Ridge, and Elastic Net are excellent methods to improve the performance of your linear model. This post will summarise the usage of these regularization techniques. Bias: Biases are the underlying assumptions that are made by data to simplify the target function. Bias does help us generalize the data better and make the model less sensitive to single data points. It also decreases the...

## Understanding Latent Dirichlet Allocation (LDA)

Imagine a large law firm takes over a smaller law firm and tries to identify the documents corresponding to different types of cases such as civil or criminal cases which the smaller firm has dealt or is currently dealing with. The presumption is that the documents are not already classified by the smaller law firm. An intuitive way of identifying the documents in such situations is to look for...