A.I, Data and Software Engineering

Search results for one hot

One-hot encoding matrices demonstration

This post demonstrates one-hot encoding for a rating matrix, such as the MovieLens dataset. One-hot encoding: Previously, we introduced a quick note on one-hot encoding. It is a representation of categorical variables as binary vectors: a group of bits in which the only legal combinations of values are those with a single high (1) bit and all the others low (0). Rating matrix: If you are...

One-hot encoding quick note

petamind

Quickly grasp the concept of one-hot encoding through simple data and code. Categorical vs. numerical data: Categorical data are variables that contain label values. For example, a “colour” variable can have the values “red”, “green” and “blue”. Here, “red”, “green” and “blue” are labels represented by strings. Numerical data are...
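To make the colour example concrete, here is a minimal sketch in plain Python. It is not the code from the post itself: the helper name `one_hot` and the choice to order categories alphabetically are assumptions made for illustration only.

```python
def one_hot(labels):
    """Encode a list of string labels as one-hot vectors.

    Hypothetical helper for illustration; categories are ordered
    alphabetically, which is an arbitrary but reproducible choice.
    """
    categories = sorted(set(labels))
    # Map each category to the position of its single "high" bit.
    index = {category: i for i, category in enumerate(categories)}

    vectors = []
    for label in labels:
        row = [0] * len(categories)  # all bits low
        row[index[label]] = 1        # exactly one bit high
        vectors.append(row)
    return categories, vectors


colours = ["red", "green", "blue", "green"]
categories, encoded = one_hot(colours)

for colour, vector in zip(colours, encoded):
    print(colour, vector)
```

Each row contains exactly one 1, in the column that corresponds to its label, so the strings become binary vectors without implying any ordering between the colours.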

Dealing with missing data

petamind

In real-world data, a particular element is sometimes absent for reasons such as corrupt data, failure to load the information, or incomplete extraction. Handling missing values is one of the greatest challenges analysts face, because the choice of how to handle them determines how robust the resulting data models are. Let us look at different ways of imputing...

Feature Engineering Fundamentals

petamind

“The features you use influence more than everything else the result. No algorithm alone, to my knowledge, can supplement the information gain given by correct feature engineering.” (Luca Massaron) What is a feature, and why do we need to engineer it? Basically, all machine learning algorithms use some input data to create outputs. This input data comprises features, which are usually in the form...

Word2vec with TensorFlow 2.0 – a simple CBOW implementation

petamind

On the TensorFlow website, there is a good example of a word embedding implementation with Keras. Nevertheless, we are curious to see what it looks like to implement word2vec with pure TensorFlow 2.0. What is CBOW? In the previous article, we introduced Word2vec (w2v) with the Gensim library. Word2vec consists of two-layer neural networks that are trained to reconstruct linguistic contexts of words. The...

PetaMinds focuses on developing the coolest topics in data science, A.I, and programming, and on making them digestible so that everyone can learn and create amazing applications in a short time.
