A.I, Data and Software Engineering

Category: data science

Continue training big models on less powerful devices

6 Min read

It is no surprise if you do not have a powerful, expensive machine for training a complicated model. You may run out of memory at some epoch during training. This article demonstrates a simple workaround. The problem: Training deep learning models requires a lot of computing power. On most laptops and desktops today, you can still train the models but it can...


Create bipartite graph from a rating matrix

In data science

5 Min read

As deep learning on graphs has been trending recently, this article quickly demonstrates how to use networkx to turn rating matrices, such as the MovieLens dataset, into graph data. The rating data: We use rating data from MovieLens. The rating data is loaded into rdata, which is a pandas DataFrame. This article demonstrates how to preprocess MovieLens data. After processing, the rdata should look...


MLP for implicit binary collaborative filtering

In data science, Project, Research

9 Min read

In this post, we demonstrate a Keras implementation of implicit collaborative filtering. We also introduce some techniques to improve the performance of the current model, including weight initialization, a dynamic learning rate, and an early stopping callback. The implicit data: For demonstration purposes, we use the dataset generated from negative samples using the technique mentioned in this post...


Create and distribute your Python package

In data science

13 Min read

This is a quick guide to creating and generating a distribution package for your Python project so that others can install, import, and use it in their projects. Prerequisites: You will need the following installed on your computer: Python (2.x/3.x), pip, and an API key for uploading your package to a distribution platform, such as test.pypi.org. After installing Python, you can install pip by using: curl -o get-pip...


Fast uniform negative sampling for rating matrix

In data science, Research

8 Min read

Sometimes, we want to reduce the training time by using a subset of a very large dataset in which the negative samples outnumber the positive ones, e.g. word embedding. Another situation arises when we deal with implicit data. In this case, we may need to generate new data for the negative values. This post demonstrates how to generate training data using uniform negative sampling. The data: Originally...


