
Deep learning: Linear Autoencoder with Keras


This post introduces the linear autoencoder and shows how to use it for dimensionality reduction with TensorFlow and Keras.

What is a linear autoencoder?

An autoencoder is a type of artificial neural network used to learn efficient data codings in an unsupervised manner. The aim of an autoencoder is to learn a representation (encoding) for a set of data, typically for dimensionality reduction, by training the network to ignore signal “noise”.

Figure 1: Schema of a basic autoencoder

Autoencoders consist of two main parts: an encoder and a decoder (figure 1). They work by encoding the input data, applying an activation function, and then decoding the result to produce the output. A bottleneck (the h layer(s)) is imposed on the input features, compressing them into fewer dimensions. Thus, if some inherent structure exists within the data, the autoencoder will identify and exploit it to reconstruct the output.

A linear autoencoder uses only linear activations (or, equivalently, no activation function) in its layers, so each layer computes an affine transformation of its input.

Other variations

  • Denoising AutoEncoders: a regularization technique in which the network is fed a corrupted version of the input, with some of the input values randomly set to 0, and trained to reconstruct the clean input (see the sketch after this list).
  • Sparse AutoEncoders: the hidden layer is larger than the input layer, but a regularization term added to the loss function prevents the autoencoder from using all of its nodes at once, which reduces overfitting.
  • Contractive AutoEncoders: adds a penalty to the loss function to discourage overfitting and simple copying of values when the hidden layer is larger than the input layer.
  • Stacked AutoEncoders: adding further hidden layers gives a stacked autoencoder, e.g. one with 2 stages of encoding and 1 stage of decoding.
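
As a concrete illustration of the denoising idea, the sketch below corrupts an input matrix by randomly zeroing some of its values. This is a minimal, assumed implementation, not code from the original post (the names corrupt, x and drop_prob are hypothetical).

import numpy as np

def corrupt(x, drop_prob=0.2, seed=None):
    # Randomly zero out a fraction of the input values (masking noise).
    rng = np.random.default_rng(seed)
    mask = rng.random(x.shape) >= drop_prob
    return x * mask

# A denoising autoencoder is trained with corrupt(x) as the input
# and the original, clean x as the reconstruction target.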

When to use

  • Dimensionality reduction/Feature detection
  • Building powerful recommendation systems
  • Encoding features in massive datasets

Note that, under certain conditions (a linear autoencoder trained with mean-squared error), the optimal weights span the same subspace as the leading principal components found by PCA. We will cover PCA in another post. For an encoder on graph data, follow this link.
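
For reference, this is roughly how the corresponding PCA projection would look with scikit-learn. A minimal sketch, assuming a data matrix X of shape (n_samples, n_features); X is a placeholder, not a variable defined in this post.

from sklearn.decomposition import PCA

pca = PCA(n_components=2)      # keep the top 2 principal components
X_pca = pca.fit_transform(X)   # 2-D projection, comparable to the autoencoder's code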

Example with Keras and TF 2.x

We first generate two-class data with 3 dimensions. Then we will use a linear autoencoder to encode (compress) the input data into 2-dimensional data.

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
%tensorflow_version 2.x
import tensorflow as tf
from tensorflow.keras.layers import InputLayer, Dense
from tensorflow.keras.models import Sequential

The data is generated with sklearn.datasets.make_blobs.

from sklearn.datasets import make_blobs
data = make_blobs(n_samples=100, n_features=3, centers=2, random_state=101)  # returns (features, labels)

We now scale the data with MinMaxScaler. For each value in a feature, MinMaxScaler subtracts the minimum value of that feature and then divides by the range, i.e. the difference between the feature's maximum and minimum.

from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
scaled_data = scaler.fit_transform(data[0])  # only the features (not the labels)
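
As a quick sanity check (not in the original post), the scaler's output matches the per-feature formula (x - min) / (max - min):

X = data[0]
manual = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
print(np.allclose(manual, scaled_data))  # True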

Now, we plot the data:

from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
data_x = scaled_data[:,0]#col 1
data_y = scaled_data[:,1]#col 2
data_z = scaled_data[:,2]#col 3
ax.scatter(data_x, data_y, data_z, c=data[1])
The generated data is linearly separable.

Build an autoencoder model

num_inputs = 3 #input dimensions
num_hidden = 2 #output dimensions in hidden layer h
num_outputs = num_inputs #output and input have the same dim

Next, we build the model from the defined parameters. As this is a linear autoencoder, we don't use any activation function (Keras Dense layers default to a linear activation).

inputs = InputLayer(input_shape=(num_inputs,))
hidden = Dense(num_hidden)   # encoder: 3 -> 2
outputs = Dense(num_outputs) # decoder: 2 -> 3
model = Sequential([inputs, hidden, outputs], name='autoencoder')
learning_rate = 0.001  # not specified in the original post; Adam's default value
optimizer = tf.keras.optimizers.Adam(learning_rate)
model.compile(optimizer=optimizer, loss='mean_squared_error', metrics=['mse'])
model.summary()
Model: "encoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_15 (Dense)             (None, 2)                 8
_________________________________________________________________
dense_16 (Dense)             (None, 3)                 9
=================================================================
Total params: 17
Trainable params: 17
Non-trainable params: 0
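
As a sanity check on the summary: the hidden layer maps 3 inputs to 2 units, giving 3 × 2 weights + 2 biases = 8 parameters, and the output layer maps 2 units back to 3, giving 2 × 3 weights + 3 biases = 9 parameters, so 17 trainable parameters in total.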

Train the model

history = model.fit(x=scaled_data, y=scaled_data, epochs=1000, verbose=False)
plt.plot(history.history['loss'], 'r-', label='loss')
plt.plot(history.history['mse'], 'bo', label='mse')
plt.legend()
Training loss and MSE of the autoencoder over 1000 epochs.
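
To put a number on the final reconstruction quality, we can evaluate the trained model on the same data; a small optional check, not in the original post:

loss, mse = model.evaluate(scaled_data, scaled_data, verbose=0)
print(f'reconstruction MSE: {mse:.4f}')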

Get the encoded data

After training, we can pass the data through the model and take the output of the hidden layer, i.e. the encoded representation.

encoded = hidden(scaled_data)
encoded.shape
#TensorShape([100, 2])
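
An equivalent way to obtain the encoding, sketched here as an assumption rather than code from the original post, is to wrap the trained layers in a standalone Keras Model:

encoder = tf.keras.Model(inputs=model.input, outputs=hidden.output)  # reuse the trained hidden layer
encoded_alt = encoder.predict(scaled_data)
print(encoded_alt.shape)  # (100, 2)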

We can see that the output of the hidden layer has only 2 dimensions. Now we can check if the data is still linearly separable after dimensionality reduction.

plt.scatter(encoded[:,0], encoded[:,1], c=data[1])
The encoded data is still linearly separable after dimensionality reduction.
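
Finally, to map the 2-D code back to the original data space, we can run it through the remaining Dense layer (the decoder) and invert the scaling; a minimal sketch, not from the original post:

reconstructed = outputs(encoded).numpy()            # decoder: 2 -> 3 dimensions
original_scale = scaler.inverse_transform(reconstructed)
print(original_scale.shape)  # (100, 3)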
