In this article, we assume that you already understand the basic concepts of a convolutional neural network (CNN), e.g. one-hot encoding, convolution, pooling, fully connected layers, and activation functions. If you are totally new to these terms, please read our other articles first.
We will use TensorFlow to build a model that classifies images from the CIFAR-10 dataset and then test its accuracy.
The CIFAR-10 dataset consists of 60000 32×32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
The dataset is divided into five training batches and one test batch, each with 10000 images. The test batch contains exactly 1000 randomly-selected images from each class. The training batches contain the remaining images in random order, but some training batches may contain more images from one class than another. Between them, the training batches contain exactly 5000 images from each class.
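To make the batch structure concrete, here is a minimal sketch of turning flat image rows into 32×32 colour images. It assumes the layout of the Python version of the dataset, where each image is stored as a row of 3072 values (1024 red, then 1024 green, then 1024 blue, each in row-major order). The function name `rows_to_images` and the stand-in data are hypothetical and used only for illustration.

```python
import numpy as np

def rows_to_images(rows):
    """Convert flat CIFAR-style rows (N, 3072) into images (N, 32, 32, 3)."""
    rows = np.asarray(rows)
    # Assumption: each row holds 1024 red, 1024 green, then 1024 blue values.
    images = rows.reshape(-1, 3, 32, 32)
    # Move the channel axis last: (N, 3, 32, 32) -> (N, 32, 32, 3).
    return images.transpose(0, 2, 3, 1)

# Stand-in for one batch; a real batch would have 10000 rows.
fake_batch = np.arange(2 * 3072, dtype=np.int64).reshape(2, 3072) % 256
fake_images = rows_to_images(fake_batch)
print(fake_images.shape)
```

The channels-last shape is the one that TensorFlow's 2D convolution expects by default, which is why the channel axis is moved to the end.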
Figure 1 shows the general structure of a typical CNN for handwriting recognition on the MNIST dataset. We can reuse this structure for object recognition on the CIFAR-10 dataset.
If you are new to TensorFlow's graphs and sessions, you can read this article. We need to create the following components:
- An input layer
Also, we may need to create some helper functions:
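Which helpers are needed depends on the notebook. As an illustration only, the sketch below shows three that are commonly useful for this kind of task: scaling pixel values, one-hot encoding the labels, and splitting the data into mini-batches. The names `normalize`, `one_hot`, and `iterate_minibatches` are hypothetical and are not taken from the notebook.

```python
import numpy as np

NUM_CLASSES = 10  # CIFAR-10 has 10 classes

def normalize(images):
    """Scale 8-bit pixel values from [0, 255] to [0.0, 1.0]."""
    return np.asarray(images, dtype=np.float32) / 255.0

def one_hot(labels, num_classes=NUM_CLASSES):
    """Turn integer class labels into one-hot rows."""
    labels = np.asarray(labels, dtype=np.int64)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    # Set a single 1.0 per row, in the column given by the label.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

def iterate_minibatches(images, labels, batch_size):
    """Yield consecutive (images, labels) slices of at most batch_size items."""
    for start in range(0, len(images), batch_size):
        end = start + batch_size
        yield images[start:end], labels[start:end]

# Small usage example with made-up data.
demo_labels = one_hot([0, 3])
demo_pixels = normalize(np.array([0, 255], dtype=np.uint8))
demo_batches = list(iterate_minibatches(np.zeros((5, 32, 32, 3)), np.zeros(5), 2))
```

With five images and a batch size of 2, the last batch simply contains the one remaining image.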
Implementation and test
I recommend testing on Google Colab with a GPU enabled. Execution took only 4 minutes there, compared with more than 30 minutes on my laptop (i5, 3.2 GHz, 12 GB RAM, integrated graphics card with 32 MB of memory).
Below is the notebook, which you can use to code along.