Make use of GG Colab and Jupyter notebook

I decided to share this topic while doing research on deep learning on graphs, one of the latest trends in deep learning. One of the challenges I had was the limited processing power of my laptop when processing hundreds of thousands of nodes. Since a new laptop with a good GPU is not cheap (around US$2,000 or more), I decided to dive into the free platform provided by Google (GG).

URL: https://colab.research.google.com/notebooks/welcome.ipynb#recent=true

What is great about Colab?

Colab is a free cloud service based on Jupyter Notebooks for machine learning education and research. It provides a runtime fully configured for deep learning and free-of-charge access to a robust GPU.

At a glance, Colab offers:

12GB GPU
20-50GB online space for storing data
12 hours runtime*: it is crucial to finish each test within this period

But you can do more with:

Sharing the project with your colleagues
Mapping your Google Drive into the Colab VM runtime so that notebooks can access it
Uploading files (< 250MB) or downloading them using scripts
Working with files synced from your computer
Loading files from GitHub (< 25MB)

Jupyter notebook in Colab

When moving from an IDE like Visual Studio or Eclipse, many people feel uncomfortable with Jupyter because they miss code suggestions. Nevertheless, Jupyter notebooks do provide suggestions and code completion.

For non-colab notebooks:

Tab: to get suggestions
Shift-tab: to get docstring
Shift-Enter: to run the current cell
Ctrl-Shift-P: open the command palette

For Colab notebooks:

Ctrl+space: code suggestion and docstring (woohoo).
Other shortcuts are the same as above

Please note that this is a new feature that had not been officially released at the time of writing. You may need to wait for the invitation popup before you can use it.

Accelerate the notebook on colab

Colab notebooks have the handicap of dealing with a runtime that blows up into space every 12 hours! This is why it is so important to cut the time you need to get your runtime up and running again.

Here are some tips:

Run all cells at once:

Ctrl + F9: run all cells at once

Change runtime type:

You can switch your notebook runtime to GPU or TPU (Runtime > Change runtime type).

Since the two are quite similar, stick to the GPU, as it performs better in some reports. To confirm that your notebook is running on a GPU:

# '' (empty string) means CPU only, whereas '/device:GPU:0' means a GPU is available
import tensorflow as tf
tf.test.gpu_device_name()
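
As a small sketch, the check above can be wrapped in a helper that turns the device name into a readable label. The name `describe_device` is my own and not part of TensorFlow; it only assumes that `tf.test.gpu_device_name()` returns an empty string when no GPU is attached.

```python
def describe_device(device_name: str) -> str:
    """Translate the string from tf.test.gpu_device_name() into a label.

    Hypothetical helper: an empty string is taken to mean "no GPU visible",
    anything else (for example '/device:GPU:0') is treated as a GPU name.
    """
    if not device_name:
        return "CPU only"
    return f"GPU found at {device_name}"
```

In a Colab cell you would call it as `describe_device(tf.test.gpu_device_name())` right after `import tensorflow as tf`.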

Figure: Change runtime type (GPU / TPU) of a Jupyter notebook in Colab

Map your GG Drive:

# This cell imports the drive library and mounts your Google Drive as a local drive of the VM.
# You can then access your Drive files under the path "/content/gdrive/My Drive/"
from google.colab import drive
drive.mount('/content/gdrive')
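
If the same notebook also has to run outside Colab, the mount can be guarded. The following is only a sketch: `mount_drive` is a hypothetical helper, and it assumes that the `google.colab` package is simply missing on a local machine, so the import fails and the mount is skipped.

```python
def mount_drive(mount_point: str = "/content/gdrive"):
    """Mount Google Drive on Colab and return the mount point, else None.

    Assumption: outside Colab the google.colab package cannot be imported,
    which we use as the signal to skip mounting.
    """
    try:
        from google.colab import drive  # only available inside Colab
    except ImportError:
        return None
    drive.mount(mount_point)
    return mount_point
```

A cell can then branch on the result, for example reading from Drive when `mount_drive()` returns a path and from a local folder otherwise.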

Upload/Download files

#Upload - paste the code to a cell:
from google.colab import files
uploaded = files.upload()

#Download generated files - paste code to a cell
from google.colab import files
files.download('file.txt')
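
The value returned by `files.upload()` is a dictionary that maps each file name to its content as bytes, so a small file can be parsed without touching the disk. Here is a minimal sketch under that assumption; `rows_from_upload` and the file name `edges.csv` are made up for illustration.

```python
import csv
import io


def rows_from_upload(uploaded: dict, filename: str) -> list:
    """Parse one uploaded CSV file into a list of rows.

    `uploaded` is assumed to have the shape returned by files.upload():
    a dict mapping each file name to its content as bytes.
    """
    text = uploaded[filename].decode("utf-8")
    return list(csv.reader(io.StringIO(text)))


# Stand-in for the dict returned by files.upload(); the data is invented.
fake_upload = {"edges.csv": b"source,target\n1,2\n2,3\n"}
rows = rows_from_upload(fake_upload, "edges.csv")
print(rows)
```

On Colab you would pass the real `uploaded` variable from the cell above instead of `fake_upload`.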

Reduce manual interactions

Use automation scripts whenever possible. The following example shows how to take a cuDNN archive, downloaded from NVIDIA once and saved to Drive for later use, and extract it into the VM's CUDA folders (using shell commands in Jupyter cells):

# Extracts the cuDNN files from Drive folder directly to the VM CUDA folders
!tar -xzvf gdrive/My\ Drive/darknet/cuDNN/cudnn-10.0-linux-x64-v7.5.0.56.tgz -C /usr/local/
!chmod a+r /usr/local/cuda/include/cudnn.h
# Now we check the version we just installed. You can comment this line out on future runs
!cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
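
The grep above prints three `#define` lines. If you would rather have the version as a value in Python, something like the sketch below could work. `parse_cudnn_version` is a hypothetical helper, and the sample text is written by hand to match the v7.5.0 archive name; it is not real output from the VM.

```python
import re


def parse_cudnn_version(header_text: str):
    """Return (major, minor, patch) from cudnn.h-style #define lines, or None."""
    parts = []
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(rf"^\s*#define\s+{key}\s+(\d+)", header_text, re.MULTILINE)
        if match is None:
            return None  # header does not look like cudnn.h
        parts.append(int(match.group(1)))
    return tuple(parts)


# Hand-written sample in the shape of the three lines the grep prints.
sample = """#define CUDNN_MAJOR 7
#define CUDNN_MINOR 5
#define CUDNN_PATCHLEVEL 0
"""
print(parse_cudnn_version(sample))
```

On the VM you would read `/usr/local/cuda/include/cudnn.h` with `open(...).read()` and pass the text in.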

Copy datasets to VM local filesystem

Colab notebooks sometimes lag when working with Drive files. After connecting to Colab, your working directory is "/content/" (check with the pwd command). You can copy a dataset from Google Drive to the local filesystem:

# Copy files from Google Drive to the VM local filesystem
# (create the target folder first so that data.csv lands inside ./data)
!mkdir -p ./data
!cp -r "/content/gdrive/My Drive/data.csv" ./data/
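
Since "run all cells" repeats every step, it can help to skip the copy when the file is already on the local disk. Below is a minimal sketch of that idea in plain Python; `copy_if_missing` is a name I made up, and it only checks for the file's existence, not whether the copy on Drive has changed.

```python
import os
import shutil


def copy_if_missing(src: str, dst_dir: str) -> str:
    """Copy src into dst_dir unless a file with the same name is already there."""
    os.makedirs(dst_dir, exist_ok=True)
    dst = os.path.join(dst_dir, os.path.basename(src))
    if not os.path.exists(dst):
        shutil.copy2(src, dst)  # slow only the first time in a session
    return dst
```

With the paths from this post, the call would be `copy_if_missing("/content/gdrive/My Drive/data.csv", "/content/data")`.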

Hope that you find this post helpful. 🙂
