I’ve spent some time tinkering to get my preferred data science stack up and running. In this post I detail the choices I made, which could also help you get started on GCP. If you just want to start programming, go to Google Colab and you’re good to go.
The data science setup is now made up of:
- All development in Python (installed via Miniconda), with PyTorch and Fast.ai for deep learning
- A personal computer with Windows 10 and VSCode with Jupyter notebook functionality
- Google Cloud Platform with a Docker container running Ubuntu Linux
- GitHub for storing the codebase, and a GitHub Actions-based blog running fastpages
For me this works best at the moment. Some remarks on the tradeoffs:
Python vs other languages: Python is today’s language of choice for data science. It’s just so simple to express ideas in code, and there are so many packages with useful functionality that the whole world is basically an import statement away. Maybe I’ll learn Julia, Kotlin or Swift some day, but for now I’m set.
PyTorch vs TensorFlow: PyTorch is so much easier to work with than TensorFlow. I remember doing an introduction to TensorFlow and finding it difficult to grasp.
PyTorch vs Fast.ai: Fast.ai is kind of the Keras of PyTorch. It has been a blessing and I highly recommend it: the online course is great, and you can get a deep learning model running in no time. But it’s also an extra layer of abstraction to remember on top of trying to learn PyTorch, so right now I mostly use plain PyTorch, and Fast.ai once in a while.
Windows vs Linux: OK, for data science everyone says Linux is the way to go, but I’m just so accustomed to Windows! I can definitely see the advantages of Linux and am slowly gravitating towards using a command line interface more. Windows made a big step with WSL2, so you can now easily run Linux from within Windows, and I did install Ubuntu locally that way. Maybe in the future I’ll switch fully to Linux, but for now this works fine.
VSCode vs Jupyter Notebooks: In my opinion, running notebooks within VSCode combines the advantages of a fully fledged IDE with the agile development that notebooks are known for.
Having my own computer vs doing everything in the cloud: Working in the cloud always comes with a startup time of a couple of minutes. My time is limited, so I prefer to open VSCode and start coding right away. When I need more compute I use the cloud.
GCP vs other cloud platforms: This decision was mostly based on the $300 free credit you get with GCP.
GitHub: Initially I had all the code on my local computer and used Git for version control. Now that I alternate between working locally and in the cloud, I store the main branch on GitHub and push/pull from whichever environment I’m working in.
Fastpages: First I started on Medium, but fastpages is a lifesaver for publishing from notebooks. It actually makes blogging fun, instead of a chore of duplicating your work into a blog article.
Docker: For reproducibility, Docker is king, and for cloud computing I think it’s the best way to work. You just write a Dockerfile and know what you will get. It can also make the steps towards deployment easier.
App deployment: I don’t know what the best option is yet. I have tried Render, and got a web server to work on GCP as well. You can even deploy on Binder with a notebook. I guess this one depends on the use case.
The remainder of the post is dedicated to helping you set up GCP with Docker. Some things you need:
- The CLI command from Windows to get a VM running
- A startup script to make sure the VM runs the container
- A Dockerfile to build a docker image from
Let’s start with the CLI command.
You can create a GCP VM from the command line interface (CLI) or through the browser-based console. Play around with the console to get an idea of the options. Then, at the bottom, click “gcloud command” to see the CLI command you can copy into your terminal:

gcloud beta compute instances create gpu `
--zone=us-central1-c `
--machine-type=n1-standard-8 `
--subnet=default `
--service-account=YOURSERVICEACCOUNT-compute@developer.gserviceaccount.com `
--image-family=common-cu110 `
--image-project=deeplearning-platform-release `
--boot-disk-size=50GB `
--scopes=https://www.googleapis.com/auth/cloud-platform,https://www.googleapis.com/auth/devstorage.full_control `
--accelerator=type=nvidia-tesla-k80,count=1 `
--metadata=install-nvidia-driver=True `
--maintenance-policy=TERMINATE `
--metadata-from-file startup-script=startup-gpu.sh `
--preemptible

This command uses a backtick (`) at the end of each line since it’s run from Windows PowerShell. If you use Linux, use a backslash instead.
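For example, the first lines of the same command in bash would look like this, with a backslash as the line-continuation character (the remaining flags are unchanged):

gcloud beta compute instances create gpu \
--zone=us-central1-c \
--machine-type=n1-standard-8 \
--subnet=default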
Don’t use Container-Optimized OS: as of November 2020 it doesn’t install the NVIDIA container runtime, which makes it difficult to use the GPU inside the container. An approach that works is to use a deep learning image like common-cu110, as I’ve done here.
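The deeplearning-platform-release project has image families for different framework and CUDA versions; you can list what’s available with something like:

gcloud compute images list --project deeplearning-platform-release --no-standard-images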
The K80 is the cheapest GPU, good for experimenting.
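Which GPU types you can attach differs per zone; a quick way to check is something along these lines:

gcloud compute accelerator-types list --filter="zone:us-central1-c"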
Also, use --preemptible. Your VM may be stopped unexpectedly, but it’s about 66% cheaper!
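A preempted VM is stopped, not deleted, so when it happens you can simply start it again; the startup script below reruns on every boot and brings the container back up:

gcloud compute instances start gpu --zone=us-central1-c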
Next up is the startup script…
#!/bin/bash
# first some waiting until gpu drivers are truly installed
while ! [[ -x "$(command -v nvidia-smi)" ]];
do
  echo "sleep to check"
  sleep 5s
done
echo "nvidia-smi is installed"
while [[ $(command nvidia-smi | cut -c1-10) == "NVIDIA-SMI"* ]];
do
  echo "$(command nvidia-smi)"
  echo "sleeping to check"
  sleep 5s
done
echo "$(command nvidia-smi)"
echo "nvidia-smi drivers are up"
# if you have a persistent disk you can use this to automatically mount it, otherwise remove it
if [ ! -d "/mnt/disks/storage" ]
then
  sudo mkdir -p /mnt/disks/storage
  sudo mount -o discard,defaults /dev/sdb /mnt/disks/storage
  sudo chmod a+w /mnt/disks/storage
  sudo cp /etc/fstab /etc/fstab.backup
  sudo blkid -s UUID -o value /dev/sdb
  echo UUID=`sudo blkid -s UUID -o value /dev/sdb` /mnt/disks/storage ext4 discard,defaults,nofail 0 2 | sudo tee -a /etc/fstab
  echo "mounting complete"
else
  echo "not first startup"
fi
# startup your Docker container, with port 6006 mapped to Docker for Tensorboard
gcloud auth configure-docker
docker run -d -p 0.0.0.0:6006:6006 --gpus all --ipc="host" -v /mnt/disks/storage:/ds gcr.io/delta-deck-285906/dockerfile
echo 'Docker run with GPUs'
The first part of the startup script mainly waits until the GPU drivers are properly installed; otherwise docker run --gpus all will throw an error. Additionally, I like to use a persistent disk, and to avoid the hassle of mounting it every time I start up a new VM, the script does that work for you. Finally, the most important part is the docker run instruction, which starts your container with GPU support. The first time you start up your VM it will take some minutes, but afterwards it’s almost immediate.
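If the container doesn’t come up, the first thing to check is what the startup script actually did. Two ways that should work: read the startup-script log on the VM itself, or pull the serial port output from your local machine:

# on the VM
sudo journalctl -u google-startup-scripts.service
# or from your local machine
gcloud compute instances get-serial-port-output gpu --zone=us-central1-c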
After this I like to connect to the running container with the VSCode Remote-Containers “Attach to Running Container” command. Check out the VSCode docs for how to set this up. Basically you need to put the external IP of the VM into your SSH config file and add a line to your settings.json:
# settings.json
"docker.host": "ssh://YOURUSER@xxx.xxx.xxx.xxx",

# SSH config file
Host xxx.xxx.xxx.xxx
  HostName xxx.xxx.xxx.xxx
  IdentityFile localpath/to/publicsshkey
  User YOURUSER
  StrictHostKeyChecking no
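Before involving VSCode you can verify the SSH and Docker parts of this setup directly from a terminal; something like this should list the running container (same user, IP and key as in the config above):

# test the SSH connection itself
ssh YOURUSER@xxx.xxx.xxx.xxx echo ok
# talk to the Docker daemon on the VM over SSH, the same mechanism the docker.host setting uses
docker -H ssh://YOURUSER@xxx.xxx.xxx.xxx ps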
One final file to share: the Dockerfile you can use to build your Docker image.
FROM nvidia/cuda:10.2-runtime-ubuntu18.04

##Set environment variables
ENV LANG=C.UTF-8 LC_ALL=C.UTF-8

RUN apt-get update --fix-missing && apt-get install -y wget byobu \
    curl \
    git-core \
    python3-virtualenv \
    unzip && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

RUN wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-py38_4.8.3-Linux-x86_64.sh -O ~/miniconda.sh && \
    /bin/bash ~/miniconda.sh -b -p /opt/conda && \
    rm ~/miniconda.sh && \
    ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh && \
    echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc && \
    echo "conda activate base" >> ~/.bashrc

ENV PATH /opt/conda/bin:$PATH

RUN pip --no-cache-dir install --upgrade \
    altair \
    ipykernel \
    kaggle \
    fastbook \
    tensorboard \
    diskcache && \
    conda install -y -c fastai -c pytorch fastai && \
    pip uninstall -y pillow && \
    pip install pillow-simd --upgrade && \
    mkdir -p /ds/.kaggle && \
    git clone https://github.com/fastai/fastbook.git /ds/fastbook

# Open Ports for Jupyter
# EXPOSE 7745

#Setup File System
ENV HOME=/ds
ENV SHELL=/bin/bash
ENV KAGGLE_CONFIG_DIR=/ds/.kaggle
VOLUME /ds
WORKDIR /ds

# Make sure the container stays open
CMD tail -f /dev/null
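The startup script above pulls this image from Google Container Registry, so after building it you push it there. Roughly like this, where YOURPROJECT stands for your own GCP project id and the image name just has to match the one the startup script pulls:

# build the image from the folder containing the Dockerfile
docker build -t gcr.io/YOURPROJECT/dockerfile .
# let Docker authenticate to Google Container Registry
gcloud auth configure-docker
# push the image so the VM can pull it at startup
docker push gcr.io/YOURPROJECT/dockerfile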
The Docker tutorial by Hamel Husain has helped me greatly, especially the advice to take someone else’s Dockerfile and gradually adapt it to make it your own. The Dockerfile above is actually based on his.
That’s it, hope it has helped you!