astoanne/dockerize_jupyter_notebook


Dockerize a Jupyter Notebook File (Windows): Detailed Development and Deployment Guide

Notebook Used:

Emotion Classification in Short Messages

A multi-class sentiment analysis problem: classify texts into five emotion categories (joy, sadness, anger, fear, neutral) using transfer learning with BERT (TensorFlow Keras).

Usage:

[screenshot]

Run the Notebook Locally

Make sure the code runs correctly in your local environment. Take note of the libraries it uses; in my case: pandas, numpy, and ktrain (which uses Keras).
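
To take those notes without scrolling through every cell, a rough helper like the one below can list the imported modules. This is only a sketch: `list_notebook_imports` is a name I made up, and it scans the raw .ipynb JSON with grep, assuming each source line is stored as its own quoted string (the usual layout). It reports only the first module on each line and may pick up stray words from markdown cells, so treat the output as a starting point.

```shell
# Hypothetical helper: list the top-level modules a notebook imports, one per line.
# $1 is the path to the .ipynb file.
list_notebook_imports() {
    # In the notebook JSON each source line is a quoted string, so an import
    # statement shows up as a double quote followed by "import" or "from".
    grep -oE '"[[:space:]]*(import|from)[[:space:]]+[A-Za-z_][A-Za-z0-9_]*' "$1" \
        | awk '{print $NF}' \
        | sort -u
}

# Example (run in the notebook folder):
#   list_notebook_imports bert.ipynb
```

Anything in that list that the base image does not already provide belongs in requirements.txt.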

Download Docker

Install Docker Desktop

Verify Installation

$ docker version

Decide Base Image

Jupyter Docker Stacks provides several notebook base images. Our notebook uses pandas, numpy, and ktrain (which uses Keras, running on top of TensorFlow), so we choose jupyter/tensorflow-notebook: it includes everything in jupyter/scipy-notebook (which already ships pandas and numpy) plus TensorFlow.

Docker Stacks

Create an Empty requirements.txt File (update later)

$ touch requirements.txt

Writing Dockerfile

In the same folder as the notebook, create a Dockerfile:

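# Base image from Jupyter Docker Stacks (pandas, numpy, and TensorFlow included)
FROM jupyter/tensorflow-notebook
# Install extra Python packages first so this layer stays cached across rebuilds
COPY requirements.txt ./requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
# Copy the notebook, the dataset, and the pretrained model into the image
COPY bert.ipynb ./bert.ipynb
COPY data ./data
COPY bert_model ./bert_model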

This starts from the base image, copies the notebook, data, and model into the image, and installs the packages listed in requirements.txt.
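
A COPY instruction fails the build if its source is missing, so a quick pre-check of the folder can save a failed build. A minimal sketch, assuming the four paths named in the Dockerfile above; `missing_build_inputs` is a hypothetical helper name, not a Docker command.

```shell
# Hypothetical helper: print every path the Dockerfile copies that is missing.
# $1 is the build folder (defaults to the current folder).
missing_build_inputs() {
    for path in requirements.txt bert.ipynb data bert_model; do
        # -e is true for both files and directories
        [ -e "${1:-.}/$path" ] || echo "$path"
    done
}

# Example (run in the notebook folder before building):
#   missing_build_inputs .
```

If it prints nothing, all four inputs are in place.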

Build the Image

$ docker build -t textemotionotebook .

Make Sure Image is Created

$ docker images

Run the Image

Run with administrator privileges:

$ docker run -it -p 8888:8888 textemotionotebook

Open the last link printed in the terminal, e.g. http://127.0.0.1:8888/lab?token=...
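
If the link has scrolled out of view, it can be pulled back out of the container logs. The filter below is a sketch under assumptions: `last_jupyter_url` is a hypothetical name, and it assumes the server prints a line containing a http://127.0.0.1:8888/ address with a token, as in the example link above.

```shell
# Hypothetical helper: read log text on stdin and print the last
# http://127.0.0.1:8888/ link that carries a token.
last_jupyter_url() {
    grep -oE 'http://127\.0\.0\.1:8888/[^[:space:]]*token=[^[:space:]]+' | tail -n 1
}

# Example (CONTAINER is a placeholder for the name or ID shown by `docker ps`;
# 2>&1 is there because Jupyter writes its log to stderr):
#   docker logs CONTAINER 2>&1 | last_jupyter_url
```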

Notebook Opened in Docker

[screenshot]

Run the notebook

You should see a message indicating that a library is missing: ktrain.

Update the Requirements File

Update requirements.txt to include the required library:

ktrain==0.32.3

Or simply, without pinning a version:

ktrain

After rebuilding the image and rerunning the container, the notebook runs successfully.
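
To find the exact version to pin, one option is to ask the local environment where the notebook already runs. A small sketch, assuming `pip freeze` prints lines of the form name==version; `pin_from_freeze` is a hypothetical helper name.

```shell
# Hypothetical helper: print the pinned requirement line for one package.
# $1 is the package name; stdin is the output of `pip freeze`.
pin_from_freeze() {
    # -i because package names are case-insensitive
    grep -i -E "^$1=="
}

# Example (run in the local environment, not in the container):
#   pip freeze | pin_from_freeze ktrain >> requirements.txt
```

Pinning keeps later rebuilds from silently picking up a newer ktrain release.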

Getting Image ID

First find the image ID by running docker images

Tag the Image

$ docker tag d3e2d3a63640 astolo/textemotionotebook:first

Push the Image to Docker Hub

$ docker push astolo/textemotionotebook:first

The image should now appear under My Profile on Docker Hub.
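
Both the tag and push steps use a reference of the form user/repository:tag. The sketch below only prints the two commands so they can be checked before running; `publish_commands` is a hypothetical helper name, and the arguments in the example are the ones used in this guide.

```shell
# Hypothetical helper: print (not run) the tag and push commands for a local image.
# Arguments: image ID, Docker Hub user, repository name, tag.
publish_commands() {
    ref="$2/$3:$4"
    echo "docker tag $1 $ref"
    echo "docker push $ref"
}

# Example:
#   publish_commands d3e2d3a63640 astolo textemotionotebook first
# To execute the printed commands, pipe them to a shell:
#   publish_commands d3e2d3a63640 astolo textemotionotebook first | sh
```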

Pull the Image From Docker Hub

$ docker pull astolo/textemotionotebook:first

Run the Image

Running as administrator:

$ docker run -it -p 8888:8888 astolo/textemotionotebook:first

Open the last link printed in the terminal; the notebook should run successfully.

To Verify My Assignment (Usage Guide)

Run as administrator in PowerShell:

$ docker pull astolo/textemotionotebook:first
$ docker run -it -p 8888:8888 astolo/textemotionotebook:first

Open the last link printed in the terminal, e.g. http://127.0.0.1:8888/lab?token=...

If you do not want to wait 10 hours to retrain the model, start from the "RUN FROM HERE" tag in the notebook and use the pretrained model.

Other Works I Did

I first used an audio-related notebook that improvises a piece of jazz music. After going through a large number of tutorials looking for a way to listen to my generated MIDI file inside Docker, I found that playing audio inside a container is not trivial.

I also tried to shrink the image by using more than one Dockerfile to separate the build and package stages, or by using docker-compose, but this soon made the files ugly and unreadable, which I believe was not the intention of this assignment: to dockerize a simple application. In addition, most tutorials focus on Node.js, C, or Java projects, which break down into stages more naturally than Jupyter notebook projects.

Side Notes

If you git clone this repo, the notebook has to be run from the top. Because GitHub limits file size, I was unable to upload the .h5 file of the pretrained model, but if you

$ docker pull astolo/textemotionotebook:first
$ docker run -it -p 8888:8888 astolo/textemotionotebook:first

there should be no problem running from the "RUN FROM HERE" tag.
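
To tell which case applies in a given folder, a check along these lines could help. It is only a sketch: `notebook_start_point` is a hypothetical helper name, and it assumes the .h5 file lives somewhere under the bert_model folder that the Dockerfile copies, which this guide does not state explicitly.

```shell
# Hypothetical helper: report where to start the notebook, based on whether
# a pretrained .h5 file is present (assumed to sit under bert_model).
# $1 is the folder that contains bert_model (defaults to the current folder).
notebook_start_point() {
    if find "${1:-.}/bert_model" -name '*.h5' 2>/dev/null | grep -q .; then
        echo 'RUN FROM HERE'
    else
        echo 'top'
    fi
}

# Example (run in the notebook folder):
#   notebook_start_point .
```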

About

Assignment Code to Cloud Computing Course: D22091100982
