A multi-class sentiment analysis problem: classify texts into five emotion categories (joy, sadness, anger, fear, neutral) using transfer learning with BERT (TensorFlow Keras).
Make sure the code runs normally in your local environment. Take note of the libraries it uses; in my case: pandas, numpy, and ktrain (which uses Keras).
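One way to take these notes without reading every cell is to scan the notebook file for import statements. The following is a minimal sketch, not part of the original project: the helper name `notebook_imports` is hypothetical, and it assumes the notebook is a standard `.ipynb` JSON file whose code cells are plain Python apart from IPython magics.

```python
import ast
import json


def notebook_imports(notebook_path):
    """Return the sorted top-level module names imported by a notebook's code cells."""
    with open(notebook_path, encoding="utf-8") as f:
        notebook = json.load(f)

    modules = set()
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # .ipynb stores source as a list of lines
            source = "".join(source)
        # Drop IPython magics (%...) and shell escapes (!...), which are not valid Python.
        lines = [ln for ln in source.splitlines()
                 if not ln.lstrip().startswith(("%", "!"))]
        try:
            tree = ast.parse("\n".join(lines))
        except SyntaxError:
            continue  # skip cells that still cannot be parsed
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                modules.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                modules.add(node.module.split(".")[0])
    return sorted(modules)
```

Calling `notebook_imports("bert.ipynb")` lists every imported module, including standard-library ones, so the result still needs a manual pass to pick out the third-party packages.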
$ docker version
There are several Jupyter-notebook-related base images. Since our notebook uses pandas, numpy, and ktrain (which uses Keras, which runs on top of TensorFlow), tensorflow-notebook will be our choice: it includes everything in jupyter/scipy-notebook (which already includes pandas and numpy).
$ touch requirements.txt
In the same folder as our notebook, we create our Dockerfile:
FROM jupyter/tensorflow-notebook
COPY requirements.txt ./requirements.txt
COPY bert.ipynb ./bert.ipynb
COPY data ./data
COPY bert_model ./bert_model
RUN pip install -r requirements.txt
This uses the base image, copies all the files into our container, and installs the libraries listed in requirements.txt.
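Because docker build stops with an error when a COPY source is missing from the build context, it can help to check the folder first. The sketch below is only an illustration: the helper name `missing_copy_sources` is hypothetical, and it handles just the simple `COPY src dest` form used in this Dockerfile (no wildcards, no JSON-array form, no `--from` stages).

```python
import os


def missing_copy_sources(dockerfile_path, context_dir="."):
    """List COPY sources named in a Dockerfile that are absent from the build context."""
    missing = []
    with open(dockerfile_path, encoding="utf-8") as f:
        for line in f:
            parts = line.split()
            if len(parts) < 3 or parts[0].upper() != "COPY":
                continue
            # Everything between COPY and the last argument is a source path;
            # options such as --chown=... are skipped.
            sources = [p for p in parts[1:-1] if not p.startswith("--")]
            for src in sources:
                if not os.path.exists(os.path.join(context_dir, src)):
                    missing.append(src)
    return missing
```

Running `missing_copy_sources("Dockerfile")` from the notebook folder should return an empty list before the build is started.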
$ docker build -t textemotionotebook .
$ docker images
Running with admin privileges:
$ docker run -it -p 8888:8888 textemotionotebook
Use the last link, e.g.
There should be a warning about the missing library: ktrain.
Update requirements.txt to include the required library.
Or simply,
After rebuilding the Docker image and rerunning it, the notebook runs successfully.
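To confirm that nothing else is missing, a quick check can be run in a notebook cell inside the container. This is a small sketch using only the standard library; the helper name `missing_modules` and the `REQUIRED` list are illustrative, with the list taken from the libraries noted for this notebook.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from module_names that cannot be imported in this environment."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Top-level libraries this notebook is known to use.
REQUIRED = ["pandas", "numpy", "ktrain"]
```

If `missing_modules(REQUIRED)` returns a non-empty list, those names still need to go into requirements.txt before the image is rebuilt.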
First, find the image ID by running docker images:
$ docker tag d3e2d3a63640 astolo/textemotionotebook:first
$ docker push astolo/textemotionotebook:first
You should be able to see the image under My Profile on Docker Hub.
$ docker pull astolo/textemotionotebook:first
Running as administrator:
$ docker run -it -p 8888:8888 astolo/textemotionotebook:first
Use the bottom link; the notebook should run successfully.
Running as administrator in PowerShell:
$ docker pull astolo/textemotionotebook:first
$ docker run -it -p 8888:8888 astolo/textemotionotebook:first
Use the bottom link, e.g.
If you are not ready to wait 10 hours to retrain the model, start from the tag "RUN FROM HERE" in the notebook and use the pretrained model.
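For orientation, loading a saved ktrain model usually looks something like the sketch below. It is not the notebook's actual code: the wrapper `load_emotion_predictor` is hypothetical, and it assumes the bert_model folder was written by ktrain's `predictor.save(...)`, so that `ktrain.load_predictor` can read it back.

```python
import os


def load_emotion_predictor(model_dir="bert_model"):
    """Load a predictor previously saved by ktrain from model_dir."""
    if not os.path.isdir(model_dir):
        raise FileNotFoundError(f"No saved model directory at {model_dir!r}")
    # Imported lazily because ktrain pulls in TensorFlow, which is slow to load.
    import ktrain
    return ktrain.load_predictor(model_dir)
```

With the predictor loaded, `predictor.predict("some text")` returns the predicted class label for that text, which here would be one of the five emotion categories.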
I first used an audio-related notebook, which improvises a piece of jazz music. After going through a large number of tutorials looking for a way to listen to my generated MIDI file inside Docker, I found that playing audio inside a container is not trivial.
I also tried to shrink the size of my image by using more than one Dockerfile to separate the build and package stages, or by using docker-compose, but this soon made the files ugly and unreadable, which I believe was not the intention of this assignment: to dockerize a simple application. Besides, most tutorials focus on Node.js, C, or Java projects, which break down into stages more naturally than Jupyter notebook projects.
If you git clone this repo, the notebook has to be run from the top. Because GitHub limits file sizes, I was unable to upload the .h5 file of the pretrained model, but if you
$ docker pull astolo/textemotionotebook:first
$ docker run -it -p 8888:8888 astolo/textemotionotebook:first
there should not be a problem running from the tag "RUN FROM HERE".
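If you are unsure which situation you are in, you can check whether the pretrained weights are present before choosing where to start. The following is a minimal sketch: the helper name `find_weight_files` is hypothetical, and it assumes the weights are stored as one or more .h5 files somewhere under the bert_model folder.

```python
from pathlib import Path


def find_weight_files(model_dir="bert_model"):
    """Return the .h5 files found anywhere under model_dir, sorted by path."""
    root = Path(model_dir)
    if not root.is_dir():
        return []
    return sorted(root.rglob("*.h5"))
```

An empty result means the notebook has to be run from the top; otherwise starting from "RUN FROM HERE" should work.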