-
Notifications
You must be signed in to change notification settings - Fork 94
/
Copy path05-dockerfiles.Rmd
106 lines (74 loc) · 5.23 KB
/
05-dockerfiles.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
---
title: "Dockerfiles"
output:
dcTemplate::dc_lesson_template:
fig_width: 6
fig_height: 6
highlight: pygments
---
Earlier, we got started with a base image that let us run RStudio from within Docker, and learned to modify the contents of that image using `docker commit`. This is an excellent technique for capturing what we've done so we can reproduce it later, but what if we want to be able to easily change the collection of things in our image, and have a clear record of just what went into it? This is useful when maintaining running environments that may change and evolve over a project, and is facilitated by Dockerfiles.
Dockerfiles are a set of instructions on how to add things to a base image. They build custom images up in a series of *layers*. In a new file called `Dockerfile`, put the following:
```
FROM rocker/verse:latest
```
This tells Docker to start with the `rocker/verse` base image - that's what we've been using so far.
The `FROM` command must always be the first thing in your Dockerfile; this is the bottom crust of the pie we are baking.
Next, let's add another layer on top of our base, in order to have `gapminder` pre-installed and ready to go:
```
RUN R -e "install.packages('gapminder', repos = 'http://cran.us.r-project.org')"
```
`RUN` commands in your Dockerfile execute shell commands to build up your image, like putting the filling in our pie. In this example, we install `gapminder` from the command line using `install.packages`, which does the same thing as if we had done `install.packages('gapminder')` from within RStudio. Save your Dockerfile, and return to your docker terminal; we can now build our image by doing:
```
docker build -t my-r-image .
```
`-t my-r-image` gives our image a name (note image names are always all lower case), and the `.` says all the resources we need to build this image are in our current directory. List your images via:
```
docker images
```
and you should see `my-r-image` in the list. Launch your new image similarly to how we launched the base image:
```
docker run --rm -p 8787:8787 my-r-image
```
Then in the RStudio terminal, try gapminder again:
```
library('gapminder')
gapminder
```
And there it is - gapminder is pre-installed and ready to go in your new docker image.
Our pie is almost complete! All we need to finish it is the topping. In addition to R packages like gapminder, we may also want some some static files inside our Docker image - such as data. We can do this using the `ADD` command in your Dockerfile:
```
ADD data/gapminder-FiveYearData.csv /home/rstudio/
```
Rebuild your Docker image:
```
docker build -t my-r-image .
```
And launch it again:
```
docker run --rm -p 8787:8787 my-r-image
```
Go back to RStudio in the browser, and there `gapminder-FiveYearData.csv` will be, present in the files visible to RStudio. In this way, we can capture files as part of our Docker image, so they're always available along with the rest of our image in the exact same state.
#### Protip: Cached Layers
While building and rebuilding your Docker image in this tutorial, you may have noticed lines like this:
```
Step 2 : RUN R -e "install.packages('gapminder', repos = 'http://cran.us.r-project.org')"
---> Using cache
---> fa9be67b52d1
```
Noting that a cached version of the commands was being used. When you rebuild an image, Docker checks the previous version(s) of that image to see if the same commands were executed previously; each of those steps is preserved as a separate layer, and Docker is smart enough to re-use those layers if they are unchanged and *in the same order* as previously. Therefore, once you've got part of your setup process figured out (particularly if it's a slow part), leave it near the top of your Dockerfile and don't put anything above or between those lines, particularly things that change frequently; this can substantially speed up your build process.
## Summary
In this lesson, we learned how to compose a Dockerfile so that we can re-create our images at will. We learned three main commands:
- `FROM` is always at the top of a Dockerfile, and specifies the image we want to start from.
- `RUN` runs shell commands on top of our base image, and is used for doing things like downloads and installations.
- `ADD` adds files from our computer to our new Docker image.
The image is built by running `docker build -t my-r-image .` in the same directory as our Dockerfile and any files we want to include with an `ADD` command.
## Challenge Questions
- Write a Dockerfile that makes an image like the one you made at the end of Section 3: built on `rocker/verse`, with `gapminder` and `gsl` installed. Also add a readme to your image describing what it contains.
Find the Dockerfile of the `rocker/verse` image through Docker Hub.
- How does having a Dockerfile compare to a README file (like the one we prepared in lesson 3)?
- Docker Hub can automatically build images when you store the file on Github
or Bitbucket. What do you think are the benefits of doing this instead of
commiting the image directly? Could it be negative?
- Find the build details of the `rocker/verse` image. What do they tell us?
Go to [Lesson 06 Share all your analysis](06-Sharing-all-your-analysis.html) or back to the
[main page](http://jsta.github.io/r-docker-tutorial/).