TDM 40200: Project 12 — 2023
Motivation: Containers are everywhere and a very popular method of packaging an application with all of the requisite dependencies. In the previous series of projects you’ve built a web application. While right now it may be easy to share and run your application with another individual, as time goes on and packages are updated, this is less and less likely to be the case. Containerizing your application ensures that the application will have the proper versions of the proper packages available in the proper location to run.
Context: This is a second of a series of projects focused on containers. The end goal of this series is to solidify the concept of a container, and enable you to "containerize" the application you’ve spent the semester building. You will even get the opportunity to deploy your containerized application!
Scope: Python, containers, UNIX
Questions
Question 1
In this project, we will begin learning about the various docker
commands. In addition, we will learn about some of the Dockerfile
commands, and even build and use some simple containers!
Like the last project, we will be working in a virtual machine (VM). By using a VM on Anvil, we are able to ensure that everyone has the proper permissions in order to run each and every one of the necessary commands to build and run Docker images. There are 3 main steps needed in order to both get a VM up and running on Anvil resources, and connect and get a shell on the VM from Anvil.
-
Get a terminal on Anvil — you may complete this part however you like. I like to use
ssh
to connect to Anvil from my local machine, however, you may also use ondemand.anvil.rcac.purdue.edu, launch a Jupyter Lab session, and launch a terminal from within Jupyter Lab. Either works equally as well as the other. -
Clear out any potential SLURM environment variables:
for i in $(env | awk -F= '/SLURM/ {print $1}'); do unset $i; done;
-
Launch SLURM job with 4 cores and about 8 GB of memory and get a shell into the given backend node:
salloc -A cis220051 -p shared -n 4 -c 1 -t 04:00:00
This job will only buy you 4 hours of time on the backend node. If you need more time, you will need to re-launch the job and change the arguments to
salloc
to request more time. -
Once you have a shell on the backend node, you will need to load the
qemu
module:module load qemu
-
Next, copy over a fresh VM image to use for this project:
cp /anvil/projects/tdm/apps/qemu/images/alpine.qcow2 $SCRATCH
If at any time you want to start fresh, you can simply copy over a new VM image from
/anvil/projects/tdm/apps/qemu/images/alpine.qcow2
to your$SCRATCH
directory. Any changes you made to the previous image will be lost. This is good to know in case you want to try something crazy but are worried about breaking something! No need to worry, you can simply re-copy the VM image and start fresh anytime! -
The previous command will result in a new file called
alpinel.qcow2
in your$SCRATCH
directory. This is the VM image you will be using for this project. Now, you will need to launch the VM:qemu-system-x86_64 -vnc none,ipv4 -hda $SCRATCH/alpine.qcow2 -m 8G -smp 4 -enable-kvm -net nic -net user,hostfwd=tcp::2200-:22 &
The last part of the previous command forwards traffic from port 2200 on Anvil to port 22 on the VM. If you receive an error about port 2200 being used, you can change this number to be any other unused port number. To find an unused port you can use a utility we have available to you.
module use /anvil/projects/tdm/opt/core module load tdm find_port
The
find_port
command will output an unused port for you to use. If, for example, it output12345
, then you would change theqemu
command to the following.qemu-system-x86_64 -vnc none,ipv4 -hda $SCRATCH/alpine.qcow2 -m 8G -smp 4 -enable-kvm -net nic -net user,hostfwd=tcp::12345-:22 &
-
After launching the VM, it will be running in the background as a process (this is what the
&
at the end of the command does). After about 15-30 seconds, the VM will be fully booted and you can connect to the VM from Anvil using thessh
command.ssh -p 2200 tdm@localhost -o StrictHostKeyChecking=no
You may be prompted for a password for the user
tdm
. The password is simplypurdue
.If in a previous step you changed the port from say
2200
to something like12345
, you would change thessh
command accordingly. -
Finally, you should be connected to the VM and have a new shell running inside the VM, great! If you were successful, contents of the terminal should look very similar to the following.
For this question, just include a screenshot of your terminal after you have successfully connect to the VM.
If at any time you would like to "save" your progress and restart the project at a later date or time, you can do this by exiting the VM by running the |
-
Code used to solve this problem.
-
Output from running the code.
Question 2
Whoa, you may have noticed things look a little bit different from the previous project — that’s okay! We made a few modifications that will be useful to you during this project. Let’s test out the most useful new feature.
In your terminal in the VM, list all of the files as follows:
ls -la
Okay, nothing special yet. That is to be expected. Now, in the same terminal session, type the letter "l" and immediately pause for a second. You will quickly notice that the terminal shows "s -la" in light grey text after your initially typed "l". We’ve installed a program that remembers your shell history and does its best to predict what you will type based on your previous commands. If you press the right arrow on your keyboard, the rest of the "ls -la" command will be typed out fully. This is an extrememly useful feature, especially as we are juggling various docker
commands that can be long and confusing. For example, you can type "docker" and start typing the up arrow on your keyboard and this tool will cycle through all of your previous commands that started with "docker".
Another way to remember/recall previous commands you’ve run is to open the shell history search interface by holding Ctrl+r and then beginning your search as you type. |
Okay, try running the command docker
and docker ps
and docker images
. Follow these command up with "docker" followed by you pressing the up arrow on your keyboard to cycle through your previous commands. Once you are comfortable with this functionality, go ahead and take a screenshot of some of your outputs from these docker
commands and include it in your submission.
-
Code used to solve this problem.
-
Output from running the code.
Question 3
Next, let’s build a barebones Docker image using the docker
command and a Dockerfile
.
A Dockerfile
is a text file that contains a set of instructions for building a Docker image.
The docker
command is a command line interface (CLI) that allows you to interact with Docker.
A Docker image is a essentially a zipped up tarball of a file system that contains all of the files and dependencies needed to run a program. You can think of it as the ubuntu filesystem that you extracted and used with chroot
in the previous project.
For a very barebones image, your Dockerfile
will only need to contain two lines. The first line will be a FROM
command. This command will tell Docker what base image to build on top of. It is very common to choose a barebones operating system image like alpine
or ubuntu
as the base image.
The second line will be a CMD
command. This command will tell Docker what command to run when a container is started from the image.
FROM OS CMD ["command", "arg1", "arg2"]
Use your favorite command line text editor (the image has nano
, vim
, and emacs
installed already) to create a new file called Dockerfile
in your home directory. Replace "OS" with the base image you want to use. For this project, we will use an ubuntu image from here — specifically the newest stable version of ubuntu, which is currently 22.04
— ubuntu:22.04
. Here, ubuntu
is the repository namespace and 22.04
is the tag specifying a version of the image. While _technically ubuntu could put all sorts of different containers in the ubuntu
namespace under different tags, it is customary to use the tag to specify the version of the image.
Next, replace "command" with the command you want to run when a container is started from the image. For now, let’s use the most basic shell that is available on many linux operating systems, /bin/sh
. If we had multiple arguments to pass to the command, we would add them to the list of arguments after the first argument. For example, if we wanted to run the command echo "hello world"
, we would use the following CMD ["echo", "'hello world'"]
command.
Once complete, save the file and close the text editor. Now, its time to build our image! Run the following command to build the image:
cd
docker build -t myfirstimage .
Here, we are using the |
Once the image is built, you can check to see that it is there by running the following command:
docker images
You will notice that there are 2 images available: ubuntu:22.04
and myfirstimage
. The ubuntu:22.04
image is the base image we used to build our image on top of. The myfirstimage
image is the image we just built. Very cool!
Now, let’s run our image! Run the following command to run our image:
docker run -it myfirstimage
The |
Congratulations! You now have a shell (/bin/sh
) running in a container!
-
Code used to solve this problem.
-
Output from running the code.
Question 4
Okay, now that you have a shell running in the container, let’s take a minute to clarify the last line of our Dockerfile
.
There are two important commands that we need to delineate: CMD
and ENTRYPOINT
. It is kind of a mess, but it is important to take the time to understand the differences — otherwise it will be more difficult to debug your containers in the future.
The following Dockerfile
has a single CMD
line. In a Dockerfile
, there can only be a single CMD
line — if there is more, only the last CMD
line will be respected.
FROM ubuntu:22.04 CMD ["/bin/sh", "-c", "echo cwd: $PWD"]
This CMD
is in exec form. This means that:
-
There is the use of square brackets around the arguments.
-
The first argument is an executable file.
Build the image and run it. What was your output?
docker build -t myfirstimage .
docker run -it myfirstimage
Now, run it and pass in a different command. What was your output?
docker run -it myfirstimage /bin/sh
Modify the Dockerfile
and rebuild your image.
FROM ubuntu:22.04 CMD ["echo", "cwd: $PWD"]
docker build -t myfirstimage .
docker run -it myfirstimage
What happens? Instead of "cwd: /" you get "cwd: $PWD". This is because variable substitution doesn’t occur since a shell isn’t processing the commands. We can, however, use the shell form. This means that:
-
There is no use of square brackets around the arguments.
-
The commands and arguments are passed to the
sh
shell.
FROM ubuntu:22.04 CMD echo "cwd: $PWD"
docker build -t myfirstimage .
docker run -it myfirstimage
You once again get "cwd: /" since the sh
shell is performing the variable substitution! Behind the scenes it is really running /bin/sh -c "echo cwd: $PWD"
.
Finally, there is another series of scenarios that we can explore that have to do with our ENTRYPOINT
. The first being — what happens if we are not using the shell form of CMD
and our first argument is not and executable like echo
or /bin/sh
? Let’s find out!
FROM ubuntu:22.04 CMD ["cwd: $PWD"]
docker build -t myfirstimage .
docker run -it myfirstimage
What happens? It fails! Docker doesn’t understand what to do with that, since it isn’t anything executable. In these scenarios, you need to specify an ENTRYPOINT
.
FROM ubuntu:22.04 CMD ["cwd: $PWD"] ENTRYPOINT ["echo"]
docker build -t myfirstimage .
docker run -it myfirstimage
It works just like when we did the following!
FROM ubuntu:22.04 CMD ["echo", "cwd: $PWD"]
Or does it? In this case, once again there is no variable substitution because a shell is not processing the commands. However, you will find things are different than before. Before, you could run the following:
docker run -it myfirstimage /bin/sh
The result would be that the contents of the CMD
, CMD ["echo", "cwd: $PWD"]
, would be effectively replaced and a shell would be spawned. However, try running it with the following Dockerfile
.
FROM ubuntu:22.04 CMD ["cwd: $PWD"] ENTRYPOINT ["echo"]
docker build -t myfirstimage .
docker run -it myfirstimage /bin/sh
What happens? It does not do what one might expect! It simply prints out "/bin/sh". Why? Well, the arguments after docker run
do not replace our ENTRYPOINT
, just our CMD
. So, in this case, we essentially ran echo /bin/sh
! In fact, if you gave multiple parameters to ENTRYPOINT
— none of them would be replaced.
FROM ubuntu:22.04 CMD ["cwd: $PWD"] ENTRYPOINT ["echo", "$PWD"]
docker build -t myfirstimage .
docker run -it myfirstimage /bin/sh
$PWD /bin/sh
In this example, ENTRYPOINT
is in exec form — it has the square brackets. ENTRYPOINT
also has a shell form.
FROM ubuntu:22.04 CMD ["cwd: $PWD"] ENTRYPOINT echo $PWD
docker build -t myfirstimage .
docker run -it myfirstimage /bin/sh
/
Here, what is actually being run is /bin/sh -c "echo $PWD"
. When ENTRYPOINT
is run using shell form, CMD
is completely ignored, and, because CMD
is completely ignored, the /bin/sh
argument passed as a part of the docker run
command is also ignored. A side effect is that that signals are not passed properly using this method, this will effect stopping the container and the first process running in the container.
Hopefully this gives you a taste of the myriad of capabilities that CMD
and ENTRYPOINT
provide. It is a mess, however, Docker does provide some "best practices". In a nutshell:
-
Stick to the exec forms for both
CMD
andENTRYPOINT
. -
If you want variable substitution to work, directly execute the shell of your choice. Some examples:
DockerfileFROM ubuntu:22.04 CMD ["/bin/bash", "-c", "echo cwd: $PWD"]
DockerfileFROM ubuntu:22.04 ENTRYPOINT ["/bin/sh", "-c"] CMD ["echo cwd: $PWD"]
For this question, submit the text for a Dockerfile
that would result in the following output when run.
docker run -it myfirstimage
/bin/bash
docker run -it myfirstimage /bin/sh
# you get an `sh` shell prompt in the container
docker run -it myfirstimage 'echo $SHELL -- cool'
/bin/bash -- cool
docker run -it myfirstimage "echo $SHELL -- cool"
/bin/zsh -- cool
For this last example — remember taht we are in the |
-
Code used to solve this problem.
-
Output from running the code.
Question 5
Okay, we are making some progress, but the complexity of ENTRYPOINT
and CMD
are certainly enough to slow us down a bit. Rebuild myfirstimage
with the following Dockerfile
.
FROM ubuntu:22.04 CMD ["/bin/bash"]
docker build -t myfirstimage .
docker run -it myfirstimage
Once you are in a shell in your container, go ahead and run the following to create a file called "imhere" in the /root/
directory.
cd /root
touch imhere
Verify that the file exists:
ls -la /root/
Great! Now go ahead and exit
the container. Now, rerun the container, and verify that the file is still there.
docker run -it myfirstimage
ls -la /root/
It is no longer there! The container executes our shell, we run some commands, and as soon as we exit
the container our changes are all gone! Containers are ephemeral — any changes you make will only survive for the duration that the process running the container exists. When we exit
the container the process is no longer running and our changes disappear.
There are a couple of ways around this. One is to use a VOLUME
to bind a location outside of our container somewhere inside our container. We will play with this later on. Another way that is super straightforward is to not let our process exit! We can do this by using the -d
flag to run the container in the background. Give it a try!
docker run -dit myfirstimage
Whoa, this is way different — there is a long string of characters that are printed, and then we have our regular shell prompt. What is going on?
The long string of characters is the container ID. This is a unique identifier for the container. We can use this to interact with the container.
Okay, great, but do we have to remember that? No! We don’t! You can see all running containers using the following command.
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES b34dd462b664 myfirstimage "/bin/bash" 40 seconds ago Up 39 seconds goofy_goldwasser
Here, we have an abbreviated container id as well as the name of the image that was used to create the container. Okay, now how do we interact with the container? We can use the docker exec
command. This command allows us to execute a command inside of a running container. In this case, we want a shell, so let’s give it a try.
docker exec -it b34dd462b664 /bin/bash
# or, since b34dd462b664 is hard to type, they give us a user friendly name we can use instead
docker exec -it goofy_goldwasser /bin/bash
Great! Let’s repeat the previous steps to create the file in /root/
called "imhere".
cd /root
touch imhere
ls -la /root/
Now, let’s exit the container. Is the container still running? Use docker ps
to find out. Okay, great! It is still running — that should mean that if we get another shell inside the container and look for the file, it should still be there. Let’s give it a try.
docker exec -it goofy_goldwasser /bin/bash
ls -la /root/
Indeed, it is! Excellent! While this is great, if, for some reason, the container restarted this file would once again disappear. If we really need some data to persist, we need to use a VOLUME
, but we will mess around with this in a future project. What else can we do? What other commands are useful? Well, let’s make our running container stop.
docker stop goofy_goldwasser
Here is a cool feature — we have the shell configured with a |
After that, check on your container with docker ps
— you’ll find it has stopped! Very cool.
Remember how earlier in the project we mentioned that Docker images are just layers of a read-only filesystem compressed into a zipped up tarball? Well, up until this point I haven’t seen any actual transferrable files, have you? With docker
we have the ability to export them using the docker save
command. Let’s do this.
docker save myfirstimage > myfirstimage.tar
# or
docker save -o myfirstimage.tar myfirstimage
The result is a tarball that you can transfer to any other system (at least, any with the same architecture) with Docker
and use docker load
to load it up.
docker load < myfirstimage.tar
docker images
# or
docker load -i myfirstimage.tar
docker images
For this question, just include a screenshot of the terminal contents that demonstrates that you were able to persist the "imhere" file after exiting the container.
-
Code used to solve this problem.
-
Output from running the code.
Please make sure to double check that your submission is complete, and contains all of your code and output before submitting. If you are on a spotty internet connection, it is recommended to download your submission after submitting it to make sure what you think you submitted, was what you actually submitted. In addition, please review our submission guidelines before submitting your project. |