Troubleshooting

Installing and Using DOCKER and NV-DOCKER on CentOS 7

May 5, 2017

DOCKER-ENGINE is a containerization technology that allows you to create, develop, and run applications. In this article we focus primarily on the basic installation steps for DOCKER and NV-DOCKER (a wrapper that NVIDIA provides), and on how DOCKER, working with NV-DOCKER, provides a stable platform for pulling docker images, which are used to create containers. Containers are 'instances' of an environment, created from a docker image. Containers can be run once, or live on as persistent daemon processes; there are examples of both with nvidia-docker below.

Installing and getting DOCKER and NV-DOCKER running in CentOS 7 is a straightforward process:

# Assumes CentOS 7
# Assumes NVIDIA Driver is installed as per requirements ( >= 340.29 )
# Install DOCKER
sudo curl -fsSL https://get.docker.com/ | sh
# Start DOCKER
sudo systemctl start docker
# Add dockeruser, usermod change
sudo adduser dockeruser
sudo usermod -aG docker dockeruser
# Install NV-DOCKER
# GET NVIDIA-DOCKER
wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker-1.0.1-1.x86_64.rpm
# INSTALL
sudo rpm -i /tmp/nvidia-docker*.rpm
# Start NV-DOCKER Service
sudo systemctl start nvidia-docker
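Optionally, you can also enable both services so they start at boot (a common follow-up step, assuming systemd as on CentOS 7):

```shell
# Enable the Docker and nvidia-docker services at boot (CentOS 7 / systemd)
sudo systemctl enable docker
sudo systemctl enable nvidia-docker
```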

After the steps above you should have running Docker and NVIDIA-DOCKER services.

This can be checked via:

[username@host ~]# systemctl status docker
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor preset: disabled)
Active: active (running) since Thu 2017-03-23 20:59:01 PDT; 16h ago
..... truncated ....
[username@host ~]# systemctl status nvidia-docker
● nvidia-docker.service - NVIDIA Docker plugin
Loaded: loaded (/usr/lib/systemd/system/nvidia-docker.service; disabled; vendor preset: disabled)
Active: active (running) since Thu 2017-03-23 20:58:59 PDT; 17h ago
..... truncated ....

Using Docker/NVIDIA-DOCKER

Pull and Run your First Container

For a quick and dirty test of using NVIDIA GPUs in a container, you can do the following either as sudo/root or as dockeruser (via su - dockeruser) from the install instructions above.

Run the following example:

# Instantiate a container from the nvidia-docker command.
# Note that nvidia-docker must be used for any docker command involving "run"
# that you would like to use GPUs with. nvidia-docker is a wrapper that handles
# setting up the environment (container) in relation to GPUs, GPGPU, etc.
nvidia-docker run --rm nvidia/cuda nvidia-smi

Command Explanation:

  • nvidia-docker - the NVIDIA shim/wrapper that helps setup GPUs with DOCKER
  • run - tells nvidia-docker wrapper that you're going to start (instantiate) a container
    • Note that for any command that does not include 'run', you can simply use docker; if you use nvidia-docker, the command is passed through to docker (e.g., docker images displays the docker images on your system, and nvidia-docker images would execute and show the same info)
  • --rm - this tells DOCKER that after the command runs, the container should be stopped/removed
    • This is a very interesting feature/capability. If you think about it, an entire environment is being created, for nvidia-smi to run, and then the container is destroyed. It can be done repeatedly and is very simple and fast.
  • nvidia/cuda - this is the name of an image
    • Note that the first time you run this command, DOCKER will go out and find an image with that name and download the docker image from the hub.docker.com repository. This only happens the first time. You could also run docker pull nvidia/cuda beforehand to be verbose and separate the steps. This one-liner works though.
  • nvidia-smi - this is the command to be run in the container
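The same mechanics work for poking around interactively. As a quick sketch using the same nvidia/cuda image:

```shell
# Start an interactive shell inside a CUDA container
# -i keeps STDIN open, -t allocates a pseudo-TTY, --rm cleans up on exit
nvidia-docker run --rm -it nvidia/cuda /bin/bash
```

Inside the container you can run nvidia-smi, inspect the CUDA installation, and so on; exiting the shell removes the container because of --rm.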

You should get output that looks like the below:

Note that the Pull complete portions (the parts above the nvidia-smi output) are a one-time occurrence: the image is not yet on your system locally, so it is fetched before being launched as a container instance.

[user@host ~]# nvidia-docker run --rm nvidia/cuda nvidia-smi
Using default tag: latest
latest: Pulling from nvidia/cuda
d54efb8db41d: Pull complete
f8b845f45a87: Pull complete
e8db7bf7c39f: Pull complete
9654c40e9079: Pull complete
6d9ef359eaaa: Pull complete
cdfa70f89c10: Pull complete
3208f69d3a8f: Downloading 151.3 MB/421.5 MB
eac0f0483475: Download complete
4580f9c5bac3: Verifying Checksum
6ee6617c19de: Downloading 109 MB/456.1 MB
Fri Mar 24 20:47:52 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.48                 Driver Version: 367.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080     Off | 0000:03:00.0      On |                  N/A |
| 27%   34C    P8     7W / 180W |   7725MiB /  8113MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

Running a persistent container / NVIDIA DIGITS

The following demonstrates pulling a DIGITS image and running it in daemon/persistent mode.

It should be noted that in order to use DIGITS you will need to provide it data via the -v command line switch when launching the docker container. This switch maps a mount point on the local machine to a mount point within the container, for example: -v /mnt/dataset:/data/dataset. This would map /mnt/dataset on the host machine to /data/dataset in the container. When interacting with DIGITS, you would then see this data when creating datasets, etc., from the Web UI.
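As a sketch, a DIGITS launch that includes such a mount might look like the following (the container name digits-data and the host path /mnt/dataset are illustrative assumptions, not part of the original commands):

```shell
# Launch DIGITS with host data mounted into the container
# /mnt/dataset (host) -> /data/dataset (container); both paths are examples
nvidia-docker run --name digits-data -d -p 5000:5000 \
    -v /mnt/dataset:/data/dataset nvidia/digits
```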

Running nvidia-docker

[user@host~]# NV_GPU=0,1 nvidia-docker run --name digits -d -p 5000:5000 nvidia/digits
6b12a4107569214a3177304ef2c9db0f333e266d0d766d2c8c02e5bbddd3d444 # This is the Instance ID returned by the nvidia-docker run command

Command Explanation:

  • NV_GPU=0,1
    • This is a method of assigning GPU resources to a container which is critical for leveraging DOCKER in a Multi GPU System. This passes GPU ID 0,1 from the host system to the container as resources. Note that if you passed GPU ID 2,3 for example, the container would still see the GPUs as ID 0,1 inside the container, with the PCI ID of 2,3 from the host system.
  • nvidia-docker - the NVIDIA shim/wrapper that helps setup GPUs with DOCKER
  • run - tells nvidia-docker wrapper that you're going to start (instantiate) a container
    • Note that for any command that does not include 'run' in it, you can simply use docker, but if you use nvidia-docker the command gets passed through to docker (E.g docker images display the docker images on your system, nvidia-docker images would also execute and show the same info)
  • --name digits
    • This names your container instance; you need a unique name for each instance created this way. It gives you another way to reference the instance, in addition to the default instance ID hash.
  • -d
    • Instructs DOCKER to run the container detached (in the background), i.e., as a daemonized/persistent container
  • -p 5000:5000
    • This is port mapping. Host port 5000 is being mapped to container port 5000, the DIGITS webserver port.
    • If you run multiple containers/instances of DIGITS, for example, you could do -p 5001:5000 for the next container and you would be able to connect to it at the IP_ADDRESS:5001 location, and still connect to IP_ADDRESS:5000 of the other DIGITS container.
  • nvidia/digits
    • Which image we're launching

After running this command, you could connect to DIGITS at the URL of the host system, at port 5000. It would have access to GPU IDs 0,1 as resources within the container and within DIGITS in that container. If, for example, this was a 4 GPU machine, you could run the following to create another container based on that same image, but expose a different port so that the two containers don't conflict with each other, and specify different GPUs so the containers don't try to utilize the same GPGPU resources.

[user@host~]# NV_GPU=2,3 nvidia-docker run --name digits1 -d -p 5001:5000 nvidia/digits
95e42817050c3e6de88f61473692a71ac0ab0948fe873c06155b95b62dad5554 # Instance ID!

Now you would have another DIGITS instance on port 5001 that would be accessible from a web browser, and this DIGITS installation would have access to GPU IDs 2 and 3 from the host system.
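To confirm which GPUs a container actually sees, one quick check (assuming the nvidia/cuda image from earlier is available) is to list them with nvidia-smi -L under the same NV_GPU setting:

```shell
# List the GPUs visible inside a throwaway container launched with NV_GPU=2,3
# Inside the container they enumerate as GPU 0 and GPU 1
NV_GPU=2,3 nvidia-docker run --rm nvidia/cuda nvidia-smi -L
```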

Check Running nvidia-docker Containers

You can check your running containers/instances by running either nvidia-docker ps or docker ps; see below for an example:

Note the PORTS section, which is very helpful once you have containers up and running, for seeing how ports are mapped.

CONTAINER ID   IMAGE           COMMAND              CREATED          STATUS          PORTS                    NAMES
95e42817050c   nvidia/digits   "python -m digits"   25 seconds ago   Up 24 seconds   0.0.0.0:5001->5000/tcp   digits1
6b12a4107569   nvidia/digits   "python -m digits"   16 hours ago     Up 16 hours     0.0.0.0:5000->5000/tcp   digits
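When you are finished with the persistent containers, they can be stopped and removed by name (a sketch, using the digits and digits1 names from the ps output above):

```shell
# Stop the running DIGITS containers, then remove them
docker stop digits digits1
docker rm digits digits1
```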
