Docker使用GPU全过程

 更新时间:2024年01月09日 15:16:25   作者:DripBoy  
这篇文章主要介绍了Docker使用GPU全过程,具有很好的参考价值,希望对大家有所帮助,如有错误或未考虑完全的地方,望不吝赐教

一、docker使用宿主机硬件设备的三种方式

  • 使用--privileged=true选项,以特权模式开启容器
  • 使用--device选项
  • 使用容器卷挂载-v选项

二、docker使用gpu方式演变

docker使用宿主机的gpu设备,本质是把宿主机使用gpu时调用的设备文件全部挂载到docker上。

nvidia提供了三种方式的演变,如下是官网的一些介绍

来自 <Enabling GPUs in the Container Runtime Ecosystem | NVIDIA Technical Blog>

NVIDIA designed NVIDIA-Docker in 2016 to enable portability in Docker images that leverage NVIDIA GPUs. It allowed driver agnostic CUDA images and provided a Docker command line wrapper that mounted the user mode components of the driver and the GPU device files into the container at launch. Over the lifecycle of NVIDIA-Docker, we realized the architecture lacked flexibility for a few reasons: Tight integration with Docker did not allow support of other container technologies such as LXC, CRI-O, and other runtimes in the future We wanted to leverage other tools in the Docker ecosystem – e.g. Compose (for managing applications that are composed of multiple containers) Support GPUs as a first-class resource in orchestrators such as Kubernetes and Swarm Improve container runtime support for GPUs – esp. automatic detection of user-level NVIDIA driver libraries, NVIDIA kernel modules, device ordering, compatibility checks and GPU features such as graphics, video acceleration As a result, the redesigned NVIDIA-Docker moved the core runtime support for GPUs into a library called libnvidia-container. The library relies on Linux kernel primitives and is agnostic relative to the higher container runtime layers. This allows easy extension of GPU support into different container runtimes such as Docker, LXC and CRI-O. The library includes a command-line utility and also provides an API for integration into other runtimes in the future. The library, tools, and the layers we built to integrate into various runtimes are collectively called the NVIDIA Container Runtime. Since 2015, Docker has been donating key components of its container platform, starting with the Open Containers Initiative (OCI) specification and an implementation of the specification of a lightweight container runtime called runc. In late 2016, Docker also donated containerd, a daemon which manages the container lifecycle and wraps OCI/runc. The containerd daemon handles transfer of images, execution of containers (with runc), storage, and network management. It is designed to be embedded into larger systems such as Docker. More information on the project is available on the official site. Figure 1 shows how the libnvidia-container integrates into Docker, specifically at the runc layer. We use a custom OCI prestart hook called nvidia-container-runtime-hook to runc in order to enable GPU containers in Docker (more information about hooks can be found in the OCI runtime spec). The addition of the prestart hook to runc requires us to register a new OCI compatible runtime with Docker (using the –runtime option). At container creation time, the prestart hook checks whether the container is GPU-enabled (using environment variables) and uses the container runtime library to expose the NVIDIA GPUs to the container. Figure 1.Integration of NVIDIA Container Runtime with Docker

1、nvidia-docker

nvidia-docker是在docker的基础上做了一层封装

通过 nvidia-docker-plugin把硬件设备在docker的启动命令上添加必要的参数。

Ubuntu distributions 
# Install nvidia-docker and nvidia-docker-plugin 
wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.0-rc/nvidia-docker_1.0.0.rc-1_amd64.deb 
sudo dpkg -i /tmp/nvidia-docker_1.0.0.rc-1_amd64.deb && rm /tmp/nvidia-docker*.deb # Test nvidia-smi 
nvidia-docker run --rm nvidia/cuda nvidia-smi 
 
Other distributions 
# Install nvidia-docker and nvidia-docker-plugin 
wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.0-rc/nvidia-docker_1.0.0.rc_amd64.tar.xz 
sudo tar --strip-components=1 -C /usr/bin -xvf /tmp/nvidia-docker_1.0.0.rc_amd64.tar.xz && rm /tmp/nvidia-docker*.tar.xz 
# Run nvidia-docker-plugin 
sudo -b nohup nvidia-docker-plugin > /tmp/nvidia-docker.log 
# Test nvidia-smi 
nvidia-docker run --rm nvidia/cuda nvidia-smi 
 
Standalone install 
# Install nvidia-docker and nvidia-docker-plugin 
wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.0-rc/nvidia-docker_1.0.0.rc_amd64.tar.xz 
sudo tar --strip-components=1 -C /usr/bin -xvf /tmp/nvidia-docker_1.0.0.rc_amd64.tar.xz && rm /tmp/nvidia-docker*.tar.xz 
# One-time setup 
sudo nvidia-docker volume setup 
# Test nvidia-smi 
nvidia-docker run --rm nvidia/cuda nvidia-smi

2、nvidia-docker2

sudo apt-get install nvidia-docker2 sudo apt-get install nvidia-container-runtime sudo dockerd --add-runtime=nvidia=/usr/bin/nvidia-container-runtime [...]

3、nvidia-container-toolkit

docker版本在19.03及以上后

nvidia-container-toolkit进行了进一步的封装,在参数里直接使用--gpus "device=0" 即可

总结

以上为个人经验,希望能给大家一个参考,也希望大家多多支持脚本之家。

相关文章

  • Docker可视化管理工具DockerUI的使用

    Docker可视化管理工具DockerUI的使用

    这篇文章主要介绍了Docker可视化管理工具DockerUI的使用,文中通过示例代码介绍的非常详细,对大家的学习或者工作具有一定的参考学习价值,需要的朋友们下面随着小编来一起学习学习吧
    2020-12-12
  • Docker容器部署前端Vue服务(小白教程)

    Docker容器部署前端Vue服务(小白教程)

    本文主要介绍了Docker容器部署前端Vue服务,文中根据实例编码详细介绍的十分详尽,具有一定的参考价值,感兴趣的小伙伴们可以参考一下
    2022-03-03
  • 详解Docker镜像的基本操作方法

    详解Docker镜像的基本操作方法

    这篇文章主要介绍了Docker镜像的基本操作方法,主要包括获取镜像和运行镜像的相关知识,本文给大家介绍的非常详细,需要的朋友可以参考下
    2022-07-07
  • dockerDesktop使用教程

    dockerDesktop使用教程

    本文给大家分享docker Desktop使用,本文给大家介绍的非常详细,对大家的学习或工作具有一定的参考借鉴价值,需要的朋友参考下吧
    2023-11-11
  • docker容器启动后添加端口映射

    docker容器启动后添加端口映射

    这篇文章主要介绍了docker容器启动后添加端口映射,,小编觉得挺不错的,现在分享给大家,也给大家做个参考。一起跟随小编过来看看吧
    2018-06-06
  • Docker容器的导入导出操作教程

    Docker容器的导入导出操作教程

    这篇文章主要给大家介绍了关于Docker容器的导入导出操作的相关资料,文中通过示例代码介绍的非常详细,对大家学习或者使用Docker具有一定的参考学习价值,需要的朋友们下面来一起学习学习吧
    2019-09-09
  • docker windows10 共享目录挂载失败的解决方案

    docker windows10 共享目录挂载失败的解决方案

    这篇文章主要介绍了docker windows10 共享目录挂载失败的解决方案,具有很好的参考价值,希望对大家有所帮助。一起跟随小编过来看看吧
    2021-03-03
  • docker如何查看容器启动命令(已运行的容器)

    docker如何查看容器启动命令(已运行的容器)

    Docker是一个开源的应用容器引擎,让开发者可以打包他们的应用以及依赖包到一个可移植的容器中,然后发布到任何流行的Linux机器上,下面这篇文章主要给大家介绍了关于docker如何查看容器启动命令(已运行的容器)的相关资料,需要的朋友可以参考下
    2023-02-02
  • 在 Docker 容器中运行 PHPMyAdmin的详细步骤

    在 Docker 容器中运行 PHPMyAdmin的详细步骤

    Docker是一个开源的应用容器引擎,它能够实现应用部署的自动化。此外,容器是完全使用沙箱机制,容器之间的环境相互独立,不会相互干扰,接下来通过本文给大家介绍在 Docker 容器中运行 PHPMyAdmin的详细步骤,感兴趣的朋友一起看看吧
    2022-01-01
  • 详解Shell脚本控制docker容器启动顺序

    详解Shell脚本控制docker容器启动顺序

    这篇文章主要介绍了Shell脚本控制docker容器启动顺序的相关资料,本文给大家介绍的非常详细,对大家的学习或工作具有一定的参考借鉴价值,需要的朋友可以参考下
    2021-03-03

最新评论