230 lines
8.2 KiB
Markdown
230 lines
8.2 KiB
Markdown
|
<!---
|
||
|
Licensed under the Apache License, Version 2.0 (the "License");
|
||
|
you may not use this file except in compliance with the License.
|
||
|
You may obtain a copy of the License at
|
||
|
|
||
|
http://www.apache.org/licenses/LICENSE-2.0
|
||
|
|
||
|
Unless required by applicable law or agreed to in writing, software
|
||
|
distributed under the License is distributed on an "AS IS" BASIS,
|
||
|
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||
|
See the License for the specific language governing permissions and
|
||
|
limitations under the License. See accompanying LICENSE file.
|
||
|
-->
|
||
|
|
||
|
|
||
|
# Using GPU On YARN
|
||
|
# Prerequisites
|
||
|
|
||
|
- As of now, only Nvidia GPUs are supported by YARN
|
||
|
- YARN node managers have to be pre-installed with Nvidia drivers.
|
||
|
- When Docker is used as container runtime context, nvidia-docker 1.0 needs to be installed (Current supported version in YARN for nvidia-docker).
|
||
|
|
||
|
# Configs
|
||
|
|
||
|
## GPU scheduling
|
||
|
|
||
|
In `resource-types.xml`
|
||
|
|
||
|
Add following properties
|
||
|
|
||
|
```
|
||
|
<configuration>
|
||
|
<property>
|
||
|
<name>yarn.resource-types</name>
|
||
|
<value>yarn.io/gpu</value>
|
||
|
</property>
|
||
|
</configuration>
|
||
|
```
|
||
|
|
||
|
In `yarn-site.xml`
|
||
|
|
||
|
`DominantResourceCalculator` MUST be configured to enable GPU scheduling/isolation.
|
||
|
|
||
|
For `Capacity Scheduler`, use following property to configure `DominantResourceCalculator` (In `capacity-scheduler.xml`):
|
||
|
|
||
|
| Property | Default value |
|
||
|
| --- | --- |
|
||
|
| yarn.scheduler.capacity.resource-calculator | org.apache.hadoop.yarn.util.resource.DominantResourceCalculator |
|
||
|
|
||
|
|
||
|
## GPU Isolation
|
||
|
|
||
|
### In `yarn-site.xml`
|
||
|
|
||
|
```
|
||
|
<property>
|
||
|
<name>yarn.nodemanager.resource-plugins</name>
|
||
|
<value>yarn.io/gpu</value>
|
||
|
</property>
|
||
|
```
|
||
|
|
||
|
This is to enable GPU isolation module on NodeManager side.
|
||
|
|
||
|
By default, YARN will automatically detect and config GPUs when above config is set. Following configs need to be set in `yarn-site.xml` only if admin has specialized requirements.
|
||
|
|
||
|
**1) Allowed GPU Devices**
|
||
|
|
||
|
| Property | Default value |
|
||
|
| --- | --- |
|
||
|
| yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices | auto |
|
||
|
|
||
|
Specify GPU devices which can be managed by YARN NodeManager (split by comma).
|
||
|
Number of GPU devices will be reported to RM to make scheduling decisions.
|
||
|
Set to auto (default) let YARN automatically discover GPU resource from
|
||
|
system.
|
||
|
|
||
|
Manually specify GPU devices if auto detect GPU device failed or admin
|
||
|
only want subset of GPU devices managed by YARN. GPU device is identified
|
||
|
by their minor device number and index. A common approach to get minor
|
||
|
device number of GPUs is using `nvidia-smi -q` and search `Minor Number`
|
||
|
output.
|
||
|
|
||
|
When minor numbers are specified manually, admin needs to include indice of GPUs
|
||
|
as well, format is `index:minor_number[,index:minor_number...]`. An example
|
||
|
of manual specification is `0:0,1:1,2:2,3:4"`to allow YARN NodeManager to
|
||
|
manage GPU devices with indices `0/1/2/3` and minor number `0/1/2/4`.
|
||
|
numbers .
|
||
|
|
||
|
**2) Executable to discover GPUs**
|
||
|
|
||
|
| Property | value |
|
||
|
| --- | --- |
|
||
|
| yarn.nodemanager.resource-plugins.gpu.path-to-discovery-executables | /absolute/path/to/nvidia-smi |
|
||
|
|
||
|
When `yarn.nodemanager.resource.gpu.allowed-gpu-devices=auto` specified,
|
||
|
YARN NodeManager needs to run GPU discovery binary (now only support
|
||
|
`nvidia-smi`) to get GPU-related information.
|
||
|
When value is empty (default), YARN NodeManager will try to locate
|
||
|
discovery executable itself.
|
||
|
An example of the config value is: `/usr/local/bin/nvidia-smi`
|
||
|
|
||
|
**3) Docker Plugin Related Configs**
|
||
|
|
||
|
Following configs can be customized when user needs to run GPU applications inside Docker container. They're not required if admin follows default installation/configuration of `nvidia-docker`.
|
||
|
|
||
|
| Property | Default value |
|
||
|
| --- | --- |
|
||
|
| yarn.nodemanager.resource-plugins.gpu.docker-plugin | nvidia-docker-v1 |
|
||
|
|
||
|
Specify docker command plugin for GPU. By default uses Nvidia docker V1.0.
|
||
|
|
||
|
| Property | Default value |
|
||
|
| --- | --- |
|
||
|
| yarn.nodemanager.resource-plugins.gpu.docker-plugin.nvidia-docker-v1.endpoint | http://localhost:3476/v1.0/docker/cli |
|
||
|
|
||
|
Specify end point of `nvidia-docker-plugin`. Please find documentation: https://github.com/NVIDIA/nvidia-docker/wiki For more details.
|
||
|
|
||
|
**4) CGroups mount**
|
||
|
|
||
|
GPU isolation uses CGroup [devices controller](https://www.kernel.org/doc/Documentation/cgroup-v1/devices.txt) to do per-GPU device isolation. Following configs should be added to `yarn-site.xml` to automatically mount CGroup sub devices, otherwise admin has to manually create devices subfolder in order to use this feature.
|
||
|
|
||
|
| Property | Default value |
|
||
|
| --- | --- |
|
||
|
| yarn.nodemanager.linux-container-executor.cgroups.mount | true |
|
||
|
|
||
|
|
||
|
### In `container-executor.cfg`
|
||
|
|
||
|
In general, following config needs to be added to `container-executor.cfg`
|
||
|
|
||
|
```
|
||
|
[gpu]
|
||
|
module.enabled=true
|
||
|
```
|
||
|
|
||
|
When user needs to run GPU applications under non-Docker environment:
|
||
|
|
||
|
```
|
||
|
[cgroups]
|
||
|
# This should be same as yarn.nodemanager.linux-container-executor.cgroups.mount-path inside yarn-site.xml
|
||
|
root=/sys/fs/cgroup
|
||
|
# This should be same as yarn.nodemanager.linux-container-executor.cgroups.hierarchy inside yarn-site.xml
|
||
|
yarn-hierarchy=yarn
|
||
|
```
|
||
|
|
||
|
When user needs to run GPU applications under Docker environment:
|
||
|
|
||
|
**1) Add GPU related devices to docker section:**
|
||
|
|
||
|
Values separated by comma, you can get this by running `ls /dev/nvidia*`
|
||
|
|
||
|
```
|
||
|
[docker]
|
||
|
docker.allowed.devices=/dev/nvidiactl,/dev/nvidia-uvm,/dev/nvidia-uvm-tools,/dev/nvidia1,/dev/nvidia0
|
||
|
```
|
||
|
|
||
|
**2) Add `nvidia-docker` to volume-driver whitelist.**
|
||
|
|
||
|
```
|
||
|
[docker]
|
||
|
...
|
||
|
docker.allowed.volume-drivers
|
||
|
```
|
||
|
|
||
|
**3) Add `nvidia_driver_<version>` to readonly mounts whitelist.**
|
||
|
|
||
|
```
|
||
|
[docker]
|
||
|
...
|
||
|
docker.allowed.ro-mounts=nvidia_driver_375.66
|
||
|
```
|
||
|
|
||
|
# Use it
|
||
|
|
||
|
## Distributed-shell + GPU
|
||
|
|
||
|
Distributed shell currently support specify additional resource types other than memory and vcores.
|
||
|
|
||
|
### Distributed-shell + GPU without Docker
|
||
|
|
||
|
Run distributed shell without using docker container (Asks 2 tasks, each task has 3GB memory, 1 vcore, 2 GPU device resource):
|
||
|
|
||
|
```
|
||
|
yarn jar <path/to/hadoop-yarn-applications-distributedshell.jar> \
|
||
|
-jar <path/to/hadoop-yarn-applications-distributedshell.jar> \
|
||
|
-shell_command /usr/local/nvidia/bin/nvidia-smi \
|
||
|
-container_resources memory-mb=3072,vcores=1,yarn.io/gpu=2 \
|
||
|
-num_containers 2
|
||
|
```
|
||
|
|
||
|
You should be able to see output like
|
||
|
|
||
|
```
|
||
|
Tue Dec 5 22:21:47 2017
|
||
|
+-----------------------------------------------------------------------------+
|
||
|
| NVIDIA-SMI 375.66 Driver Version: 375.66 |
|
||
|
|-------------------------------+----------------------+----------------------+
|
||
|
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
|
||
|
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|
||
|
|===============================+======================+======================|
|
||
|
| 0 Tesla P100-PCIE... Off | 0000:04:00.0 Off | 0 |
|
||
|
| N/A 30C P0 24W / 250W | 0MiB / 12193MiB | 0% Default |
|
||
|
+-------------------------------+----------------------+----------------------+
|
||
|
| 1 Tesla P100-PCIE... Off | 0000:82:00.0 Off | 0 |
|
||
|
| N/A 34C P0 25W / 250W | 0MiB / 12193MiB | 0% Default |
|
||
|
+-------------------------------+----------------------+----------------------+
|
||
|
|
||
|
+-----------------------------------------------------------------------------+
|
||
|
| Processes: GPU Memory |
|
||
|
| GPU PID Type Process name Usage |
|
||
|
|=============================================================================|
|
||
|
| No running processes found |
|
||
|
+-----------------------------------------------------------------------------+
|
||
|
```
|
||
|
|
||
|
For launched container task.
|
||
|
|
||
|
### Distributed-shell + GPU with Docker
|
||
|
|
||
|
You can also run distributed shell with Docker container. `YARN_CONTAINER_RUNTIME_TYPE`/`YARN_CONTAINER_RUNTIME_DOCKER_IMAGE` must be specified to use docker container.
|
||
|
|
||
|
```
|
||
|
yarn jar <path/to/hadoop-yarn-applications-distributedshell.jar> \
|
||
|
-jar <path/to/hadoop-yarn-applications-distributedshell.jar> \
|
||
|
-shell_env YARN_CONTAINER_RUNTIME_TYPE=docker \
|
||
|
-shell_env YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=<docker-image-name> \
|
||
|
-shell_command nvidia-smi \
|
||
|
-container_resources memory-mb=3072,vcores=1,yarn.io/gpu=2 \
|
||
|
-num_containers 2
|
||
|
```
|