Feat: Automatic GPU Switch #845

Draft · wants to merge 2 commits into base: master

Conversation

@Steel-skull commented Oct 30, 2024

Docker Windows GPU Passthrough

[This is not fully tested, as I'm waiting for a GPU to arrive.]

Automated GPU management solution for Windows in Docker containers with NVIDIA GPU passthrough support. This project provides scripts and configurations to dynamically manage GPU binding between host and Docker containers, with support for multiple GPUs and audio devices.

Prerequisites

  • Unraid server (or Linux system with Docker)
  • NVIDIA GPU(s)
  • Docker and Docker Compose
  • VFIO-PCI support in kernel
  • NVIDIA drivers installed on host
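A quick way to verify these prerequisites is a small pre-flight script. This is a hypothetical sketch, not part of the PR; the `check_cmd` helper name is mine:

```shell
#!/bin/sh
# Hypothetical pre-flight check for the prerequisites listed above.
# check_cmd prints "ok <name>" or "missing <name>" instead of exiting,
# so all results can be reviewed in one pass.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok $1"
    else
        echo "missing $1"
    fi
}

check_cmd nvidia-smi      # NVIDIA drivers installed on host
check_cmd docker          # Docker
check_cmd docker-compose  # Docker Compose (or use 'docker compose')

# VFIO-PCI support in the kernel (loadable module or built-in)
if modinfo vfio-pci >/dev/null 2>&1 || [ -d /sys/bus/pci/drivers/vfio-pci ]; then
    echo "ok vfio-pci"
else
    echo "missing vfio-pci"
fi
```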

Quick Start

  1. Clone the repository:
     git clone https://github.com/yourusername/docker-windows-gpu.git
     cd docker-windows-gpu
  2. Configure your environment:
     # Set to your GPU ID(s), PCI address(es), or 'none'
     NVIDIA_VISIBLE_DEVICES=0
  3. Start the container:
     docker-compose up -d

Configuration

Environment Variables

  • NVIDIA_VISIBLE_DEVICES: Specify GPU(s) to use
    • Single GPU: NVIDIA_VISIBLE_DEVICES=0
    • Multiple GPUs: NVIDIA_VISIBLE_DEVICES=0,1
    • PCI addresses: NVIDIA_VISIBLE_DEVICES=0000:03:00.0,0000:04:00.0
    • No GPU: NVIDIA_VISIBLE_DEVICES=none
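A script consuming this variable has to accept all four forms. A minimal parsing sketch (the `parse_visible_devices` helper is hypothetical, not from gpu-switch.sh):

```shell
# Hypothetical helper: split NVIDIA_VISIBLE_DEVICES into one device per line.
# Prints nothing for 'none' or an empty value.
parse_visible_devices() {
    value="$1"
    if [ -z "$value" ] || [ "$value" = "none" ]; then
        return 0                      # no GPU management requested
    fi
    # Comma-separated indices (0,1) or PCI addresses (0000:03:00.0,...)
    printf '%s\n' "$value" | tr ',' '\n'
}
```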

Docker Compose

The provided docker-compose.yml includes all necessary configurations for:

  • GPU passthrough
  • RDP access
  • KVM support
  • Network management
  • Persistent storage
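The compose file itself is not reproduced in this description. As a rough sketch of what such a configuration might contain (service name, image, ports, and paths below are placeholders, not the PR's actual file):

```yaml
# Hypothetical sketch -- not the actual docker-compose.yml from this PR.
services:
  windows:
    image: example/windows          # placeholder image name
    environment:
      NVIDIA_VISIBLE_DEVICES: "0"   # GPU passthrough selection
    devices:
      - /dev/kvm                    # KVM support
    ports:
      - "3389:3389/tcp"             # RDP access
    cap_add:
      - NET_ADMIN                   # network management
    volumes:
      - ./storage:/storage          # persistent storage
```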

Usage

Manual GPU Management (until I find a way to run pre-start and post-stop hooks, use this with User Scripts)

Bind GPU to container:

NVIDIA_VISIBLE_DEVICES=0 /boot/config/plugins/user.scripts/gpu-switch.sh start windows

Release GPU:

NVIDIA_VISIBLE_DEVICES=0 /boot/config/plugins/user.scripts/gpu-switch.sh stop windows
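In Unraid's User Scripts plugin, both calls above can live behind one wrapper. A hypothetical example (`gpu_wrapper` is my name; the gpu-switch.sh path follows the commands above):

```shell
# Hypothetical wrapper for Unraid User Scripts; not part of the PR.
GPU_SWITCH=/boot/config/plugins/user.scripts/gpu-switch.sh

gpu_wrapper() {
    case "$1" in
        start|stop)
            # Bind (start) or release (stop) GPU 0 for the 'windows' container
            NVIDIA_VISIBLE_DEVICES=0 "$GPU_SWITCH" "$1" windows
            ;;
        *)
            echo "usage: gpu_wrapper start|stop" >&2
            return 1
            ;;
    esac
}
```

`gpu_wrapper start` and `gpu_wrapper stop` can then be attached to the container's start and stop events in User Scripts.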

Script Details

The gpu-switch.sh script handles:

  1. GPU detection and validation
  2. Driver management (NVIDIA ⟷ VFIO-PCI)
  3. Audio device pairing
  4. Docker container configuration
  5. Error handling and logging
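Step 2 (driver management) typically comes down to unbinding the device from its current driver and binding it to the other one through sysfs. This is a hedged sketch of that mechanism, not the actual gpu-switch.sh code; the SYSFS_ROOT override exists only so the logic can be dry-run without root:

```shell
# Hypothetical sketch of the NVIDIA <-> VFIO-PCI rebind step; on a real
# system it writes to /sys and requires root.
rebind_device() {
    addr="$1"                                  # e.g. 0000:03:00.0
    target="$2"                                # vfio-pci or nvidia
    sysfs="${SYSFS_ROOT:-/sys}"
    dev="$sysfs/bus/pci/devices/$addr"

    # Unbind from whichever driver currently owns the device
    if [ -e "$dev/driver" ]; then
        echo "$addr" > "$dev/driver/unbind"
    fi

    # Route this one device to the target driver, then bind it
    echo "$target" > "$dev/driver_override"
    echo "$addr" > "$sysfs/bus/pci/drivers/$target/bind"
}
```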

gpu-switch.sh version: 0.1

# Without GPU:
NVIDIA_VISIBLE_DEVICES="" ./gpu-switch.sh start container_name

# With single GPU:
NVIDIA_VISIBLE_DEVICES="0" ./gpu-switch.sh start container_name

# With multiple GPUs:
NVIDIA_VISIBLE_DEVICES="0,1" ./gpu-switch.sh start container_name

# With PCI addresses:
NVIDIA_VISIBLE_DEVICES="0000:03:00.0,0000:04:00.0" ./gpu-switch.sh start container_name

# Explicitly disable GPU:
NVIDIA_VISIBLE_DEVICES="none" ./gpu-switch.sh start container_name
@Steel-skull (Author)

I have to modify the docker-compose side. I was under the impression it supported pre-start and post-stop scripts, but I misread: it's post-start and pre-stop. I'll need to find a new way to handle this. The script itself still works and can be implemented using User Scripts in Unraid.

[Again, though, I'm waiting on a GPU, so I haven't been able to fully test it.]

@Steel-skull mentioned this pull request Oct 30, 2024
@kroese (Contributor) commented Nov 9, 2024

Very interesting work!! Did you already receive your GPU to test it?

@JosueIsrael-prog

Very good

@maksymdor

Hmm! Interesting

if ! check_gpu_needed; then
    log "Continuing without GPU management"
    exit 0
fi

@vinkay215 commented Nov 11, 2024


Instead of listing all containers, you can directly check the existence of the container using docker container inspect, which is more efficient since it only checks the specified container without scanning the entire list. Here’s how to replace that line:

if ! docker container inspect "$CONTAINER_NAME" > /dev/null 2>&1; then
    error_exit "Container $CONTAINER_NAME does not exist"
fi

The docker container inspect command returns an error if the container does not exist, so you can use it to directly verify the container’s existence without listing all containers.

}

# Convert any GPU identifier to PCI address
convert_to_pci_address() {


Incorporating these improvements, here's the final optimized convert_to_pci_address function:


convert_to_pci_address() {
    local device="$1"
    local gpu_address=""

    if [[ "$device" =~ ^[0-9]+$ || "$device" =~ ^GPU-.*$ ]]; then
        # Convert GPU index or UUID to PCI address
        gpu_address=$(nvidia-smi --id="$device" --query-gpu=gpu_bus_id --format=csv,noheader 2>/dev/null | tr -d '[:space:]')
    else
        # Direct PCI address provided
        gpu_address="$device"
    fi

    # Check for valid output
    if [ -z "$gpu_address" ]; then
        error_exit "Failed to get PCI address for device: $device"
    fi

    # Standardize format
    echo "$gpu_address" | sed -e 's/0000://' -e 's/\./:/g'
}

@tl123987 commented Nov 12, 2024

Share failed? Is there something wrong?

7 participants