Hardware Acceleration

This chapter covers GPU acceleration in direct_play_nice for transcoding and OCR.

Transcoding acceleration

Hardware encoder selection in direct_play_nice currently targets H.264 and HEVC output.

H.264 hardware encoders: h264_nvenc, h264_qsv, h264_vaapi, h264_videotoolbox, h264_amf¹ ² ³ ⁴ ⁵ ⁶
HEVC hardware encoders: hevc_nvenc, hevc_qsv, hevc_vaapi, hevc_videotoolbox, hevc_amf¹ ² ³ ⁴ ⁵ ⁶
Backend availability is OS/build dependent and discovered at runtime⁷ ⁸ ⁶ ⁵

You can inspect your current host/build support with:

direct_play_nice --probe-hw --probe-codecs --only-video --only-hw --probe-json
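As a rough cross-check that does not depend on direct_play_nice, the sketch below filters the output of `ffmpeg -encoders` for the encoder names listed above. It assumes a standalone `ffmpeg` binary is on PATH, and that binary may have been built differently from the libraries direct_play_nice links against. The helper names `parse_hw_encoders` and `compiled_hw_encoders` are hypothetical. An encoder being compiled in does not mean a usable device is present, so treat the probe command above as authoritative.

```python
import shutil
import subprocess

# Encoder names copied from the H.264 and HEVC lists in this chapter.
HW_ENCODERS = {
    "h264_nvenc", "h264_qsv", "h264_vaapi", "h264_videotoolbox", "h264_amf",
    "hevc_nvenc", "hevc_qsv", "hevc_vaapi", "hevc_videotoolbox", "hevc_amf",
}


def parse_hw_encoders(encoders_text: str) -> list[str]:
    """Return the known hardware encoders named in `ffmpeg -encoders` output."""
    found = set()
    for line in encoders_text.splitlines():
        fields = line.split()
        # Encoder rows look like: " V....D h264_nvenc  NVIDIA NVENC H.264 encoder"
        if len(fields) >= 2 and fields[1] in HW_ENCODERS:
            found.add(fields[1])
    return sorted(found)


def compiled_hw_encoders() -> list[str]:
    """Ask a local ffmpeg binary (assumed to be on PATH) what it was built with."""
    if shutil.which("ffmpeg") is None:
        return []
    result = subprocess.run(
        ["ffmpeg", "-hide_banner", "-encoders"],
        capture_output=True, text=True, check=False,
    )
    return parse_hw_encoders(result.stdout)


if __name__ == "__main__":
    print(compiled_hw_encoders())
```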

The NVENC end-to-end matrix test validates profile, level, bitrate, and device behavior:
- NVENC matrix test
NVENC regression tests:
- Profile/level integration test
- Duration-preservation regression

direct_play_nice uses ONNX Runtime providers for PP-OCR and has explicit legacy-NVIDIA logic in auto mode.

NVIDIA CUDA path for PP-OCRv3/PP-OCRv4 (primary validated path)
Legacy NVIDIA behavior: if nvidia-smi reports a compute capability major version <= 5 (Maxwell-class and older), --ocr-engine auto prefers pp-ocr-v3 and disables the classifier for stability
Windows DirectML and Apple CoreML provider paths are wired in and can be used when the corresponding runtimes are installed⁹ ¹⁰ ¹¹
CPU fallback is available (or forced with DPN_OCR_FORCE_CPU=1)
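The legacy-NVIDIA rule above can be sketched roughly as follows. This is an illustration of the described behavior, not the project's actual code. The function names are hypothetical, only the first GPU reported is considered, and the pp-ocr-v4 result with the classifier enabled for newer GPUs is an assumption based on the recommendations in this chapter. The `compute_cap` query field needs a reasonably recent NVIDIA driver.

```python
import os
import subprocess


def parse_compute_major(smi_output: str):
    """Return the compute-capability major version of the first GPU, or None."""
    for line in smi_output.splitlines():
        text = line.strip()
        if not text:
            continue
        try:
            return int(text.split(".")[0])  # "5.2" -> 5
        except ValueError:
            continue
    return None


def query_compute_major():
    """Run nvidia-smi; return None if it is missing or reports nothing usable."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        return None
    return parse_compute_major(result.stdout)


def choose_ocr_settings(compute_major):
    """Return (engine, use_classifier), or None when no NVIDIA GPU was detected."""
    if compute_major is None:
        return None  # non-NVIDIA and CPU paths are outside this sketch
    if compute_major <= 5:
        return ("pp-ocr-v3", False)  # documented legacy behavior
    return ("pp-ocr-v4", True)  # assumption, not confirmed by the source


if __name__ == "__main__":
    if os.environ.get("DPN_OCR_FORCE_CPU") == "1":
        print("CPU forced via DPN_OCR_FORCE_CPU=1")
    else:
        print(choose_ocr_settings(query_compute_major()))
```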

Older NVIDIA families (Maxwell/Pascal-era systems): prefer --ocr-engine pp-ocr-v3¹²
Newer NVIDIA families (Turing/Ampere/Ada): start with --ocr-engine pp-ocr-v4
Non-NVIDIA GPUs: use auto and verify provider availability with probe logs; if no GPU provider is available, OCR falls back to the CPU/Tesseract path
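For a quick look at which ONNX Runtime execution providers exist on a host, a minimal sketch using the Python `onnxruntime` package is shown below. It assumes that package is installed, and it reports what the Python build offers, which may differ from the runtime direct_play_nice loads, so the probe logs remain the authoritative check. The preference order and the `pick_provider` helper are hypothetical.

```python
# Provider names are ONNX Runtime's own; the ordering is an assumption
# chosen for illustration, not direct_play_nice's documented priority.
PREFERRED = [
    "CUDAExecutionProvider",
    "DmlExecutionProvider",     # Windows DirectML
    "CoreMLExecutionProvider",  # Apple CoreML
    "CPUExecutionProvider",
]


def pick_provider(available):
    """Return the first preferred provider present in `available`, else None."""
    for name in PREFERRED:
        if name in available:
            return name
    return None


def available_providers():
    """List providers from the Python onnxruntime package, if it is installed."""
    try:
        import onnxruntime as ort
    except ImportError:
        return []
    return ort.get_available_providers()


if __name__ == "__main__":
    providers = available_providers()
    print("available:", providers)
    print("would pick:", pick_provider(providers))
```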

Full-movie OCR benchmark (self-hosted Linux, PP-OCRv3 GPU-required profile): 87.62 FPS, 3.65x realtime
- OCR benchmark report
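The two benchmark figures are related by the source frame rate: the realtime factor is the processing rate divided by the video's frame rate. The source frame rate is not stated here, so the sketch below only derives the value implied by the reported numbers, which comes out near a film-rate source. The function names are hypothetical.

```python
def realtime_factor(processing_fps: float, source_fps: float) -> float:
    """How many times faster than playback the OCR pass runs."""
    return processing_fps / source_fps


def implied_source_fps(processing_fps: float, factor: float) -> float:
    """Source frame rate implied by a reported FPS and realtime factor."""
    return processing_fps / factor


if __name__ == "__main__":
    # Figures from the benchmark line above.
    print(round(implied_source_fps(87.62, 3.65), 2))  # 24.01
    # Assuming a 24 fps source reproduces the reported factor.
    print(round(realtime_factor(87.62, 24.0), 2))     # 3.65
```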
OCR AI/GPU paths are covered in integration tests:
- AI OCR stream coverage test
- Multilingual OCR accuracy/perf stress