direct-play-nice Manual

This manual is the operational guide for the direct_play_nice CLI.

Use it for:

installation and setup
first-run walkthroughs
full command/flag reference
Sonarr/Radarr automation
troubleshooting hardware, codec, and OCR issues

If you only need the quick install and a one-command example, use the project README.md.

Installation

Prebuilt binaries

Each GitHub release publishes platform archives and checksums built by cargo-dist:

direct_play_nice-aarch64-apple-darwin.tar.xz
direct_play_nice-x86_64-apple-darwin.tar.xz
direct_play_nice-x86_64-unknown-linux-gnu.tar.xz
direct_play_nice-x86_64-pc-windows-msvc.zip

The release also includes shell and PowerShell installers.

From crates.io

cargo install direct_play_nice

From source

git clone https://github.com/ns-mkusper/direct-play-nice.git
cd direct-play-nice
cargo build --release

Binary path:

target/release/direct_play_nice

Getting Started

First conversion

direct_play_nice input.mkv output.mp4

Set up a default config file

If you run this tool often, create a config once and keep day-to-day commands short.

By default, direct-play-nice reads:

$XDG_CONFIG_HOME/direct-play-nice/config.toml (when XDG_CONFIG_HOME is set)
~/.config/direct-play-nice/config.toml

Create the file:

mkdir -p ~/.config/direct-play-nice
cat > ~/.config/direct-play-nice/config.toml <<'EOF'
streaming_devices = "all"
video_quality = "match-source"
video_codec = "auto"
audio_quality = "192k"
hw_accel = "auto"
unsupported_video_policy = "ignore"
sub_mode = "auto"
ocr_default_language = "eng"
servarr_output_extension = "mp4"
servarr_output_suffix = ".fixed"

[plex]
refresh = false
EOF

Why these defaults are sane

streaming_devices = "all" keeps output compatible across all built-in device profiles.
video_quality = "match-source" avoids unnecessary downscaling by default.
video_codec = "auto" lets the tool pick the safest codec intersection.
audio_quality = "192k" is a practical bitrate for broad AAC compatibility.
hw_accel = "auto" uses hardware encoding when available and falls back to software when not.
unsupported_video_policy = "ignore" skips extra video streams that can break muxing in common container/player paths.
sub_mode = "auto" only OCRs bitmap subtitles when needed.
ocr_default_language = "eng" gives OCR a stable fallback language.
servarr_output_extension = "mp4" targets the most portable container for direct play.
servarr_output_suffix = ".fixed" makes replaced files easy to identify during rollout.

Override order

When the same option appears in multiple places, priority is:

CLI flags (highest)
--config <path>
DIRECT_PLAY_NICE_CONFIG=<path>
Default config location above

Device targeting

Use --device to narrow compatibility constraints:

direct_play_nice --device chromecast input.mkv output.mp4
direct_play_nice --device chromecast,roku input.mkv output.mp4

--device all (or omitting --device) computes a profile compatible across all built-in device definitions.

Inspect an input before converting

direct_play_nice --probe-streams input.mkv
direct_play_nice --probe-streams --output json input.mkv

Command Reference

Synopsis

direct_play_nice [OPTIONS] [INPUT_FILE] [OUTPUT_FILE]

Positional arguments

[INPUT_FILE] video file to convert (required unless probing)
[OUTPUT_FILE] output media file (required unless probing)

Device and quality options

-d, --device <DEVICE> target family/model or all
--video-quality <video_quality> quality preset
--video-codec <video_codec> auto|h264|hevc
--audio-quality <audio_quality> quality preset
--max-video-bitrate <max_video_bitrate> explicit video cap
--max-audio-bitrate <max_audio_bitrate> explicit audio cap

Stream and compatibility controls

--unsupported-video-policy <unsupported_video_policy> convert|ignore|fail
--primary-video-stream-index <primary_video_stream_index>
--primary-video-criteria <primary_video_criteria> resolution|bitrate|fps
--skip-codec-check
--validate-output reopen the completed output and fail if expected A/V codecs or stream hygiene checks do not pass

Hardware controls

--hw-accel <hw_accel> auto|none|nvenc|vaapi|qsv|videotoolbox|amf

Probe modes

--probe-streams
--probe-hw
--probe-codecs
--probe-ocr-fixtures <PATH> evaluate OCR accuracy against fixture PNG+JSON pairs
--only-video
--only-hw
--probe-json
--output <OUTPUT> text|json
--streams-filter <STREAMS_FILTER> all|video|audio|subtitle

Sonarr/Radarr options

--servarr-output-extension <EXTENSION> (match-input supported)
--servarr-output-suffix <servarr_output_suffix>
--delete-source [<BOOL>]

Subtitle OCR options

--sub-mode <sub_mode> auto|force|skip
--subtitle-failure-policy <subtitle_failure_policy> skip-stream|fail
--ocr-default-language <ocr_default_language>
--ocr-engine <ocr_engine> auto|tesseract|pp-ocr-v3|pp-ocr-v4|external
--ocr-format <ocr_format> srt|ass
--ocr-write-srt-sidecar

Failure policy

Extra video streams follow --unsupported-video-policy: ignore drops them, fail aborts, and convert attempts to include them when the output container supports it.
Attachments, data streams, and attached pictures are treated as metadata and skipped for direct-play outputs.
Bitmap subtitles are handled by the OCR side pass unless --sub-mode=skip; text subtitles are converted to MP4-compatible timed text when included.
Subtitle decode/encode timestamp failures follow --subtitle-failure-policy: skip-stream warns and disables only that subtitle stream, while fail aborts conversion.
Audio conversion setup is fail-fast: if FFmpeg cannot initialize the required resampler, conversion aborts rather than writing suspect audio.
Hardware encoder/profile failures may retry with a safer software encoder path; decoder bitstream failures remain hard failures.

Plex options

--plex-refresh
--plex-url <PLEX_URL>
--plex-token <PLEX_TOKEN>

Full generated help

For the exact, version-specific clap output:

direct_play_nice --help

Architecture

direct-play-nice converts media by separating policy decisions from FFmpeg’s packet loop.

Conversion Model

Runtime configuration is resolved from CLI arguments and optional config.
Device profiles are intersected to choose a target container, video codec, audio codec, resolution limits, bitrate limits, and H.264 constraints.
The input is probed for direct-play compatibility. FFmpeg’s detected demuxer is the primary container signal; filename extension is used only to disambiguate MOV-family containers or as a fallback.
Conversion creates an explicit input-to-output stream map. Input stream indexes are used only for demuxed packets; output stream indexes are used only for encoded packets and muxer metadata.
Video, audio, and subtitle streams are decoded, transformed, encoded, and muxed through separate stream-processing paths.
Optional OCR post-processing handles bitmap subtitles and remuxes generated text subtitles into the final output.
Post-write verification checks H.264 constraints, and --validate-output can reopen the final media file to verify expected stream codecs and stream hygiene.

FFmpeg Boundaries

Most FFmpeg operations use rsmpeg wrappers. Raw pointer access is isolated in small helpers where possible:

ffmpeg_ext contains metadata reads, packet allocation, buffer unref, and other narrow unsafe operations.
timestamp contains shared timestamp selection, rescaling, and monotonic DTS adjustment.
pipeline_streams owns per-packet video/audio/subtitle processing.
pipeline_codec owns encoder setup and rate/profile options.
pipeline_assessment owns direct-play compatibility explanations.

Playable A/V streams are preserved by conversion. Attachments, data streams, and attached pictures are metadata for direct-play targets and are skipped. Extra video streams are governed by --unsupported-video-policy. Text subtitles are converted when included; bitmap subtitles are deferred to OCR unless subtitle processing is skipped.

Failures are intentionally policy-driven. Audio setup failures abort because they would risk invalid audio. Subtitle stream failures follow --subtitle-failure-policy: the default isolates the bad subtitle stream so A/V conversion can still complete, while strict mode aborts. Hardware encoder failures may retry with software when a safe fallback is available.

Quality Controls

By default, the CLI preserves source quality (match-source) for both video and audio.

Video quality presets

--video-quality supports:

match-source
360p
480p
720p
1080p
1440p
2160p

These presets apply resolution caps and target bitrate ranges suitable for common direct-play scenarios.

Audio quality presets

--audio-quality supports:

match-source
320k
256k
224k
192k
160k
128k
96k

Custom bitrate overrides

Use these for explicit control:

--max-video-bitrate <RATE> (for example 4800k, 6M, 12.5mbps)
--max-audio-bitrate <RATE>

When overrides are provided, they constrain the selected quality profile.

Subtitle OCR

Bitmap subtitle formats (PGS/VobSub/DVD) are not directly compatible with MP4 Direct Play in many client stacks. direct_play_nice can OCR bitmap subtitles into text tracks using AI OCR backends (PP-OCR/Tesseract). This path is meant for bitmap subtitle streams; text subtitles are muxed directly when compatible.

For official GPU architecture/provider references and compatibility links, see Hardware Acceleration.

Defaults

--sub-mode auto
--ocr-engine auto
--ocr-format srt

Common overrides

--sub-mode skip disable subtitle processing
--sub-mode force force subtitle processing
--ocr-engine pp-ocr-v4 force PP-OCR v4 pipeline
--ocr-engine pp-ocr-v3 fallback for older GPU/runtime combinations
--ocr-format ass request ASS (may be downgraded in MP4)
--ocr-write-srt-sidecar write .srt sidecars in addition to embedded output

GPU behavior

The OCR runtime attempts provider fallback when available (for example CUDA, DirectML, CoreML, then CPU). You can force behavior with:

DPN_OCR_REQUIRE_GPU=1
DPN_OCR_FORCE_CPU=1

ONNX engines:

--ocr-engine pp-ocr-v4 for modern GPU/runtime stacks
--ocr-engine pp-ocr-v3 for legacy/older GPU compatibility cases

Linux runtime notes:

Ensure CUDA/cuDNN and ONNX Runtime are version-compatible.
ORT_DYLIB_PATH=/path/to/libonnxruntime.so can be used if ONNX Runtime is not discoverable on default library paths.
For older NVIDIA stacks, --ocr-engine pp-ocr-v3 can be more stable than pp-ocr-v4.
Use scripts/ocr-tools/check_gpu_env.sh to inspect runtime/library setup.
Containerized workloads may need NVIDIA Container Toolkit and exposed runtime libraries.

Model location

Models are downloaded to a default model directory unless DPN_OCR_MODEL_DIR is set.

Default model filenames:

v4: ch_PP-OCRv4_det_infer.onnx, ch_ppocr_mobile_v2.0_cls_infer.onnx, en_PP-OCRv4_rec_infer.onnx
v3: ch_PP-OCRv3_det_infer.onnx, ch_ppocr_mobile_v2.0_cls_train.onnx, en_PP-OCRv3_rec_infer.onnx

Optional profile rec models are also auto-provisioned (downloaded on first use if missing in the model directory):

latin_PP-OCRv3_rec_mobile.onnx
japan_PP-OCRv4_rec_mobile.onnx
korean_PP-OCRv4_rec_mobile.onnx
chinese_cht_PP-OCRv3_rec_mobile.onnx

Override paths for these optional profiles with:

DPN_OCR_REC_LATIN_MODEL
DPN_OCR_REC_MULTILINGUAL_MODEL
DPN_OCR_REC_JAPANESE_MODEL
DPN_OCR_REC_KOREAN_MODEL
DPN_OCR_REC_CJK_MODEL

DPN_OCR_REC_MULTILINGUAL_MODEL is local-first: if unset, OCR auto-detects a compatible multilingual recognizer already present in the model directory (for example multilingual_PP-OCRv4_rec_infer.onnx) and uses it when script routing targets multilingual coverage. Unlike latin/japanese/korean/cjk profiles, this profile is not downloaded automatically.

Override recognition profile routing (language -> profile) with:

DPN_OCR_REC_PROFILE_OVERRIDES Example: spa=latin,rus=multilingual,sr-Latn=latin Script tags are also recognized automatically (for example zh-Hant, sr-Cyrl, sr-Latn).
DPN_OCR_LANGUAGE_SCRIPT_HINTS Example: rus=Cyrl,ara=Arab,srp=Cyrl
DPN_OCR_ROUTING_MANIFEST Path to custom TOML routing manifest (default: config/ocr-routing.toml in the repo source tree).

Config-file example

sub_mode = "auto"           # auto | force | skip
ocr_default_language = "eng"
ocr_engine = "auto"         # auto | tesseract | pp-ocr-v3 | pp-ocr-v4 | external
ocr_format = "srt"          # srt | ass
ocr_write_srt_sidecar = false
ocr_external_command = "python3 /opt/ocr/run.py"

Hardware Acceleration

This chapter covers GPU acceleration in direct_play_nice for:

H.264/HEVC hardware transcoding via FFmpeg encoders
AI OCR for bitmap subtitle streams (PGS/VobSub/DVD)

Transcoding acceleration

direct_play_nice hardware encoder selection is currently targeted at H.264 and HEVC output.

Codec and hardware mapping implemented by the CLI

H.264 hardware encoders: h264_nvenc, h264_qsv, h264_vaapi, h264_videotoolbox, h264_amf¹ ² ³ ⁴ ⁵ ⁶
HEVC hardware encoders: hevc_nvenc, hevc_qsv, hevc_vaapi, hevc_videotoolbox, hevc_amf¹ ² ³ ⁴ ⁵ ⁶
Backend availability is OS/build dependent and discovered at runtime⁷ ⁸ ⁶ ⁵

You can inspect your current host/build support with:

direct_play_nice --probe-hw --probe-codecs --only-video --only-hw --probe-json

Transcoding performance and validation artifacts

NVENC end-to-end matrix test validates profile/level/bitrate/device behavior:
- NVENC matrix test
NVENC regression tests:
- Profile/level integration test
- Duration-preservation regression

OCR acceleration (bitmap subtitles)

direct_play_nice uses ONNX Runtime providers for PP-OCR and has explicit legacy-NVIDIA logic in auto mode.

What is supported in this project

NVIDIA CUDA path for PP-OCRv3/PP-OCRv4 (primary validated path)
Legacy NVIDIA behavior: if nvidia-smi reports compute capability major <= 5 (Maxwell-class and older), --ocr-engine auto prefers pp-ocr-v3 and disables classifier for stability
Windows DirectML and Apple CoreML provider paths are wired and can be used when runtimes are installed⁹ ¹⁰ ¹¹
CPU fallback is available (or forced with DPN_OCR_FORCE_CPU=1)

OCR workload guidance by hardware class

Older NVIDIA families (Maxwell/Pascal-era systems): prefer --ocr-engine pp-ocr-v3¹²
Newer NVIDIA families (Turing/Ampere/Ada): start with --ocr-engine pp-ocr-v4
Non-NVIDIA GPUs: use auto and verify provider availability with probe logs; if providers are unavailable, OCR falls back to CPU/Tesseract path

OCR performance and validation artifacts

Full-movie OCR benchmark (self-hosted Linux, PP-OCRv3 GPU-required profile): 87.62 FPS, 3.65x realtime
- OCR benchmark report
OCR AI/GPU paths are covered in integration tests:
- AI OCR stream coverage test
- Multilingual OCR accuracy/perf stress

NVIDIA FFmpeg acceleration guide: https://docs.nvidia.com/video-technologies/video-codec-sdk/13.0/ffmpeg-with-nvidia-gpu/index.html. ↩ ↩2
NVIDIA NVENC programming guide: https://docs.nvidia.com/video-technologies/video-codec-sdk/13.0/nvenc-video-encoder-api-prog-guide/index.html. ↩ ↩2
FFmpeg h264_qsv / hevc_qsv encoder options: https://ffmpeg.org/ffmpeg-codecs.html#QSV-Encoders. ↩ ↩2
FFmpeg h264_vaapi / hevc_vaapi encoder options: https://ffmpeg.org/ffmpeg-codecs.html#VAAPI-encoders. ↩ ↩2
Apple VideoToolbox framework: https://developer.apple.com/documentation/videotoolbox. ↩ ↩2 ↩3
AMD AMF SDK: https://github.com/GPUOpen-LibrariesAndSDKs/AMF. ↩ ↩2 ↩3
NVIDIA Video encode/decode support matrix: https://developer.nvidia.com/video-encode-and-decode-gpu-support-matrix-new. ↩
Intel oneVPL supported hardware: https://www.intel.com/content/www/us/en/docs/onevpl/upgrade-from-msdk/2021-3/supported-hardware.html. ↩
ONNX Runtime CUDA execution provider: https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html. ↩
ONNX Runtime DirectML execution provider: https://onnxruntime.ai/docs/execution-providers/DirectML-ExecutionProvider.html. ↩
ONNX Runtime CoreML execution provider: https://onnxruntime.ai/docs/execution-providers/CoreML-ExecutionProvider.html. ↩
NVIDIA CUDA GPU compute capability list: https://developer.nvidia.com/cuda-gpus/. ↩

Plex Refresh

To avoid the “Plex dance” (manual library rescans after conversion), direct_play_nice can trigger a targeted Plex refresh automatically.

CLI options

--plex-refresh
--plex-url <PLEX_URL> (default: http://127.0.0.1:32400)
--plex-token <PLEX_TOKEN>

Environment variable equivalents:

DIRECT_PLAY_NICE_PLEX_REFRESH=true
DIRECT_PLAY_NICE_PLEX_URL=http://127.0.0.1:32400
DIRECT_PLAY_NICE_PLEX_TOKEN=... (or PLEX_TOKEN)

Example:

direct_play_nice \
  --plex-refresh \
  --plex-url http://127.0.0.1:32400 \
  --plex-token "$PLEX_TOKEN" \
  input.mkv output.mp4

Config file equivalent

[plex]
refresh = true
url = "http://127.0.0.1:32400"
token = "YOUR_TOKEN"

Need a token? See Plex support: https://support.plex.tv/articles/204059436-finding-an-authentication-token-x-plex-token/

Sonarr/Radarr Integration

direct-play-nice can be wired as a custom script in Sonarr/Radarr pipelines.

High-level flow

Arr service imports media.
Custom script invokes direct_play_nice.
Successful conversion output replaces source according to configured behavior.

Event behavior

The binary auto-detects Sonarr/Radarr custom-script invocations:

sonarr_eventtype=Download and radarr_eventtype=Download trigger conversion.
Non-download events (for example Test, Grab, Rename) exit cleanly.

Naming and replacement notes

Use --servarr-output-extension and --servarr-output-suffix to control output naming.
--delete-source applies to direct CLI usage.
In Sonarr/Radarr mode, replacement/rollback logic is handled by integration flow.
--servarr-output-extension match-input keeps the source container.

Example command in Sonarr custom script:

/path/to/direct_play_nice --config-file /path/to/direct-play-nice-sonarr.toml

Practical wrapper pattern

For GPU OCR environments, keep a stable wrapper script as the command Sonarr or Radarr calls. This keeps runtime paths and OCR flags centralized.

Example wrapper:

#!/usr/bin/env bash
set -euo pipefail

export ORT_DYLIB_PATH=\"/opt/onnxruntime/lib/libonnxruntime.so\"
export LD_LIBRARY_PATH=\"/opt/onnxruntime/lib:${LD_LIBRARY_PATH:-}\"

exec /path/to/direct_play_nice --config-file /path/to/direct-play-nice-sonarr.toml

This avoids drift between manual shell runs and Arr-triggered runs.

For service-specific script setup, see the Servarr docs:

Probe and Debug

Hardware probe

direct_play_nice --probe-hw
direct_play_nice --probe-hw --probe-json

Codec inventory

direct_play_nice --probe-codecs --only-video
direct_play_nice --probe-codecs --only-video --only-hw --probe-json

Input stream probe

direct_play_nice --probe-streams input.mkv
direct_play_nice --probe-streams --output json input.mkv

Concurrency controls

DIRECT_PLAY_NICE_MAX_JOBS
DIRECT_PLAY_NICE_JOBS_PER_GPU

Use these to tune parallel conversion throughput per machine.

For GPU architecture guidance and official vendor/runtime references (ONNX, CUDA, DirectML, Video Codec SDK, oneVPL, AMF, VideoToolbox), see Hardware Acceleration.

OCR runtime diagnostics (Linux)

verify ONNX Runtime linkage if OCR provider loading fails
set ORT_DYLIB_PATH when libonnxruntime.so is in a non-standard location
set DPN_OCR_REQUIRE_GPU=1 for fail-fast behavior when GPU OCR is mandatory

Troubleshooting

`--probe-hw` shows no usable hardware encoders

confirm FFmpeg build includes your target encoder (h264_nvenc, h264_qsv, etc.)
run --probe-codecs --only-video --only-hw
set --hw-accel none as a fallback path while debugging

OCR falls back to CPU unexpectedly

verify runtime libraries for your platform/provider are installed
use DPN_OCR_REQUIRE_GPU=1 to fail fast instead of silently falling back
try --ocr-engine pp-ocr-v3 on older GPUs/runtime stacks

Output not directly playable on a target client

verify target selection (--device)
inspect stream details with --probe-streams
set explicit bitrate/quality limits to match endpoint constraints
compare against SUPPORTED_DEVICES.md

Build and Test

Build from source

cargo install cargo-vcpkg
cargo vcpkg build
cargo build

If your vcpkg checkout is in a non-default location, set VCPKG_ROOT.

Run tests

cargo test

Run integration tests requiring ffmpeg CLI:

VCPKG_ROOT=/opt/vcpkg cargo test --features ffmpeg-cli-tests

Optional NVENC regression suite

ENABLE_NVENC_TESTS=1 cargo test nvenc_matrix -- --test-threads=1

Optional direnv setup

export VCPKG_ROOT=/opt/vcpkg
export RUST_LOG=${RUST_LOG:-WARN}

Quality gates

Run the same strict checks used in CI before opening a PR:

cargo fmt --all -- --check
cargo clippy --all-targets --all-features -- -D warnings
RUSTDOCFLAGS="-D warnings" cargo doc --no-deps --document-private-items
cargo test --no-run

Release Process

Use this process when cutting a new source and binary release.

Prepare the Release PR

Pick the next version.

Prerelease versions use SemVer suffixes such as 0.1.0-beta.3, 0.1.0-alpha.12, or 0.1.0-rc.1. GitHub and crates.io treat those as prereleases. Stable releases do not have a suffix, for example 0.1.0.
Update Cargo.toml.
```
version = "0.1.0-beta.3"
```
Add a concrete section to CHANGELOG.md.

Do not leave release notes only under [Unreleased]. The binary release workflow extracts the section that matches the tag.
```
## [0.1.0-beta.3] - 2026-04-29

### Highlights

- Added or changed something user-visible.
- Fixed something release-worthy.
```
Open a PR with the version bump and changelog entry.

The release-readiness workflow checks whether release metadata will be updated on merge and verifies that CHANGELOG.md already contains concrete notes for the exact Cargo.toml version.

Merge and Verify

Merge the release PR to main.
Wait for the merge pipelines to pass:
- Continuous Deployment
- Benchmarks (Post-Merge)

Confirm the release tag exists.

git fetch --tags origin
git rev-parse v0.1.0-beta.3

Publish or Rerun Binaries

Use the Release workflow when binaries need to be built or repaired.

Open GitHub Actions.
Run the Release workflow manually.
Enter the exact tag, including the leading v.
```
v0.1.0-beta.3
```
Wait for all release jobs to pass.

Expected platform archives:
- direct_play_nice-aarch64-apple-darwin.tar.xz
- direct_play_nice-x86_64-apple-darwin.tar.xz
- direct_play_nice-x86_64-unknown-linux-gnu.tar.xz
- direct_play_nice-x86_64-pc-windows-msvc.zip
The workflow also publishes checksums, installers, dist-manifest.json, and the source archive.

Verify the Published Release

Check the GitHub release:

gh release view v0.1.0-beta.3 --json tagName,isDraft,isPrerelease,publishedAt,assets

Verify:

isDraft is false.
isPrerelease matches the version suffix.
The release body contains the matching CHANGELOG.md section.
All expected binary archives and checksum files are present.

Check crates.io:

curl -sS https://crates.io/api/v1/crates/direct_play_nice/0.1.0-beta.3

Notes

Manual Release workflow reruns update an existing GitHub release and replace assets with the same names.
The workflow resolves the GitHub release target from the requested tag, not from the branch used to dispatch the workflow.
Release-workflow benchmarks are skipped for manual binary reruns. The post-merge benchmark pipeline remains the release gate.

Keyboard shortcuts

direct-play-nice Manual