This may be faster or slower than the existing specialized kernels,
so I opted not to prefer it by default. I also deliberately didn't expose
additional filter function capabilities yet.
The main motivating reason here is to get correct anti-aliasing behavior
when downscaling, which is currently completely broken.
Signed-off-by: Niklas Haas <git@haasn.dev>
Deprecated in commit 09c53a04c5
on 2022-06-11.
Thanks to Michael Niedermayer for pointing out that
the documentation needs to be updated, too.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The decision to switch to checking peer certificates by default
at the next major version bump was announced on 2025-08-09
in commit 5621eee672.
Thanks to Michael Niedermayer for pointing out that the documentation
needs to be updated, too.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
libnpp and the corresponding filters were deprecated
in commit 994a368451
on 2025-09-26. By the time of our next release,
a year will have passed, so they are removed immediately.
Note: Passing --enable-libnpp to configure results in
a warning about the deprecation and is otherwise a no-op.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The shaping option is now exposed by both the ass and subtitles filters.
Document it under subtitles, update the ass filter description to refer
to the shared option set, and note that complex shaping is required for
Arabic, Hebrew, Devanagari and Thai and depends on a HarfBuzz-enabled
libass build.
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
This patch adds ONNX Runtime as a new DNN backend for FFmpeg's dnn_processing
filter, enabling hardware-accelerated neural network inference on multiple
GPU and NPU platforms.
Execution Providers Supported:
- CPU execution provider (default)
- CUDA execution provider (NVIDIA GPUs)
- DirectML execution provider (AMD/Intel/NVIDIA GPUs on Windows)
- VitisAI execution provider (AMD Ryzen AI NPU)
The options for dnn_processing with dnn_backend=onnx:
- device: execution provider — cpu, cuda, dml, or vitisai (default: cpu)
- device_id: GPU device index (default: 0)
- threads_per_operation: inference thread count for CPU EP (default: 0, auto)
- input: input tensor name. When omitted, the backend resolves it from the loaded session
- output: output tensor name. When omitted, the backend resolves it from the loaded session
Example usage:
# CPU inference
ffmpeg -i input.mp4 -vf "format=rgb24,dnn_processing=dnn_backend=onnx:model=model.onnx:input=image_in:output=image_out" output.mp4
# CUDA GPU inference
ffmpeg -i input.mp4 -vf "dnn_processing=dnn_backend=onnx:model=model.onnx:device=cuda:device_id=0" output.mp4
# DirectML GPU inference (Windows)
ffmpeg -i input.mp4 -vf "dnn_processing=dnn_backend=onnx:model=model.onnx:device=dml:device_id=0" output.mp4
# VitisAI NPU inference
ffmpeg -i input.mp4 -vf "dnn_processing=dnn_backend=onnx:model=model.onnx:device=vitisai" output.mp4
Note: depending on the model, you may need a format filter (e.g. format=rgb24 or format=grayf32) before dnn_processing to convert the frames to the pixel format the model's input tensor expects.
Signed-off-by: younengxiao <steven.xiao@amd.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
The lrc muxer has a precision option controlling the number of
fractional digits written in each timestamp, but it was not documented.
Add it to the lrc section, including its range and default.
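
As an illustration of what a fractional-digit precision setting means for an
LRC timestamp, here is a minimal Python sketch. It is not the muxer's code:
the function name, the millisecond input, the 0..3 limit and the choice to
truncate rather than round are all assumptions of this sketch, and the
option's real range and default are the ones given in the lrc section.

```python
def lrc_timestamp(ms: int, precision: int) -> str:
    """Format a millisecond time as an LRC tag such as [01:23.45].

    Hypothetical helper: `precision` is the number of fractional-second
    digits. The 0..3 limit only reflects this sketch's millisecond input.
    """
    if ms < 0 or not 0 <= precision <= 3:
        raise ValueError("unsupported time or precision for this sketch")
    minutes, rem = divmod(ms, 60_000)
    seconds, frac_ms = divmod(rem, 1000)
    if precision == 0:
        return f"[{minutes:02d}:{seconds:02d}]"
    # Drop the digits beyond the requested precision (truncation, not rounding).
    frac = frac_ms // 10 ** (3 - precision)
    return f"[{minutes:02d}:{seconds:02d}.{frac:0{precision}d}]"
```

For example, lrc_timestamp(83456, 2) yields [01:23.45] and
lrc_timestamp(83456, 3) yields [01:23.456].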
Signed-off-by: Bogdan Lisman <bogdan@pydevsolutions.com>
This value is matched to the typical seek latency in a reasonably capable
7200 rpm disk device, as well as the typical latency of an on-premise HTTP
request.
Note that this change should rarely have a significant effect, because
it only matters when using multiple concurrent processes, and one process
is somehow stuck in I/O (or has died). Since we sleep in a loop for 1/16th of
the requested timeout value, this should only increase the effective read
latency by up to ~500 us on top of the actual underlying latency.
The alternative is hammering the same underlying resource with the exact
same requests at the exact same time (e.g. during init).
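
The polling behavior described above can be sketched roughly as follows. This
is a Python illustration, not the actual implementation: the function name and
the injectable sleep callback are hypothetical, and the 8000 us timeout in the
usage note is only inferred from the ~500 us figure (8000 / 16 = 500); it is
not stated in this message.

```python
import time

def wait_for(ready, timeout_us: int, sleep=time.sleep) -> bool:
    """Poll ready() until it returns True or timeout_us has been slept away.

    Each sleep lasts 1/16th of the timeout, so a block that becomes
    available is noticed at most one step later.
    """
    step_us = timeout_us // 16
    waited_us = 0
    while not ready():
        if waited_us >= timeout_us:
            return False  # the other process is stuck or gone; caller takes over
        sleep(step_us / 1_000_000)  # time.sleep() takes seconds
        waited_us += step_us
    return True
```

With timeout_us=8000 each step is 500 us, which is where the upper bound on the
added read latency comes from in this sketch.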
Sponsored-by: nxtedition AB
Signed-off-by: Niklas Haas <git@haasn.dev>
This will effectively disable the cache but allows the cache layer to verify
cached files against the original input file. Useful only for debugging
the shared cache protocol itself, as file corruption can already be caught by
the CRC check.
This adds a new protocol shared:URI which is distinct from the existing
`cache:` in that it is explicitly designed to be thread-safe and cross-process,
enabling multiple ffmpeg processes (or multiple ffmpeg decoders within the same
process) to share a single cache file, for e.g. a remote HTTP stream. As such,
it uses a radically different internal design.
To facilitate zero-knowledge cross-process interoperability, the cache file
itself is just a memory-mapped representation of the underlying file data,
which has the side benefit that the resulting cache file will contain a
working copy of the streamed file (assuming the stream was read to
completion).
To keep track of which regions are cached and which are not, we use a
secondary file that contains a minimal header along with a static bytemap of
blocks within the file. This secondary file is also used to store metadata
such as the filesize, if known, as well as marking "failed" blocks.
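
To make the bytemap idea concrete, here is a small Python sketch with one
state byte per block. The block size, the state values and the helper names
are assumptions chosen for illustration; the real on-disk layout, header and
locking are not shown and may differ.

```python
BLOCK_SIZE = 4096                  # hypothetical block size
MISSING, CACHED, FAILED = 0, 1, 2  # hypothetical per-block state values

def blocks_for_range(offset: int, length: int) -> range:
    """Indices of all blocks touched by [offset, offset + length), length > 0."""
    first = offset // BLOCK_SIZE
    last = (offset + length - 1) // BLOCK_SIZE
    return range(first, last + 1)

def is_cached(bytemap: bytearray, offset: int, length: int) -> bool:
    """True only if every touched block is marked CACHED.

    Blocks beyond the end of the map count as not cached.
    """
    return all(b < len(bytemap) and bytemap[b] == CACHED
               for b in blocks_for_range(offset, length))

def mark(bytemap: bytearray, offset: int, length: int, state: int) -> None:
    """Set the state of every touched block; callers pass block-aligned ranges."""
    for b in blocks_for_range(offset, length):
        bytemap[b] = state
```

A read that straddles a block boundary is served from the cache only when both
blocks are present; otherwise the missing blocks have to be fetched first.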
Both files can grow dynamically in order to accommodate larger/growing files,
and can be atomically updated (through the use of shared space maps). I have
extensively checked the space map initialization and update code for race
conditions, and I believe the current design to be solid.
That said, it is to some extent the user's responsibility to ensure that the
same URI is not used for different streams, as we rely on the URI to uniquely
identify the cache files. To protect against possible abuse, we use a
cryptographic hash with sufficient collision resistance. The lack of
any implicit default on `-cache_dir` also means that `shared:` can't be enabled
via URL injection to possibly access random files on the disk (or intentionally
leak content from other streams with similar URIs, even if the cryptographic
hash function is broken).
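
The URI-to-cache-file mapping described above could look roughly like the
following Python sketch. SHA-256, the file suffixes and the function name are
assumptions made for illustration; this message does not say which hash or
naming scheme is used. The sketch also refuses to run without an explicit
cache directory, mirroring the absence of an implicit `-cache_dir` default.

```python
import hashlib
from pathlib import Path

def cache_paths(cache_dir: str, uri: str) -> tuple[Path, Path]:
    """Derive (data file, map file) paths for a URI inside cache_dir.

    Hypothetical naming scheme: <hex digest>.data and <hex digest>.map.
    """
    if not cache_dir:
        # No implicit default: the caller must opt in with a directory.
        raise ValueError("no cache directory configured")
    digest = hashlib.sha256(uri.encode("utf-8")).hexdigest()  # assumed hash
    base = Path(cache_dir) / digest
    return base.with_suffix(".data"), base.with_suffix(".map")
```

Two processes opening the same URI with the same directory arrive at the same
pair of files, while the digest keeps attacker-chosen URI text out of the
path itself.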
If the input is expected to grow, we shouldn't make any assumptions about
the file size. This matches e.g. the behavior of streamed protocols like
chunked HTTP, which similarly return ENOSYS for streams of unknown size.
Sponsored-by: nxtedition AB
Signed-off-by: Niklas Haas <git@haasn.dev>
This allows constraining the set of available backends. This serves as a
better replacement for the "unstable" flag, which is a bit ambiguous. It allows
users to, for example, opt into the memcpy or x86 backend, while excluding
e.g. the upcoming JIT backends.
Signed-off-by: Niklas Haas <git@haasn.dev>
These are needed for interop with e.g. libplacebo, which needs to know the
correct flags to call vkGetDeviceQueue2.
Signed-off-by: Niklas Haas <git@haasn.dev>
It's been replaced with AVStreamGroupLayeredVideo, which is functionally the
same while generic enough to be shared with other kinds of layered video
implementations.
Signed-off-by: James Almer <jamrial@gmail.com>
Carries a raw HEVCDecoderConfigurationRecord for the Dolby Vision
enhancement layer, parsed from the hvcE box (ISOM) or the corresponding
BlockAdditionMapping (Matroska).
libcelt, which it depends on, has not been updated in a very long time and is
considered deprecated, since Opus exists and has a CELT mode. Therefore
remove standalone CELT decoding support.
It has been broken since b8604a9761,
11 years ago, and no one noticed or complained.
According to Chapter 3, Paragraph 2 of the "SI Brochure - 9th ed./version 3.02":
> Prefix symbols are printed in upright typeface, as are unit symbols,
> regardless of the typeface used in the surrounding text and are
> attached to unit symbols without a space between the prefix symbol
> and the unit symbol.
https://www.bipm.org/documents/20126/41483022/SI-Brochure-9-EN.pdf
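
A short Python sketch of the quoted rule, for illustration only: the helper
name and the set of prefixes are assumptions, not code from this patch. The
prefix is attached directly to the unit symbol, while the number and the unit
stay separated by a space.

```python
def format_si(value: float, unit: str) -> str:
    """Format value with a decimal SI prefix attached directly to the unit."""
    prefixes = ["", "k", "M", "G", "T"]  # decimal prefixes, steps of 1000
    v = float(value)
    i = 0
    while abs(v) >= 1000 and i < len(prefixes) - 1:
        v /= 1000
        i += 1
    # Space between number and unit, none between prefix and unit symbol.
    return f"{v:g} {prefixes[i]}{unit}"
```

For example, format_si(48000, "Hz") gives 48 kHz and format_si(1500, "B")
gives 1.5 kB.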
av_program_add_stream_index(), added in 526efa1053,
may fail to carry out its purpose, but the lack of
a return value stops callers from catching any error.
Fixed in new function.
This patch adds the transpose_cuda video filter.
It's similar to the existing transpose filter but accelerated by CUDA.
It supports the same pixel formats as the scale_cuda filter.
This also supersedes the deprecated transpose_npp filter.
Example usage:
ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i <INPUT> -vf "transpose_cuda=dir=clock" <OUTPUT>
Signed-off-by: nyanmisaka <nst799610810@gmail.com>
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
Add a top-level title and demote former section headings (MD041-style hierarchy).
Add blank lines around headings and fenced code blocks where appropriate (MD022 and MD031-style). Some Markdown parsers, including kramdown, only recognize headings that are preceded by a blank line.
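
The heading part of the blank-line rule can be sketched in Python as below.
This is a hypothetical helper, not the tool used for this change: it handles
only ATX headings ("# ...") and toggles on ``` fences so that comment lines
inside code blocks are left alone.

```python
def pad_headings(text: str) -> str:
    """Ensure a blank line before and after each ATX heading outside fences."""
    lines = text.split("\n")
    out = []
    in_fence = False
    for i, line in enumerate(lines):
        if line.startswith("```"):
            in_fence = not in_fence
        is_heading = (not in_fence and line.startswith("#")
                      and line.lstrip("#").startswith(" "))
        if is_heading and out and out[-1] != "":
            out.append("")  # blank line before the heading
        out.append(line)
        if is_heading and i + 1 < len(lines) and lines[i + 1] != "":
            out.append("")  # blank line after the heading
    return "\n".join(out)
```

Running it twice gives the same result as running it once, so it is safe to
apply to files that already follow the convention.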
This incorrectly lists the libavcodec major version as 60 instead of
62. Also fix the date and commit hash while at it.
Fixes: 7faa6ee2aa ("libavformat/matroska: Support smpte 2094-50 metadata")
Signed-off-by: llyyr <llyyr.public@gmail.com>