Commit Graph

502 Commits

Author SHA1 Message Date
Shiyou Yin f3fe2cb5f7
swscale: [LA] Optimize range convert for yuvj420p.
Reviewed-by: 陈昊 <chenhao@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-04-11 23:53:41 +02:00
Michael Niedermayer 1a9eda65d0
swscale/utils: Fix xInc overflow
Fixes: signed integer overflow: 2 * 1073741824 cannot be represented in type 'int'
Fixes: 67802/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-6249515855183872

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-04-04 19:38:29 +02:00
Andreas Rheinhardt 790f793844 avutil/common: Don't auto-include mem.h
There are lots of files that don't need it: The number of object
files that actually need it went down from 2011 to 884 here.

Keep it for external users in order to not cause breakages.

Also improve the other headers a bit while just at it.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-31 00:08:43 +01:00
Andreas Rheinhardt 2d38141ea6 swscale/swscale_internal: Don't export internal function
sws_alloc_set_opts() can actually be made internal to utils.c.
This commit does so.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-31 00:08:42 +01:00
Michael Niedermayer e9cc9e492f
libswscale/utils: Fix bayer to yuvj
Fixes: out of array access.

Earlier code assumes that a unscaled bayer to yuvj420 converter exists
but the later code then skips yuvj420

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-02-21 18:24:17 +01:00
Michael Niedermayer f9906911f0
Revert "swscale: fix sws_setColorspaceDetails after sws_init_context"
Suggested by: Niklas Haas in Ticket10824

Fixes: Assertion failure
Fixes: Ticket10824

This reverts commit cedf589c09.
2024-02-21 18:24:17 +01:00
Michael Niedermayer 18f26f8a2f
swscale/utils: Allocate more dithererror
Fixes: out of array read
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-02-21 18:24:16 +01:00
Niklas Haas d043e5c54c swscale: don't omit ff_sws_init_range_convert for high-bit
This was a complete hack seemingly designed to work around a different
bug, which was fixed in the previous commit. As such, there is no more
reason not to do this, as it simply breaks changing color range in
sws_setColorspaceDetails for no reason.
2023-11-09 12:53:35 +01:00
Niklas Haas cedf589c09 swscale: fix sws_setColorspaceDetails after sws_init_context
More commonly, this fixes the case of sws_setColorspaceDetails after
sws_getContext, since the latter implies sws_init_context.

The problem here is that sws_init_context sets up the range conversion
and fast path tables based on the values of srcRange/dstRange at init
time. This may result in locking in a "wrong" path (either using
unscaled fast path when range conversion later required, or using
scaled slow path when range conversion becomes no longer required).

There are two way outs:

1. Always initialize range conversion and unscaled converters, even if
   they will be unused, and extend the runtime check.
2. Re-do initialization if the values change after
   sws_setColorspaceDetails.

I opted for approach 1 because it was simpler and easier to reason
about.

Reword the av_log message to make it clear that this special converter
is not necessarily used, depending on whether or not there is range
conversion or YUV matrix conversion going on.
2023-11-09 12:53:35 +01:00
Paul B Mahol 29b673bdcf swscale: add GBRAP14 format support 2023-09-28 19:37:58 +02:00
Andreas Rheinhardt f8503b4c33 avutil/internal: Don't auto-include emms.h
Instead include emms.h wherever it is needed.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-09-04 11:04:45 +02:00
L. E. Segovia ddc1cd5cdd configure: Set WIN32_LEAN_AND_MEAN at configure time
Including winsock2.h or windows.h without WIN32_LEAN_AND_MEAN cause
bzlib.h to parse as nonsense, due to an instance of #define char small
in rpcndr.h.

See:

https://stackoverflow.com/a/27794577

Signed-off-by: L. E. Segovia <amy@amyspark.me>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-08-14 22:57:28 +03:00
Lynne 934525eae0
lsws: add in/out support for the new 12-bit 2-plane 422 and 444 pixfmts 2023-05-29 00:41:35 +02:00
Lu Wang 4501b1dfd7
swscale/la: Optimize the functions of the swscale series with lsx.
./configure --disable-lasx
ffmpeg -i ~/media/1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -s 640x480
-pix_fmt bgra -y /dev/null -an
before: 91fps
after:  160fps

Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-05-25 21:05:08 +02:00
Tomas Härdin a678b0c252 sws/utils.c: Do not uselessly call initFilter() when unscaling 2023-02-08 15:53:55 +01:00
Andreas Rheinhardt 1ff9c07fa6 swscale/utils: Fix indentation
Forgotten after c1eb3e7fec.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-11-24 21:02:57 +01:00
Andreas Rheinhardt b2d1a25816 swscale/utils: Derive range from YUVJ-pix-fmt only once
Currently, it is done once per slice-thread, leading to
one warning per slice-thread in case a YUVJ pixel format
has been originally used.

This also fixes the anomaly that said parameter are only
updated for the user-facing context (whose values are retrievable
via av_opt_get()) if slice-threading is not in use.

Fixes ticket #9860.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-11-24 20:59:03 +01:00
Andreas Rheinhardt ff39dcb129 swscale/utils: Move functions to avoid forward declarations
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-11-24 20:58:21 +01:00
Andreas Rheinhardt baccc1c541 swscale/utils: Avoid calling ff_thread_once() unnecessarily
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-11-24 20:58:21 +01:00
Andreas Rheinhardt 8ee0711228 swscale/utils: Don't allocate AVFrames for slice contexts
Only the parent context's AVFrames are ever used.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-11-24 20:58:21 +01:00
Andreas Rheinhardt 64ed1d40df swscale/utils: Factor initializing single slice context out
Initializing slice threads currently uses the function
(sws_init_context()) that is also used for initializing
user-facing contexts with the only difference being that
nb_threads is set to one before initializing the slice contexts.

Yet sws_init_context() also initializes lots of stuff
that is not slice-dependent, i.e. (src|dst)Range. This
currently only works because the code sets these fields
to the same values for all slice contexts. This is not
nice; even worse, it entails that log messages are printed
once per slice context (and therefore fill the screen).

This commit lays the groundwork to fix this.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-11-24 20:58:21 +01:00
Andreas Rheinhardt b616b04704 swscale/utils: Remove obsolete 3DNow reference
swscale does not use 3DNow any more since commit
608319a311.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-11-09 17:39:00 +01:00
Hao Chen 38cacce22a
swscale/la: Optimize hscale functions with lasx.
ffmpeg -i 1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -s 640x480 -y /dev/null -an
before: 101fps
after:  138fps

Signed-off-by: Hao Chen <chenhao@loongson.cn>
Reviewed-by: yinshiyou-hf@loongson.cn
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2022-09-10 22:56:38 +02:00
Philip Langdale 09a8e5debb swscale/output: add support for Y210LE and Y212LE 2022-09-10 12:29:12 -07:00
Philip Langdale 68181623e9 swscale/output: add support for XV30LE 2022-09-10 12:29:12 -07:00
Philip Langdale 366f073c62 swscale/output: add support for XV36LE 2022-09-10 12:29:12 -07:00
Philip Langdale caf8d4d256 swscale/output: add support for P012
This generalises the existing P010 support.
2022-09-10 12:29:12 -07:00
Philip Langdale 4a59eba227 swscale/input: add support for Y212LE 2022-09-06 12:49:10 -07:00
Philip Langdale 198b5b90d5 swscale/input: add support for XV30LE 2022-09-06 12:49:10 -07:00
Philip Langdale 5bdd726115 swscale/input: add support for P012
As we now have three of these formats, I added macros to generate the
conversion functions.
2022-09-06 12:49:10 -07:00
Philip Langdale 8d9462844a swscale/input: add support for XV36LE 2022-09-06 12:49:10 -07:00
Philip Langdale 45726aa117 libswscale: add support for VUYX format
As we already have support for VUYA, I figured I should do the small
amount of work to support VUYX as well. That means a little refactoring
to share code.
2022-08-25 19:03:49 -07:00
Timo Rothenpieler aca569aad2 swscale/input: add rgbaf16 input support
This is by no means perfect, since at least ddagrab will return scRGB
data with values outside of 0.0f to 1.0f for HDR values.
Its primary purpose is to be able to work with the format at all.
2022-08-19 22:09:36 +02:00
Alan Kelly a38293e444 libswscale: Enable hscale_avx2 for all input sizes.
ff_shuffle_filter_coefficients shuffles the tail as required.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2022-08-18 16:24:48 +02:00
James Almer 1974813261 swscale/output: add VUYA output support
Signed-off-by: James Almer <jamrial@gmail.com>
2022-08-07 09:33:16 -03:00
James Almer f0abd07996 swscale/input: add VUYA input support
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: James Almer <jamrial@gmail.com>
2022-08-05 09:39:21 -03:00
Matthieu Bouron 0a6bb7da55 swscale: add NV16 input/output
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2022-07-19 12:20:16 +02:00
Andreas Rheinhardt 40e6575aa3 all: Replace if (ARCH_FOO) checks by #if ARCH_FOO
This is more spec-compliant because it does not rely
on dead-code elimination by the compiler. Especially
MSVC has problems with this, as can be seen in
https://ffmpeg.org/pipermail/ffmpeg-devel/2022-May/296373.html
or
https://ffmpeg.org/pipermail/ffmpeg-devel/2022-May/297022.html

This commit does not eliminate every instance where we rely
on dead code elimination: It only tackles branching to
the initialization of arch-specific dsp code, not e.g. all
uses of CONFIG_ and HAVE_ checks. But maybe it is already
enough to compile FFmpeg with MSVC with whole-programm-optimizations
enabled (if one does not disable too many components).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-06-15 04:56:37 +02:00
Swinney, Jonathan 0ea61725b1 swscale/aarch64: add hscale specializations
This patch adds code to support specializations of the hscale function
and adds a specialization for filterSize == 4.

ff_hscale8to15_4_neon is a complete rewrite. Since the main bottleneck
here is loading the data from src, this data is loaded a whole block
ahead and stored back to the stack to be loaded again with ld4. This
arranges the data for most efficient use of the vector instructions and
removes the need for completion adds at the end. The number of
iterations of the C per iteration of the assembly is increased from 4 to
8, but because of the prefetching, there must be a special section
without prefetching when dstW < 16.

This improves speed on Graviton 2 (Neoverse N1) dramatically in the case
where previously fs=8 would have been required.

before: hscale_8_to_15__fs_8_dstW_512_neon: 1962.8
after : hscale_8_to_15__fs_4_dstW_512_neon: 1220.9

Signed-off-by: Jonathan Swinney <jswinney@amazon.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-05-28 01:09:05 +03:00
Andreas Rheinhardt f2b79c5b85 lib*/version: Move library version functions into files of their own
This avoids having to rebuild big files every time FFMPEG_VERSION
changes (which it does with every commit).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-05-10 06:49:32 +02:00
Martin Storsjö 6cd2ac388d libswscale: Split version.h
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-03-16 14:05:26 +02:00
Martin Storsjö c523724c69 swscale: Take the destination range into account for yuv->rgb->yuv conversions
The range parameters need to be set up before calling
sws_init_context (which selects which fastpaths can be used;
this gets called by sws_getContext); solely passing them via
sws_setColorspaceDetails isn't enough.

This fixes producing full range YUV range output when doing
YUV->YUV conversions between different YUV color spaces.

Signed-off-by: Martin Storsjö <martin@martin.st>
2022-02-25 11:01:17 +02:00
Alan Kelly e534d98af3 libswscale: Re-factor ff_shuffle_filter_coefficients.
Make the code more readable and follow the style guide.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2022-02-17 17:17:22 +01:00
Alan Kelly f1a5414c97 libswscale: Check and propagate memory allocation errors from ff_shuffle_filter_coefficients.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2022-02-17 17:17:07 +01:00
rcombs 88d804b7ff swscale: add P210/P410/P216/P416 output 2021-12-22 18:38:40 -06:00
Alan Kelly eebe406c80 libswscale: Test AV_CPU_FLAG_SLOW_GATHER for hscale functions.
This is instead of EXTERNAL_AVX2_FAST so that the avx2 hscale functions
are only used where they are faster.
2021-12-21 17:44:53 -03:00
Alan Kelly f900a19fa9 libswscale: Adds ff_hscale8to15_4_avx2 and ff_hscale8to15_X4_avx2 for all filter sizes.
Fixes so that fate under 64 bit Windows passes.

These functions replace all ff_hscale8to15_*_ssse3 when avx2 is available.

Signed-off-by: James Almer <jamrial@gmail.com>
2021-12-15 20:04:59 -03:00
rcombs f0204de47d swscale: add P210/P410/P216/P416 input 2021-11-28 16:40:43 -06:00
Michael Niedermayer 5f3a160b42 swscale/utils: Improve return codes of sws_setColorspaceDetails()
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-10-24 16:54:36 +02:00
Michael Niedermayer c7699f95bb swscale/utils: Set all threads to the same colorspace even on failure
Fixes: ./ffplay dav.y4m -vf "scale=hd1080:threads=4"
Found-by: Paul
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-10-24 16:54:36 +02:00