ffmpeg

Author	SHA1	Message	Date
Anton Khirnov	a91e6c927e	sws: simplify setting sliceDir	2021-07-03 16:09:21 +02:00
Anton Khirnov	ff753f41dd	sws: merge handling frame start into a single block Also, return an error code on failure rather than 0.	2021-07-03 16:09:07 +02:00
Anton Khirnov	1b11a324fe	sws: make checking for the start of a new frame more explicit	2021-07-03 16:07:22 +02:00
Anton Khirnov	0fb014b7bb	sws: reset sliceDir at the end of sws_scale() Makes it more clear that resetting it does not interact with the scaling code that it is currently intermixed with.	2021-07-03 16:05:39 +02:00
Anton Khirnov	1f80789bf7	sws: rename SwsContext.swscale to convert_unscaled That function pointer is now used only for unscaled conversion.	2021-07-03 15:57:53 +02:00
Anton Khirnov	fe490ec165	sws: separate the calls to scaled vs unscaled conversion Call the scaler function directly rather than through a function pointer. Drop the now-unused return value from ff_getSwsFunc() and rename the function to reflect its new role. This will be useful in the following commits, where it will become important that the amount of output is different for scaled vs unscaled case.	2021-07-03 15:57:13 +02:00
Anton Khirnov	0f8e0957d2	sws: do not reallocate scratch buffers for each slice	2021-07-03 15:56:16 +02:00
Anton Khirnov	2730639259	sws: group the parameters validity checks together Also, fail with an error code rather than 0.	2021-07-03 15:31:18 +02:00
Anton Khirnov	c05cab34a9	sws: initialize {src,dst}Stride2 consistently with {src,dst}2	2021-07-03 15:31:08 +02:00
Anton Khirnov	d3d8e09640	sws: cosmetics Reindent after previous commit, rewrap long lines.	2021-07-03 15:30:56 +02:00
Anton Khirnov	f136493d03	sws: factor out cascaded scaling	2021-07-03 15:30:34 +02:00
Anton Khirnov	a2254aedc9	sws: cosmetics Reindent after previous commit, split long lines.	2021-07-03 15:30:20 +02:00
Anton Khirnov	44f12718bf	sws: factor out gamma-correct scaling	2021-07-03 15:29:50 +02:00
Anton Khirnov	e355af9be9	sws: return an error code on invalid parameters to sws_scale()	2021-07-03 15:29:35 +02:00
Anton Khirnov	21a4e48f88	sws: reindent after previous commit	2021-07-03 15:29:22 +02:00
Anton Khirnov	27acca1af0	sws: factor out updating the palette	2021-07-03 15:28:46 +02:00
Anton Khirnov	f8c21ccbfc	sws: remove unnecessary braces There used to be more code inside them, but it was removed in `6de58b4903`.	2021-07-03 15:28:36 +02:00
Peter Lundblad	da0abbbb01	libswscale: Make sws_init_context thread safe. Call ff_sws_rgb2rgb_init via ff_thread_once instead of checking one of the variables it updates. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2021-07-01 23:49:41 +02:00
Limin Wang	43295ae6a9	swscale/swscale_unscaled: don't use the optimized bgr24toYV12 unscaled conversion when width%2 Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: Limin Wang <lance.lmwang@gmail.com>	2021-06-06 12:34:05 +08:00
Anton Khirnov	85ba17f36d	Bump major versions of all libraries. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2021-04-27 11:48:05 -03:00
Andreas Rheinhardt	ea2d9b7a2e	libswscale: Remove unused deprecated functions, make used ones static Deprecated in `3b905b9fe6`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Signed-off-by: James Almer <jamrial@gmail.com>	2021-04-27 10:43:11 -03:00
Andreas Rheinhardt	f3c197b129	Include attributes.h directly Some files currently rely on libavutil/cpu.h to include it for them; yet said file won't include it any more after the currently deprecated functions are removed, so include attributes.h directly. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2021-04-19 14:34:10 +02:00
Alan Kelly	3ce8d09244	libswscale/x86/yuv2yuvX: Removes unrolling for mmx and mmxext Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2021-04-01 20:47:52 +02:00
Alan Kelly	dc57762cb4	libswscale/x86/swscale: Only call ff_yuv2yuvX functions if the input size is > 0 Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2021-04-01 20:47:52 +02:00
Michael Niedermayer	c361fa9e21	Bump minor versions after release branch Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2021-03-20 01:02:11 +01:00
Michael Niedermayer	c67d2a2875	Bump Versions before release/4.4 branch Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2021-03-20 01:01:12 +01:00
Andreas Rheinhardt	c23a5523b5	swscale/x86/swscale: Remove unused ASM constants The last user of g15Mask, r15Mask, g16Mask and r16Mask was disabled in `77a416e8aa` and finally removed in 36e8de07ed62609df45d064b56501e3084d25723; b15Mask and b16Mask were apparently always unused (except for in_asm_used_var_warning_killer, a function that only existed to make the compiler not optimize ASM constants away). w10 is unused since `d604bab901`, w02 since `ef423a6618`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>	2021-02-24 09:47:54 +01:00
Andreas Rheinhardt	aad597a93c	swscale/x86/rgb2rgb: Remove unused ASM constants mask24hh etc. are unused since `f099fbf5f3`, mask32b and mask32r since `296609f859`, mask32g since `b38d487466` and mask32 since `f8a138be52`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>	2021-02-24 09:45:17 +01:00
Andreas Rheinhardt	49db6e4b4e	swscale/x86/yuv2rgb: Remove unused ASM constants mmx_grnmask is unused since `531f97b0c3`, the other constants since `e934194b6a`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>	2021-02-24 09:43:14 +01:00
Chip Kerchner	e7f53d6ac9	lsws/ppc/yuv2rgb_altivec: Fix build in non-VSX environments Add inline function for vec_xl if VSX is not supported. vec_xl intrinsic is only available on POWER 7 or higher. Fixes ticket #8750. Signed-off-by: Andriy Gelman <andriy.gelman@gmail.com>	2021-02-22 23:19:21 -05:00
James Almer	1a555d3c60	swscale/x86/yuv2yuvX: use the movsxdifnidn helper macro Simplifies code Signed-off-by: James Almer <jamrial@gmail.com>	2021-02-18 18:47:43 -03:00
James Almer	ebb48d85a0	swscale/x86/yuv2yuvX: use movq to load 8 bytes in all non-AVX2 functions mova expands to movq on non-XMM functions Signed-off-by: James Almer <jamrial@gmail.com>	2021-02-18 18:47:43 -03:00
James Almer	d512ebbaed	swscale/x86/yuv2yuvX: use the SPLATW helper macro Simplifies code Signed-off-by: James Almer <jamrial@gmail.com>	2021-02-18 18:47:43 -03:00
James Almer	c00567647e	swscale/x86/swscale: fix mix of inline and external function definitions This includes removing pointless static function forward declarations. Signed-off-by: James Almer <jamrial@gmail.com>	2021-02-18 18:47:42 -03:00
James Almer	c2bf1dcace	swscale/x86/swscale: fix compilation with old yasm Where AVX2 may not be supported. Signed-off-by: James Almer <jamrial@gmail.com>	2021-02-17 21:09:36 -03:00
Alan Kelly	554c2bc708	swscale: move yuv2yuvX_sse3 to yasm, unrolls main loop And other small optimizations for ~20% speedup.	2021-02-17 21:21:03 +01:00
Carl Eugen Hoyos	2687070d9b	lsws/ppc/yuv2rgb: Fix transparency converting from yuv->rgb32. Based on `68363b69` by Reimar Döffinger. Fixes ticket #9077.	2021-01-24 17:17:29 +01:00
Anton Khirnov	e15371061d	lavu/mem: move the DECLARE_ALIGNED macro family to mem_internal on next+1 bump They are not properly namespaced and not intended for public use.	2021-01-01 14:14:57 +01:00
Anton Khirnov	c8c2dfbc37	lavu: move LOCAL_ALIGNED from internal.h to mem_internal.h That is a more appropriate place for it.	2021-01-01 14:11:01 +01:00
Jeremy Leconte	29cef1bcd6	libswscale: avoid UB nullptr-with-offset. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-12-24 15:27:56 +01:00
Andriy Gelman	1200264fc4	swscale/rgb2rgb_template: use shuffle macro on big-endian arches Fixes fate-qtrle-32bit on big-endian. The macro does a simple byte swap on uint8 array without any casts, so it's valid on big-endian arches. The mentioned test was failing because the byteswap function shuffle_bytes_3210_c() is used in the pixel format conversion (argb->bgra). Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: Andriy Gelman <andriy.gelman@gmail.com>	2020-12-12 23:07:22 -05:00
Carl Eugen Hoyos	46e362b765	lsws/x86/yuv2rgb: Fix compilation with mmxext or ssse3 disabled. Fixes ticket #8986.	2020-11-14 15:37:57 +01:00
Marton Balint	993429cfb4	swscale/x86/yuv2rgb: fix crashes when loading alpha from unaligned buffers Regression since `fc6a5883d6` on SSSE3 enabled CPUs. Fixes ticket #8955. Signed-off-by: Marton Balint <cus@passwd.hu>	2020-11-02 00:31:34 +01:00
Jan Ekström	7ea4bcff7b	swscale/utils: override forced-zero formats back to full range Fixes vf_scale outputting RGB AVFrames with limited range flagged in case either input or output specifically sets the range. This is the reverse of the logic utilized for RGB and PAL8 content in sws_setColorspaceDetails.	2020-10-11 12:58:13 +03:00
Jan Ekström	3fe24fe232	swscale/utils: split range override check into its own function	2020-10-11 12:58:13 +03:00
Mark Reid	a48adcd136	libswscale/input: use more accurate planar rgb16 yuv conversions These conversions appear to be exhibiting the same rounding error as the rgbf32 formats were. I separated the rounding value from the 16 and 128 offsets, I think it makes it a little more clear. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-10-06 17:56:52 +02:00
Mark Reid	453004fde6	libswscale/input: use more accurate rgbf32 yuv conversions Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-10-02 14:59:52 +02:00
Mark Reid	6bf57c6a2a	libswscale/tests: add floatimg_cmp test changes since v1: - made into fate test - fixed c90 warnings - tests more intermediate formats - tested on BE mips too Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-10-02 14:59:52 +02:00
James Almer	621e2625e0	swscale/x86/output: add missing AVX2 support preprocessor wrappers Fixes compilation with old yasm Signed-off-by: James Almer <jamrial@gmail.com>	2020-08-20 15:14:56 -03:00
Paul B Mahol	9d58cdb4ba	swscale: do not drop half of bits from 16bit bayer formats	2020-08-08 12:03:42 +02:00
Limin Wang	7c8ad72f1c	swscale/yuv2rgb: cosmetics Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: Limin Wang <lance.lmwang@gmail.com>	2020-07-25 10:20:42 +08:00
Fei Wang	8544783280	swscale/yuv2rgb: consider x2rgb10le on big endian hardware This fixed FATE fail report by filter-pixfmts* for x2rgb10le on big endian hardware. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-07-20 21:00:00 +02:00
Michael Niedermayer	663f024415	swscale/tests/swscale: use 1 for indicating errors Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-07-16 17:44:53 +02:00
Michael Niedermayer	24c575e0aa	swscale/tests/swscale: Initialize res to a non random error code Regression since: `3adffab073` -1 is consistent with what other error paths return Reviewed-by: Martin Storsjö <martin@martin.st> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-07-14 22:05:02 +02:00
Michael Niedermayer	ec27c1827c	swscale/tests/swscale: Fix incorrect return code check Regression since: `3adffab073` Reviewed-by: Martin Storsjö <martin@martin.st> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-07-14 22:05:02 +02:00
James Almer	ba3e771a42	x86/yuv2rgb: fix crashes when storing data on unaligned buffers Regression since `fc6a5883d6` on SSSE3 enabled CPUs. Fixes ticket #8747 Signed-off-by: James Almer <jamrial@gmail.com>	2020-07-14 14:06:04 -03:00
Lynne	3adffab073	swscale/tests: check return value of sws_scale	2020-07-09 10:33:19 +01:00
Lynne	3e098cca6e	aarch64/yuv2rgb_neon: fix return value We return 0 for this particular architecture but should instead be returning the number of lines. Fixes users who check the return value matches what they expect.	2020-07-09 10:33:14 +01:00
Nelson Gomez	360be03b8a	swscale: cosmetic fixes Signed-off-by: Nelson Gomez <nelson.gomez@microsoft.com>	2020-06-14 16:34:07 +01:00
Nelson Gomez	bc01337db4	swscale/x86/output: add AVX2 version of yuv2nv12cX 256 bits is just wide enough to fit all the operands needed to vectorize the software implementation, but AVX2 is needed for a couple of instructions like cross-lane permutation. Output is bit-for-bit identical to C. Signed-off-by: Nelson Gomez <nelson.gomez@microsoft.com>	2020-06-14 16:34:07 +01:00
Nelson Gomez	7c39c3c1a6	swscale: make yuv2interleavedX more asm-friendly Extracting information from SwsContext in assembly is difficult, and rearranging SwsContext just for asm access didn't look good. These functions only need a couple of fields from it anyway, so just make them parameters in their own right. Signed-off-by: Nelson Gomez <nelson.gomez@microsoft.com>	2020-06-14 16:34:07 +01:00
Limin Wang	67a07dc778	swscale/utils: return better error code from initFilter() Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: Limin Wang <lance.lmwang@gmail.com>	2020-06-14 21:54:40 +08:00
Limin Wang	8efecc9063	swscale/utils: reindent Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: Limin Wang <lance.lmwang@gmail.com>	2020-06-14 21:54:40 +08:00
Limin Wang	a408d03ee6	swscale/utils: remove FF_ALLOC_ARRAY_OR_GOTO macros Signed-off-by: Limin Wang <lance.lmwang@gmail.com>	2020-06-13 06:59:19 +08:00
Fei Wang	c721b45014	swscale: Add swscale input/output support for X2RGB10LE Signed-off-by: Fei Wang <fei.w.wang@intel.com>	2020-06-12 17:56:15 +01:00
Michael Niedermayer	c5079bf3bc	Bump minor versions after branching 4.3 Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-06-08 22:49:04 +02:00
Michael Niedermayer	0a8a96c251	Bump minor versions to separate 4.3 from master Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-06-08 22:49:04 +02:00
Martin Storsjö	e0604d508e	swscale: aarch64: Add a NEON implementation of interleaveBytes This allows speeding up format conversions from yuv420 to nv12. Cortex A53 A72 A73 interleave_bytes_c: 86077.5 51433.0 66972.0 interleave_bytes_neon: 19701.7 23019.2 15859.2 interleave_bytes_aligned_c: 86603.0 52017.2 67484.2 interleave_bytes_aligned_neon: 9061.0 7623.0 6309.0 Signed-off-by: Martin Storsjö <martin@martin.st>	2020-05-15 23:38:17 +03:00
Josh de Kock	70b14cc8d6	swscale: arm: fix NEON hscale init The NEON hscale function only supports X8 filter sizes and should only be selected when these are being used. At the moment filterAlign is set to 8 but in the future when extra NEON assembly for specific sizes is added they will need to have checks here too. The immediate usecase for this change is making the hscale checkasm test easier and without NEON specific edge-cases (x86 already has these guards). This applies the same fix from `718c8f9aa5` on the 32 bit arm version of the function, fixing fate-checkasm-sw_scale there. Signed-off-by: Martin Storsjö <martin@martin.st>	2020-05-15 23:33:46 +03:00
Josh de Kock	718c8f9aa5	swscale: fix NEON hscale init The NEON hscale function only supports X8 filter sizes and should only be selected when these are being used. At the moment filterAlign is set to 8 but in the future when extra NEON assembly for specific sizes is added they will need to have checks here too. The immediate usecase for this change is making the hscale checkasm test easier and without NEON specific edge-cases (x86 already has these guards). Signed-off-by: Josh de Kock <josh@itanimul.li>	2020-05-15 10:29:30 +01:00
Mark Reid	fabeef22d9	libswscale: fix for floating point formats, require full chroma upon more floating point testing, looks like I missed adding this bit. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-05-12 01:00:28 +02:00
Mark Reid	b4967fc71c	libswscale: add output support for AV_PIX_FMT_GBRAPF32 Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-05-05 20:06:58 +02:00
Mark Reid	ba5d0515a6	libswscale: add input support AV_PIX_FMT_GBRAPF32 Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-05-05 20:06:58 +02:00
Andreas Rheinhardt	2fae000994	swscale/vscale: Increase type strictness libswscale/vscale.c makes extensive use of function pointers and in doing so it converts these function pointers to and from a pointer to void. Yet this is actually against the C standard: C90 only guarantees that one can convert a pointer to any incomplete type or object type to void* and back with the result comparing equal to the original which makes pointers to void generic pointers to incomplete or object type. Yet C90 lacks a generic function pointer type. C99 additionally guarantees that a pointer to a function of one type may be converted to a pointer to a function of another type with the result and the original comparing equal when converting back. This makes any function pointer type a generic function pointer type. Yet even this does not make pointers to void generic function pointers. Both GCC and Clang emit warnings for this when in pedantic mode. This commit fixes this by using a union that can hold one member of any of the required function pointer types to store the function pointer. This works even for C90. Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>	2020-04-27 23:34:31 +02:00
Martin Storsjö	9025d5c5ce	swscale: aarch64: Don't clobber callee-saved registers v8-v15 Signed-off-by: Martin Storsjö <martin@martin.st>	2020-04-21 23:41:13 +03:00
Martin Storsjö	872790b1f9	swscale: aarch64: Avoid using the x18 register The x18 is a reserved platform register on Darwin and Windows. x8/w8 seems to be unused in this function though (and same about x10 and x14), so there's really no reason to use x18 here - just change the uses of x18/w18 into x8/w8 instead without any further rewrites. Signed-off-by: Martin Storsjö <martin@martin.st>	2020-04-20 00:09:34 +03:00
Michael Niedermayer	be3c29e379	swscale/yuv2rgb: Fix vertical dither offset with slices Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-04-12 16:36:47 +02:00
Michael Niedermayer	e057e83a4f	swscale/output: Fix integer overflow in yuv2rgb_write_full() with out of range input Fixes: signed integer overflow: 1169365504 + 981452800 cannot be represented in type 'int' Fixes: ticket8293 Found-by: Suhwan Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-04-04 22:09:46 +02:00
Michael Niedermayer	49ba1879ad	swscale/output: Fix integer overflow in alpha computation in yuv2gbrp16_full_X_c() Fixes: signed integer overflow: 524280 * 4432 cannot be represented in type 'int' Fixes: ticket8322 Found-by: Suhwan Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-04-04 22:09:46 +02:00
Ruiling Song	4700f7d6fc	swscale/swscale: remove useless code Signed-off-by: Ruiling Song <ruiling.song@intel.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-04-03 00:58:07 +02:00
Carl Eugen Hoyos	5f8c383452	lsws/input: Do not change transparency range. Fixes ticket #8509.	2020-03-11 22:55:49 +01:00
Ting Fu	828f7db5d9	libswscale/x86/yuv2rgb: Fix Segmentation Fault when load unaligned data Fixes ticket #8532 Signed-off-by: Ting Fu <ting.fu@intel.com>	2020-02-26 11:10:46 +01:00
Linjie Fu	d2aa1fbfd4	swscale: Add swscale input support for Y210LE Add swscale input support for Y210LE, output support and fate test could be added later if there is requirement for software CSC to this packed format. Signed-off-by: Linjie Fu <linjie.fu@intel.com>	2020-02-24 00:09:51 +00:00
Ting Fu	fc6a5883d6	libswscale/x86/yuv2rgb: add ssse3 version Tested using this command: /ffmpeg -pix_fmt yuv420p -s 1920x1080 -i ArashRawYuv420.yuv \ -vcodec rawvideo -s 1920x1080 -pix_fmt rgb24 -f null /dev/null The fps increase from 389 to 640 on Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz Signed-off-by: Ting Fu <ting.fu@intel.com>	2020-02-10 15:08:33 +01:00
Gautam Ramakrishnan	da399e2135	libswscale/utils.c: Fix bug #8255 Bug #8255 points out a double free error in libswscale/utils.c file. The double free is because the pointer to cascaded_context of an sw_context is not set to NULL after freeing it. When the sw_context is later freed, sws_freeContext is called on the cascaded_context, causing a double free. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-02-09 23:33:18 +01:00
Ting Fu	e934194b6a	libswscale/x86/yuv2rgb: Change inline assembly into nasm code The original inline assembly and nasm code have the same fps when called by command. NASM code almost has no impact on the performance. Signed-off-by: Ting Fu <ting.fu@intel.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-02-05 17:41:59 +01:00
Michael Niedermayer	d48e510124	swscale/input: Fix several invalid shifts related to rgb2yuv constants Fixes: Invalid shifts Fixes: #8140 Fixes: #8146 Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-01-22 21:50:49 +01:00
Michael Niedermayer	7b7f97532b	swscale/output: Fix several invalid shifts in yuv2rgb_full_1_c_template() Fixes: Invalid shifts Fixes: #8320 Reviewed-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-01-22 18:41:46 +01:00
Michael Niedermayer	a6ca22c118	swscale/swscale: Fix several invalid shifts related to vChrDrop Fixes: Invalid shifts Fixes: #8166 Fixes: filter-crop_scale_vflip FATE-test Reviewed-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-01-22 18:41:46 +01:00
Carl Eugen Hoyos	96fab29e96	Silence "string-plus-int" warning shown by clang. libswscale/utils.c:89:42: warning: adding 'unsigned long' to a string does not append to the string [-Wstring-plus-int]	2020-01-06 22:38:56 +01:00
Sebastian Pop	c3a17ffff6	swscale/aarch64: use multiply accumulate and shift-right narrow This patch rewrites the innermost loop of ff_yuv2planeX_8_neon to avoid zips and horizontal adds by using fused multiply adds. The patch also uses ld1r to load one element and replicate it across all lanes of the vector. The patch also improves the clipping code by removing the shift right instructions and performing the shift with the shift-right narrow instructions. I see 8% difference on an m6g instance with neoverse-n1 CPUs: $ ffmpeg -nostats -f lavfi -i testsrc2=4k:d=2 -vf bench=start,scale=1024x1024,bench=stop -f null - before: t:0.014015 avg:0.014096 max:0.015018 min:0.013971 after: t:0.012985 avg:0.013013 max:0.013996 min:0.012818 Tested with `make check` on aarch64-linux. Signed-off-by: Sebastian Pop <spop@amazon.com> Reviewed-by: Clément Bœsch <u@pkh.me> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-01-04 20:59:31 +01:00
Zhao Zhili	1e3e547a5b	swscale/utils: remove access of AV_PIX_FMT_NB Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2019-12-31 12:37:47 +01:00
Sebastian Pop	bd83191271	swscale/aarch64: use multiply accumulate and increase vector factor to 4 This patch implements ff_hscale_8_to_15_neon with NEON fused multiply accumulate and bumps the vectorization factor from 2 to 4. The speedup is of 25% on Graviton1 A1 instances based on A-72 cpus: $ ffmpeg -nostats -f lavfi -i testsrc2=4k:d=2 -vf bench=start,scale=1024x1024,bench=stop -f null - before: t:0.040303 avg:0.040287 max:0.040371 min:0.039214 after: t:0.032168 avg:0.032215 max:0.033081 min:0.032146 The speedup is of 39% on Graviton2 m6g instances based on Neoverse-N1 cpus: $ ffmpeg -nostats -f lavfi -i testsrc2=4k:d=2 -vf bench=start,scale=1024x1024,bench=stop -f null - before: t:0.019446 avg:0.019423 max:0.019493 min:0.019181 after: t:0.014015 avg:0.014096 max:0.015018 min:0.013971 Tested with `make check` on aarch64-linux. Signed-off-by: Sebastian Pop <spop@amazon.com> Reviewed-by: Jean-Baptiste Kempf <jb@videolan.org> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2019-12-17 23:41:47 +01:00
Limin Wang	8558c231fb	swscale/swscale_unscaled: add AV_PIX_FMT_GBRAP10 for LE and BE conversion wrapper Signed-off-by: Limin Wang <lance.lmwang@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2019-12-10 16:09:14 +01:00
Ting Fu	039a0ebe6f	libswscale/swscale_unscaled.c: remove redundant code Signed-off-by: Ting Fu <ting.fu@intel.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2019-12-06 11:25:29 +01:00
Limin Wang	a5e24be52a	swscale/swscale_unscaled: fix gbrap10be md5 different on big endian system You can reproduce it by below command: ./ffmpeg -f lavfi -i "testsrc=duration=1:rate=30" -vf format=gbrap10 -vcodec rawvideo \ -pix_fmt gbrap10le -flags +bitexact -sws_flags +accurate_rnd+bitexact -fflags +bitexact \ -frames:v 1 -f nut md5: little-endian: f91e2edd8098276579c1929e5e160416 big-endian: ba4d011dbbdc78ccbf6cc7d698630929 Signed-off-by: Limin Wang <lance.lmwang@gmail.com> Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2019-11-01 14:43:16 +01:00
Michael Niedermayer	d260621089	swscale/output: Avoid 64bit in Alpha in yuv2ya16_X_c_template() Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2019-10-16 19:17:57 +02:00
Michael Niedermayer	3e6682931b	swscale/output: Correct Alpha in yuv2ya16_X_c_template() Untested, no testcase Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2019-10-16 19:17:57 +02:00
Michael Niedermayer	4f4ca675e5	swscale/output: Implement Luma computation from yuv2ya16_X_c_template() without 64bit This also reverts `21838cad2f` The revert is in this commit to avoid 2 fate updates Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2019-10-16 19:17:57 +02:00
Daniel Kolesa	e6625ca41f	swscale: Fix AltiVec/VSX build with recent GCC The argument to vec_splat_u16 must be a literal. By making the function always inline and marking the arguments const, gcc can turn those into literals, and avoid build errors like: swscale_vsx.c:165:53: error: argument 1 must be a 5-bit signed literal Fixes #7861. Signed-off-by: Daniel Kolesa <daniel@octaforge.org> Signed-off-by: Lauri Kasanen <cand@gmx.com>	2019-10-04 08:58:17 +03:00

2391 Commits