ffmpeg

Author	SHA1	Message	Date
Diego Biurrun	fd502f4f5f	build: Generalize yasm/nasm-related variable names None of them are specific to the YASM assembler. (Cherry-picked from libav commit `39e208f4d4`) Signed-off-by: James Almer <jamrial@gmail.com>	2017-06-21 17:00:29 -03:00
Paul B Mahol	49bbfb9d13	avfilter: add arbitrary audio FIR filter Signed-off-by: Paul B Mahol <onemda@gmail.com>	2017-05-09 20:47:52 +02:00
Muhammad Faiz	1e69ac9246	avfilter/avf_showcqt: cqt_calc optimization on x86. On x86_64 (time, PSNR): plain 3.303, inf; SSE 1.649, 107.087535; SSE3 1.632, 107.087535; AVX 1.409, 106.986771; FMA3 1.265, 107.108437. On x86_32 (time, PSNR compared to x86_64 plain): plain 7.225, 103.951979; SSE 1.827, 105.859282; SSE3 1.819, 105.859282; AVX 1.533, 105.997661; FMA3 1.384, 105.885377. FMA4 test is not available. Reviewed-by: James Almer <jamrial@gmail.com> Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>	2016-06-08 16:09:43 +07:00
Ronald S. Bultje	5ce703a6bf	vf_colorspace: x86-64 SIMD (SSE2) optimizations.	2016-04-12 16:42:48 -04:00
Thomas Mundt	5024a82e95	avfilter/vf_bwdif: add x86 SIMD Signed-off-by: Thomas Mundt <loudmax@yahoo.de>	2016-03-13 10:06:21 +01:00
Paul B Mahol	5740dc27e1	avfilter/vf_w3fdif: add x86 SIMD Signed-off-by: Paul B Mahol <onemda@gmail.com>	2015-10-10 17:33:43 +02:00
Paul B Mahol	ac74e857a2	avfilter/vf_stereo3d: add x86 SIMD for anaglyph outputs Signed-off-by: Paul B Mahol <onemda@gmail.com>	2015-10-06 21:01:24 +02:00
Paul B Mahol	9762554dd0	avfilter/vf_blend: add x86 SIMD for some modes Signed-off-by: Paul B Mahol <onemda@gmail.com>	2015-10-03 21:26:17 +02:00
Paul B Mahol	160556c9ad	avfilter/vf_maskedmerge: add SIMD for maskedmerge with 8 bit depth input Signed-off-by: Paul B Mahol <onemda@gmail.com>	2015-10-02 17:40:57 +02:00
James Darnley	bff7242608	avfilter/vf_removegrain: add x86 and x86_64 SSE2 functions Speed of all modes increased by a factor between 7.4 and 19.8 largely depending on whether bytes are unpacked into words. Modes 2, 3, and 4 have been sped-up by a factor of 43 (thanks quick sort!) All modes are available on x86_64 but only modes 1, 10, 11, 12, 13, 14, 19, 20, 21, and 22 are available on x86 due to the number of SIMD registers used. With a contribution from James Almer <jamrial@gmail.com>	2015-07-14 23:50:50 +00:00
Ronald S. Bultje	ae4c9ddebc	vf_psnr: sse2 optimizations for sum-squared-error. The internal line accumulator for 16bit can overflow, so I changed that from int to uint64_t in the C code. The matching assembly looks a little weird but output looks correct. (avx2 should be trivial to add later.) Reviewed-by: Paul B Mahol <onemda@gmail.com> Reviewed-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-07-14 17:57:14 +02:00
Ronald S. Bultje	dfc58584b4	vf_ssim: x86 simd for ssim_4x4xN and ssim_endN. Both are 2-2.5x faster than their C counterpart. Reviewed-by: Paul B Mahol <onemda@gmail.com> Reviewed-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-07-14 05:07:07 +02:00
Arwa Arif	4c38e960d0	avfilter: Port mp=eq/eq2 to lavfi Code adapted from James Darnley's port Some fixes from Paul B Mahol <onemda@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2015-01-26 00:14:04 +01:00
James Almer	da02ee127a	x86/vf_pp7: port dctB_mmx to yasm Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	2015-01-09 20:02:27 -03:00
Arwa Arif	a299cd5ab3	lavfi: port mp=pp7 to libavfilter The only difference with mp=pp7 is that default mode is "medium", as stated in the MPlayer docs, rather than "hard". Signed-off-by: Stefano Sabatini <stefasab@gmail.com>	2015-01-09 17:26:31 +01:00
James Almer	466e32bf25	x86/vf_fspp: port inline asm to yasm Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	2014-12-26 15:39:51 -03:00
Arwa Arif	bdc4db0ee3	lavfi: port mp=fspp to a native libavfilter filter Signed-off-by: Stefano Sabatini <stefasab@gmail.com>	2014-12-24 16:29:18 +01:00
Michael Niedermayer	fb3eb57369	avfilter/tinterlace: add Support for ff_lowpass_line_avx() & ff_lowpass_line_sse2() Based-on: `2e1704059a` by Kieran Kunhya Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-11-15 04:02:33 +01:00
Michael Niedermayer	6f373d75e8	Merge commit '2e1704059ae8625beda2ffde847ad22c5ba416dc' * commit '2e1704059ae8625beda2ffde847ad22c5ba416dc': vf_interlace: Add SIMD for lowpass filter Conflicts: libavfilter/vf_interlace.c libavfilter/x86/Makefile Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-11-15 02:39:49 +01:00
Kieran Kunhya	2e1704059a	vf_interlace: Add SIMD for lowpass filter Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2014-11-15 00:35:31 +01:00
James Almer	864f9326fb	x86/vf_noise: move asm code to a separate file Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	2014-10-17 00:44:35 -03:00
skal	406a9ccffe	avfilter/vf_idet: MMX/MMXEXT/SSE2 implementation of idet's filter_line() integration by Neil Birkbeck, with help from Vitor Sessak. core SSE2 loop by Skal (pascal.massimino@gmail.com) Reviewed-by: Clément Bœsch <u@pkh.me> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-09-04 22:19:00 +02:00
Robert Krüger	4a38eeec38	Revert "Revert "vf_yadif: move x86 init code to x86/yadif.c"" This reverts commit `975110a85e`. Signed-off-by: Robert Krüger <krueger@lesspain.de> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-01-14 14:19:14 +01:00
Michael Niedermayer	975110a85e	Revert "vf_yadif: move x86 init code to x86/yadif.c" This reverts commit `a87b17f328`. This reduces the amount of non LGPL code, making a relicensing to LGPL easier Conflicts: libavfilter/vf_yadif.c libavfilter/x86/yadif.c libavfilter/x86/yadif_template.c libavfilter/yadif.h Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-12-01 20:26:26 +01:00
Michael Niedermayer	1ea28ffc4d	Merge commit '0e730494160d973400aed8d2addd1f58a0ec883e' * commit '0e730494160d973400aed8d2addd1f58a0ec883e': avfilter: x86: Port gradfun filter optimizations to yasm Conflicts: libavfilter/x86/vf_gradfun_init.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-24 10:35:39 +02:00
Daniel Kang	0e73049416	avfilter: x86: Port gradfun filter optimizations to yasm Signed-off-by: Diego Biurrun <diego@biurrun.de>	2013-10-23 14:50:27 +02:00
Paul B Mahol	9c774459a9	avfilter: port pullup filter from libmpcodecs Signed-off-by: Paul B Mahol <onemda@gmail.com>	2013-09-17 17:03:36 +00:00
Clément Bœsch	a2c547ffec	lavfi: add spp filter.	2013-06-14 01:27:22 +02:00
James Darnley	0a5814c9ba	yadif: x86 assembly for 9 to 14-bit samples. These smaller samples do not need to be unpacked to double words allowing the code to process more pixels every iteration (still 2 in MMX but 6 in SSE2). It also avoids emulating the missing double word instructions on older instruction sets. Like with the previous code for 16-bit samples this has been tested on an Athlon64 and a Core2Quad. Athlon64: 1809275 decicycles in C, 32718 runs, 50 skips; 911675 decicycles in mmx, 32727 runs, 41 skips, 2.0x faster; 495284 decicycles in sse2, 32747 runs, 21 skips, 3.7x faster. Core2Quad: 921363 decicycles in C, 32756 runs, 12 skips; 486537 decicycles in mmx, 32764 runs, 4 skips, 1.9x faster; 293296 decicycles in sse2, 32759 runs, 9 skips, 3.1x faster; 284910 decicycles in ssse3, 32759 runs, 9 skips, 3.2x faster. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-16 22:32:54 +01:00
James Darnley	17e7b49501	yadif: x86 assembly for 16-bit samples. This is a fairly dumb copy of the assembly for 8-bit samples but it works and produces identical output to the C version. The options have been tested on an Athlon64 and a Core2Quad. Athlon64: 1810385 decicycles in C, 32726 runs, 42 skips; 1080744 decicycles in mmx, 32744 runs, 24 skips, 1.7x faster; 818315 decicycles in sse2, 32735 runs, 33 skips, 2.2x faster. Core2Quad: 924025 decicycles in C, 32750 runs, 18 skips; 623995 decicycles in mmx, 32767 runs, 1 skips, 1.5x faster; 406223 decicycles in sse2, 32764 runs, 4 skips, 2.3x faster; 387842 decicycles in ssse3, 32767 runs, 1 skips, 2.4x faster; 307726 decicycles in sse4, 32763 runs, 5 skips, 3.0x faster. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-16 22:32:34 +01:00
Diego Biurrun	e66240f22e	avfilter: x86: consistent filenames for filter optimizations	2013-02-04 15:00:47 +01:00
Diego Biurrun	76d90125cd	vf_hqdn3d: x86: Add proper arch optimization initialization	2013-02-01 13:11:45 +01:00
Daniel Kang	899157b308	yadif: Port inline assembly to yasm Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2013-01-09 18:41:02 +01:00
Justin Ruggles	f96f1e06a4	x86: af_volume: add SSE2-optimized s16 volume scaling	2012-12-05 11:23:37 -05:00
Diego Biurrun	f6c38c5f4e	avfilter: call x86 init functions under if (ARCH_X86), not if (HAVE_MMX)	2012-10-12 19:58:51 +02:00
Loren Merritt	7a1944b907	vf_hqdn3d: x86 asm 13% faster on penryn, 16% on sandybridge, 15% on bulldozer Not simd; a compiler should have generated this, but gcc didn't.	2012-08-26 10:49:14 +00:00
Nolan L	d5f187fd33	Add gradfun filter, ported from MPlayer. Patch by Nolan L nol888 <=> gmail >=< com. See thread: Subject: [FFmpeg-devel] [PATCH] Port gradfun to libavfilter (GCI) Date: Mon, 29 Nov 2010 07:18:14 -0500 Originally committed as revision 25942 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-12-12 17:59:10 +00:00
Aurelien Jacobs	fa6f4ebc08	use a Makefile in x86 subdir Originally committed as revision 25234 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-09-27 21:50:26 +00:00

38 Commits