40 Commits

Author SHA1 Message Date
Wu Jianhua
2c734a8496 libswscale/x86/rgb2rgb: add shuffle_bytes avx2
Performance data(Less is better):
    shuffle_bytes_ssse3   3.64654
    shuffle_bytes_avx2    0.94288

Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>
2021-10-15 10:59:20 +02:00
Andreas Rheinhardt
aad597a93c swscale/x86/rgb2rgb: Remove unused ASM constants
mask24hh etc. are unused since f099fbf5f3,
mask32b and mask32r since 296609f859,
mask32g since b38d487466 and mask32 since
f8a138be52.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-24 09:45:17 +01:00
Anton Khirnov
e15371061d lavu/mem: move the DECLARE_ALIGNED macro family to mem_internal on next+1 bump
They are not properly namespaced and not intended for public use.
2021-01-01 14:14:57 +01:00
Martin Vignali
296609f859 swscale/x86/rgb2rgb : port shuffle 2103 mmxext to external asm and remove inline asm version
2018-10-13 14:12:41 +02:00
Martin Vignali
07a566e7d6 swscale/swscale_unscaled : add X86_64 (SSE2 and AVX) for uyvyto422
and checkasm test
2018-04-22 19:15:32 +02:00
Martin Vignali
1ba5ca2d72 swscale/rgb : add X86 SIMD (SSSE3), for shuffle_bytes_1230, shuffle_bytes_3012, shuffle_bytes_3210
2018-03-24 20:22:08 +01:00
Martin Vignali
923a324174 swscale/rgb : add X86 SIMD (SSSE3) for shuffle_bytes_2103 and shuffle_bytes_0321
2018-03-24 20:21:58 +01:00
Hendrik Leppkes
c142dc203e Merge commit 'dc40a70c5755bccfb1a1349639943e1f408bea50'
* commit 'dc40a70c5755bccfb1a1349639943e1f408bea50':
  Drop unnecessary libavutil/x86/asm.h #includes

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-06-26 15:53:00 +02:00
Diego Biurrun
dc40a70c57 Drop unnecessary libavutil/x86/asm.h #includes
2016-05-28 19:18:26 +02:00
Matt Oliver
9eb3f11c55 Add missing external declarations.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-17 00:48:09 +01:00
Michael Niedermayer
7597e6efe4 swscale/x86/rgb2rgb: add support for AVX
This does not yet include any actual AVX code

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2014-01-21 18:01:29 +01:00
Michael Niedermayer
445c58a8c6 swscale/x86/rgb2rgb: Make sure COMPILE_TEMPLATE_AVX is defined
Found-by: iive
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-12-14 02:54:28 +01:00
Diego Biurrun
c16bfb147d swscale: x86: Consistently use lowercase function name suffixes
2013-11-22 23:01:51 +01:00
Michael Niedermayer
1de064e21e swscale/x86/rgb2rgb: change cpu optim identifiers to lower case
This makes the code more similar to the other optims and allows us
to use the same macros to build function names

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-11-19 15:13:48 +01:00
Michael Niedermayer
4729b529e6 swscale/x86/rgb2rgb: extend framework to also include AVX
This does not yet include any actual AVX code

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-11-19 15:13:48 +01:00
Michael Niedermayer
920dd84bf1 sws/x86: remove 8bit rgb2yuv coefficient case for rgb24toyv12 special converter
This simplifies the code and improves quality at the expense of a slight
slowdown of a rarely used function (no fate test uses it).

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-04-15 03:19:52 +02:00
Michael Niedermayer
add7513e64 Merge commit 'fa8fcab1e0d31074c0644c4ac5194474c6c26415'
* commit 'fa8fcab1e0d31074c0644c4ac5194474c6c26415':
  x86: h264_chromamc_10bit: drop pointless PAVG %define
  x86: mmx2 ---> mmxext in function names
  swscale: do not forget to swap data in formats with different endianness

Conflicts:
	libavcodec/x86/dsputil_mmx.c
	libavfilter/x86/gradfun.c
	libswscale/input.c
	libswscale/utils.c
	libswscale/x86/swscale.c
	tests/ref/lavfi/pixfmts_scale

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-01 13:11:51 +01:00
Diego Biurrun
d8eda37080 x86: mmx2 ---> mmxext in function names
2012-10-31 17:53:57 +01:00
Michael Niedermayer
78ec407d5a Merge commit '652f5185945c8405fc57aed353286858df8d066f'
* commit '652f5185945c8405fc57aed353286858df8d066f':
  x86: mmx2 ---> mmxext in comments and messages

Conflicts:
	libswscale/x86/swscale_template.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-10-31 14:02:35 +01:00
Diego Biurrun
652f518594 x86: mmx2 ---> mmxext in comments and messages
2012-10-31 00:37:42 +01:00
Michael Niedermayer
77aedc77ab Merge remote-tracking branch 'qatar/master'
* qatar/master:
  swscale: Provide the right alignment for external mmx asm
  x86: Replace checks for CPU extensions and flags by convenience macros
  configure: msvc: fix/simplify setting of flags for hostcc
  x86: mlpdsp: mlp_filter_channel_x86 requires inline asm

Conflicts:
	libavcodec/x86/fft_init.c
	libavcodec/x86/h264_intrapred_init.c
	libavcodec/x86/h264dsp_init.c
	libavcodec/x86/mpegaudiodec.c
	libavcodec/x86/proresdsp_init.c
	libavutil/x86/float_dsp_init.c
	libswscale/utils.c
	libswscale/x86/swscale.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-09-09 13:27:42 +02:00
Diego Biurrun
e0c6cce447 x86: Replace checks for CPU extensions and flags by convenience macros
This separates code relying on inline from that relying on external
assembly and fixes instances where the coalesced check was incorrect.
2012-09-08 18:18:34 +02:00
Michael Niedermayer
9f088a1ed4 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  mpegvideo: reduce excessive inlining of mpeg_motion()
  mpegvideo: convert mpegvideo_common.h to a .c file
  build: factor out mpegvideo.o dependencies to CONFIG_MPEGVIDEO
  Move MASK_ABS macro to libavcodec/mathops.h
  x86: move MANGLE() and related macros to libavutil/x86/asm.h
  x86: rename libavutil/x86_cpu.h to libavutil/x86/asm.h
  aacdec: Don't fall back to the old output configuration when no old configuration is present.
  rtmp: Add message tracking
  rtsp: Support mpegts in raw udp packets
  rtsp: Support receiving plain data over UDP without any RTP encapsulation
  rtpdec: Remove an unused include
  rtpenc: Remove an av_abort() that depends on user-supplied data
  vsrc_movie: discourage its use with avconv.
  avconv: allow no input files.
  avconv: prevent invalid reads in transcode_init()
  avconv: rename OutputStream.is_past_recording_time to finished.

Conflicts:
	configure
	doc/filters.texi
	ffmpeg.c
	ffmpeg.h
	libavcodec/Makefile
	libavcodec/aacdec.c
	libavcodec/mpegvideo.c
	libavformat/version.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-09 19:31:56 +02:00
Mans Rullgard
c318626ce2 x86: rename libavutil/x86_cpu.h to libavutil/x86/asm.h
This puts x86-specific things in the x86/ subdirectory where they
belong.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-09 00:58:20 +01:00
Michael Niedermayer
e776ee8f29 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  lavr: fix handling of custom mix matrices
  fate: force pix_fmt in lagarith-rgb32 test
  fate: add tests for lagarith lossless video codec.
  ARMv6: vp8: fix stack allocation with Apple's assembler
  ARM: vp56: allow inline asm to build with clang
  fft: 3dnow: fix register name typo in DECL_IMDCT macro
  x86: dct32: port to cpuflags
  x86: build: replace mmx2 by mmxext
  Revert "wmapro: prevent division by zero when sample rate is unspecified"
  wmapro: prevent division by zero when sample rate is unspecified
  lagarith: fix color plane inversion for YUY2 output.
  lagarith: pad RGB buffer by 1 byte.
  dsputil: make add_hfyu_left_prediction_sse4() support unaligned src.

Conflicts:
	doc/APIchanges
	libavcodec/lagarith.c
	libavfilter/x86/gradfun.c
	libavutil/cpu.h
	libavutil/version.h
	libswscale/utils.c
	libswscale/version.h
	libswscale/x86/yuv2rgb.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-04 23:51:43 +02:00
Diego Biurrun
239fdf1b4a x86: build: replace mmx2 by mmxext
Refactoring mmx2/mmxext YASM code with cpuflags will force renames.
So switching to a consistent naming scheme beforehand is sensible.
The name "mmxext" is more official and widespread and also the name
of the CPU flag, as reported e.g. by the Linux kernel.
2012-08-03 22:51:05 +02:00
Michael Niedermayer
2cb4d51654 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  v410dec: Implement explode mode support
  zerocodec: fix direct rendering.
  wav: init st to NULL to avoid a false-positive warning.
  wavpack: set bits_per_raw_sample for S32 samples to properly identify 24-bit
  h264: refactor NAL decode loop
  RTMPTE protocol support
  RTMPE protocol support
  rtmp: Add ff_rtmp_calc_digest_pos()
  rtmp: Rename rtmp_calc_digest to ff_rtmp_calc_digest and make it global
  swscale: add missing HAVE_INLINE_ASM check.
  lavfi: place x86 inline assembly under HAVE_INLINE_ASM.
  vc1: Add a test for interlaced field pictures
  swscale: Mark all init functions as av_cold
  swscale: x86: Drop pointless _mmx suffix from filenames
  lavf: use conditional notation for default codec in muxer declarations.
  swscale: place inline assembly bilinear scaler under HAVE_INLINE_ASM.
  dsputil: ppc: cosmetics: pretty-print
  dsputil: x86: add SHUFFLE_MASK_W macro
  configure: respect CC_O setting in check_cc

Conflicts:
	Changelog
	configure
	libavcodec/v410dec.c
	libavcodec/zerocodec.c
	libavformat/asfenc.c
	libavformat/version.h
	libswscale/utils.c
	libswscale/x86/swscale.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-23 21:25:09 +02:00
Diego Biurrun
5a6e3c039c swscale: Mark all init functions as av_cold
2012-07-23 01:30:05 +02:00
Michael Niedermayer
32c3038734 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  x86: swscale: Place inline assembly code under appropriate #ifdefs
  rtsp: remove terminal comma in FF_RTP_FLAG_OPTS macro.
  configure: Remove redundant RTMPT/RTMPTS dependencies
  configure: add filtering of host cflags/ldflags
  configure: initialise all flag filters at the same place
  configure: add filtering of linker flags
  configure: name some variables more consistently
  configure: remove filter_cppflags
  configure: set icc_version where it is needed
  mpegenc: remove disabled code

Conflicts:
	configure
	libavformat/movenc.c
	libswscale/x86/swscale_mmx.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-22 05:05:02 +02:00
Ronald S. Bultje
b2668c85e9 x86: swscale: Place inline assembly code under appropriate #ifdefs
Fixes compilation for compilers that do not support gcc inline assembly.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-07-21 22:22:58 +02:00
Themaister
0827222b9c Use more accurate conversion for rgb15/16 to rgb24/32 (C/MMX).
Fate update by michael.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-11-09 01:58:22 +01:00
Michael Niedermayer
7a02527b05 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  ac3enc: use correct alignment and length in channel coupling dsp functions.
  ffmpeg: don't abuse a global for passing framerate from input to output
  ffmpeg: don't abuse a global for passing channels from input to output
  ffmpeg: don't abuse a global for passing samplerate from input to output
  ARM: update ff_h264_idct8_add4_neon for 4:4:4 changes
  swscale: use SwsContext for av_log when available
  swscale: Remove HAVE_MMX from files that are only compiled with MMX enabled.
  swscale: Fix compilation with --disable-mmx2.

Conflicts:
	ffmpeg.c
	libswscale/utils.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-06-16 03:53:58 +02:00
Diego Biurrun
a60466dbc3 swscale: Remove HAVE_MMX from files that are only compiled with MMX enabled.
2011-06-15 01:18:10 +02:00
Ronald S. Bultje
78046dadc3 rgb2rgb: remove duplicate mmx/mmx2/3dnow/sse2 functions.
Many functions have such a prefix, but do not actually use any
instructions or features from that set, thus giving the false
impression that swscale is highly optimized for a particular
system, whereas in reality it is not.
2011-05-28 11:41:32 +02:00
Ronald S. Bultje
522d65ba25 rgb2rgb: remove duplicate mmx/mmx2/3dnow/sse2 functions.
Many functions have such a prefix, but do not actually use any
instructions or features from that set, thus giving the false
impression that swscale is highly optimized for a particular
system, whereas in reality it is not.
2011-05-26 09:31:02 -04:00
Michael Niedermayer
034fc7bf12 Merge remote-tracking branch 'qatar/master'
* qatar/master: (22 commits)
  configure: enable memalign_hack automatically when needed
  swscale: unbreak the build on non-x86 systems.
  swscale: remove if(bitexact) branch from functions.
  swscale: remove if(canMMX2BeUsed) conditional.
  swscale: remove swScale_{c,MMX,MMX2} duplication.
  swscale: use emms_c().
  Move emms_c() from libavcodec to libavutil.
  tiff: set palette in the context when specified in TIFF_PAL tag
  rtsp: use strtoul to parse rtptime and seq values.
  pgssubdec: fix incorrect colors.
  dvdsubdec: fix incorrect colors.
  ape: Allow demuxing of files with metadata tags.
  swscale: remove dead macro WRITEBGR24OLD.
  swscale: remove AMD3DNOW "optimizations".
  swscale: remove duplicate code in ppc/ subdirectory.
  swscale: remove duplicated x86/ functions.
  swscale: force --enable-runtime-cpudetect and remove SWS_CPU_CAPS_*.
  vsrc_buffer.h: add file doxy
  vsrc_buffer: tweak error message in init()
  msmpeg4: reindent.
  ...

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-25 06:32:45 +02:00
Michael Niedermayer
d1adad3cca Merge swscale bloatup
This will be cleaned up in the next merge

Authorship / merged commits:
commit f668afd489
Author: Janne Grunau <janne-libav@jannau.net>
Date:   Fri Apr 15 09:12:34 2011 +0200

    swscale: fix "ISO C90 forbids mixed declarations and code" warning

    only hit with --enable-runtime-cpudetect

commit 7f2ae5c7af
Author: Janne Grunau <janne-libav@jannau.net>
Date:   Fri Apr 15 02:09:44 2011 +0200

    swscale: fix compilation with --enable-runtime-cpudetect

commit b6cad3df82
Author: Janne Grunau <janne-libav@jannau.net>
Date:   Fri Apr 15 00:31:04 2011 +0200

    swscale: correct include path to fix ppc altivec build

commit 6216fc70b7
Author: Luca Barbato <lu_zero@gentoo.org>
Date:   Thu Apr 14 22:03:45 2011 +0200

    swscale: simplify rgb2rgb templating

    MMX is always built. Drop the ifdefs

commit 33a0421bba
Author: Josh Allmann <joshua.allmann@gmail.com>
Date:   Wed Apr 13 20:57:32 2011 +0200

    swscale: simplify initialization code

    Simplify the fallthrough case when no accelerated functions
    can be initialized.

commit 735bf19511
Author: Josh Allmann <joshua.allmann@gmail.com>
Date:   Wed Apr 13 20:57:31 2011 +0200

    swscale: further cleanup swscale.c

    Move x86-specific constants out of swscale.c

commit 86330b4c92
Author: Luca Barbato <lu_zero@gentoo.org>
Date:   Wed Apr 13 20:57:30 2011 +0200

    swscale: partially move the arch specific code left

    PPC and x86 code is split off from swscale_template.c. Lots of code is
    still duplicated and should be removed later.

    Again uniformize the init system to be more similar to the dsputil one.

    Unset h*scale_fast in the x86 init in order to make the output
    consistent with the previous status. Thanks to Josh for spotting it.

commit c003832883
Author: Luca Barbato <lu_zero@gentoo.org>
Date:   Wed Apr 13 20:57:29 2011 +0200

    swscale: move away x86 specific code from rgb2rgb

    Keep only the plain C code in the main rgb2rgb.c and move the x86
    specific optimizations to x86/rgb2rgb.c
    Change the initialization pattern a little so some of it can be
    factorized to behave more like dsputils.

Conflicts:
	libswscale/rgb2rgb.c
	libswscale/swscale_template.c
2011-05-25 06:24:55 +02:00
Ronald S. Bultje
e66149e714 swscale: force --enable-runtime-cpudetect and remove SWS_CPU_CAPS_*.
2011-05-24 10:03:26 -04:00
Luca Barbato
6216fc70b7 swscale: simplify rgb2rgb templating
MMX is always built. Drop the ifdefs
2011-04-14 22:16:47 +02:00
Luca Barbato
c003832883 swscale: move away x86 specific code from rgb2rgb
Keep only the plain C code in the main rgb2rgb.c and move the x86
specific optimizations to x86/rgb2rgb.c
Change the initialization pattern a little so some of it can be
factorized to behave more like dsputils.
2011-04-14 22:16:47 +02:00