82 Commits

Author SHA1 Message Date
Rémi Denis-Courmont b6585eb04c lavu: add/use flag for RISC-V Zba extension
The code was blindly assuming that Zbb or V implied Zba. While the
former is practically always true, the latter broke some QEMU setups,
as V was introduced earlier than Zba.
2023-07-19 19:29:35 +03:00
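The distinction drawn by this commit can be sketched as follows. This is a minimal illustration, not the libavutil code: the DEMO_FLAG_* values and both helper functions are hypothetical and do not match the real AV_CPU_FLAG_* constants. The point is only that a code path needing Zba tests a dedicated bit instead of inferring it from Zbb or V.

```c
#include <stdbool.h>

/* Hypothetical flag bits for illustration only; the real values live in
 * libavutil/cpu.h and are not reproduced here. */
enum {
    DEMO_FLAG_RVV = 1 << 0, /* Vector extension */
    DEMO_FLAG_ZBB = 1 << 1, /* basic bit manipulation */
    DEMO_FLAG_ZBA = 1 << 2, /* address generation */
};

/* Flawed check: treats Zbb or V as proof that Zba is present. */
static bool demo_can_use_zba_implied(int flags)
{
    return (flags & (DEMO_FLAG_ZBB | DEMO_FLAG_RVV)) != 0;
}

/* Safer check: only the dedicated Zba bit counts. */
static bool demo_can_use_zba(int flags)
{
    return (flags & DEMO_FLAG_ZBA) != 0;
}
```

With a flag set containing only the vector bit, as in the QEMU setups mentioned above, the first helper wrongly reports Zba as usable while the second does not.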
Martin Storsjö 397cb623c8 aarch64: Add cpu flags for the dotprod and i8mm extensions
Set these flags as available if the extensions are unconditionally
available to the compiler.

Signed-off-by: Martin Storsjö <martin@martin.st>
2023-06-06 12:40:42 +03:00
Rémi Denis-Courmont 37d5ddc317 lavu/riscv: CPU flag for the Zbb extension
Unfortunately, it is common, and will remain so, that the bit
manipulation extensions are not enabled at compilation time. It is
official policy for Debian ports in general (though they do not
officially support RISC-V as of yet) to stick to the minimal target
baseline, which does not include the B extension or even its Zbb subset.

For inline helpers (CPOP, REV8), compiler builtins (CTZ, CLZ) or
even plain C code (MIN, MAX, MINU, MAXU), run-time detection seems
impractical. But at least it can work for the byte-swap DSP functions.
2022-10-05 08:26:19 +02:00
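Run-time detection for a byte-swap DSP function, as described above, typically means picking a function pointer once from the CPU flags. The sketch below assumes a hypothetical DEMO_FLAG_ZBB bit and hypothetical function names. The "Zbb" variant is only a stand-in that reuses the portable C loop, since a real one would be hand-written assembly using REV8.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_FLAG_ZBB (1 << 0) /* hypothetical bit, not the libavutil value */

typedef void (*demo_bswap32_buf_fn)(uint32_t *dst, const uint32_t *src,
                                    size_t len);

/* Portable fallback: swap the bytes of each 32-bit word. */
static void demo_bswap32_buf_c(uint32_t *dst, const uint32_t *src, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        uint32_t v = src[i];
        dst[i] = (v >> 24) | ((v >> 8) & 0x0000ff00u) |
                 ((v << 8) & 0x00ff0000u) | (v << 24);
    }
}

/* Stand-in for an assembly routine built on REV8; it simply calls the
 * C version here so that the sketch stays portable. */
static void demo_bswap32_buf_zbb(uint32_t *dst, const uint32_t *src,
                                 size_t len)
{
    demo_bswap32_buf_c(dst, src, len);
}

/* Choose the implementation once, based on flags detected at run time. */
static demo_bswap32_buf_fn demo_select_bswap32_buf(int cpu_flags)
{
    return (cpu_flags & DEMO_FLAG_ZBB) ? demo_bswap32_buf_zbb
                                       : demo_bswap32_buf_c;
}
```

This only works because the DSP function is called through a pointer. Inline helpers and compiler builtins are expanded at compile time, which is why the commit message calls run-time detection impractical for them.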
Rémi Denis-Courmont 0c0a3deb18 lavu/cpu: CPU flags for the RISC-V Vector extension
RVV defines a total of 12 different extensions, including:

- 5 different instruction subsets:
  - Zve32x: 8-, 16- and 32-bit integers,
  - Zve32f: Zve32x plus single precision floats,
  - Zve64x: Zve32x plus 64-bit integers,
  - Zve64f: Zve32f plus Zve64x,
  - Zve64d: Zve64f plus double precision floats.

- 6 different vector lengths:
  - Zvl32b (embedded only),
  - Zvl64b (embedded only),
  - Zvl128b,
  - Zvl256b,
  - Zvl512b,
  - Zvl1024b,

- and the V extension proper: equivalent to Zve64d and Zvl128b.

In total, there are 6 different possible sets of supported instructions
(including the empty set), but for convenience we allocate one bit for
each set of types: up-to-32-bit ints (RVV_I32), floats (RVV_F32),
64-bit ints (RVV_I64) and doubles (RVV_F64).

When the vector size is needed, it can be retrieved by reading the
unprivileged read-only vlenb CSR. This should probably be a separate
helper macro if needed at a later point.
2022-09-27 13:19:52 +02:00
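The mapping from the five instruction subsets to the four type bits can be sketched as below. The bit values, the enum and the function name are hypothetical and chosen for illustration. Only the flag names RVV_I32, RVV_F32, RVV_I64 and RVV_F64 come from the commit message.

```c
/* Hypothetical bit values; the real flag values are defined in
 * libavutil/cpu.h and are not reproduced here. */
enum {
    DEMO_RVV_I32 = 1 << 0, /* 8-, 16- and 32-bit integers */
    DEMO_RVV_F32 = 1 << 1, /* single precision floats */
    DEMO_RVV_I64 = 1 << 2, /* 64-bit integers */
    DEMO_RVV_F64 = 1 << 3, /* double precision floats */
};

/* The six possible instruction sets, including the empty one. */
enum demo_rvv_subset {
    DEMO_RVV_NONE,
    DEMO_ZVE32X,
    DEMO_ZVE32F,
    DEMO_ZVE64X,
    DEMO_ZVE64F,
    DEMO_ZVE64D,
};

/* Translate a subset into the type bits it provides. */
static int demo_rvv_flags(enum demo_rvv_subset subset)
{
    switch (subset) {
    case DEMO_ZVE32X:
        return DEMO_RVV_I32;
    case DEMO_ZVE32F:
        return DEMO_RVV_I32 | DEMO_RVV_F32;
    case DEMO_ZVE64X:
        return DEMO_RVV_I32 | DEMO_RVV_I64;
    case DEMO_ZVE64F:
        return DEMO_RVV_I32 | DEMO_RVV_F32 | DEMO_RVV_I64;
    case DEMO_ZVE64D:
        return DEMO_RVV_I32 | DEMO_RVV_F32 | DEMO_RVV_I64 | DEMO_RVV_F64;
    default:
        return 0; /* no vector support */
    }
}
```

Four bits could encode sixteen combinations, but only six are reachable here. The per-type layout trades that redundancy for simple checks such as testing one bit before using 64-bit integer vector code.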
Rémi Denis-Courmont b95e2fbd85 lavu/cpu: detect RISC-V base extensions
This introduces compile-time and run-time CPU detection on RISC-V. In
practice, I doubt that FFmpeg will ever see a RISC-V CPU without all of
I, F and D extensions, and if it does, it probably won't have run-time
detection. So the flags are essentially always set.

But as things stand, checkasm wants them that way. Compare the ARMV8
flag on AArch64. We are nowhere near running short on CPU flag bits.
2022-09-27 13:19:52 +02:00
Wu Jianhua f629ea2e18 avutil/cpu: add AVX512 Icelake flag
Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>
Reviewed-by: Henrik Gramner <henrik@gramner.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2022-03-10 16:45:48 -03:00
Alan Kelly ffbab99f2c libavutil/cpu: Add AV_CPU_FLAG_SLOW_GATHER.
This flag is set on Haswell and earlier and all AMD cpus.
2021-12-21 17:44:44 -03:00
Shiyou Yin 9a840ffa17 avutil: [loongarch] Add support for loongarch SIMD.
LSX and LASX are LoongArch SIMD extensions.
They are enabled by default if the compiler supports them, and can be
disabled with '--disable-lsx' and '--disable-lasx'.

Change-Id: Ie2608ea61dbd9b7fffadbf0ec2348bad6c124476
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Reviewed-by: guxiwei <guxiwei-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-12-15 18:37:40 +01:00
Thilo Borgmann c1bf56a526 lavu/cpu: Use av_cpu_ prefix 2021-07-20 10:31:41 +02:00
Thilo Borgmann 87951dcbe7 lavu/cpu.c: Add av_force_cpu_count() to override auto-detection. 2021-07-16 10:06:10 +02:00
Andreas Rheinhardt d40bb518b5 avutil/cpu: Remove deprecated functions
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2021-04-27 10:43:13 -03:00
Andreas Rheinhardt 7368e5537d avutil/cpu: Schedule deprecated functions for removal
av_set_cpu_flags_mask() has been deprecated in the commit which merged
it: 6df42f98746be06c883ce683563e07c9a2af983f; av_parse_cpu_flags() has
been deprecated in 4b529edff8.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-04-19 14:34:19 +02:00
Jiaxun Yang e387fcd01c libavutil: Detect MMI and MSA flags for MIPS
Add MMI & MSA runtime detection for MIPS.

Basically there are two code paths. For systems that
natively support the CPUCFG instruction, or whose kernel emulates
that instruction, we'll sense this feature from HWCAP and
report the flags according to the values read from CPUCFG. For
systems that have no CPUCFG (or do not export it in HWCAP),
we'll parse /proc/cpuinfo instead.

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2020-07-23 17:21:58 +02:00
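The /proc/cpuinfo fallback amounts to finding a line by its key and looking for a feature token after the colon. The helper below is a rough sketch of that idea and not the parser from this commit: the function name is hypothetical, it works on an in-memory string rather than the file, and the key and token to search for are left to the caller because the exact strings the kernel prints are an assumption here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if some line of `text` starts with `key` and lists `token`
 * as a whole whitespace-separated word after the first ':'. */
static bool demo_cpuinfo_has_token(const char *text, const char *key,
                                   const char *token)
{
    size_t key_len = strlen(key), tok_len = strlen(token);

    if (tok_len == 0)
        return false;

    for (const char *line = text; line && *line; ) {
        const char *eol = strchr(line, '\n');
        size_t line_len = eol ? (size_t)(eol - line) : strlen(line);
        const char *end = line + line_len;

        if (line_len >= key_len && !strncmp(line, key, key_len)) {
            const char *p = memchr(line, ':', line_len);

            if (p) {
                p++; /* step past the colon */
                while (p < end) {
                    const char *start;

                    /* Skip separators, then measure one word. */
                    while (p < end && (*p == ' ' || *p == '\t'))
                        p++;
                    start = p;
                    while (p < end && *p != ' ' && *p != '\t')
                        p++;
                    if ((size_t)(p - start) == tok_len &&
                        !memcmp(start, token, tok_len))
                        return true;
                }
            }
        }
        line = eol ? eol + 1 : NULL;
    }
    return false;
}
```

Matching whole words avoids a false positive when one feature name is a prefix of another. A caller would try the CPUCFG path first and fall back to a helper like this only when HWCAP does not advertise CPUCFG.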
James Darnley 8b81eabe57 avutil: add AVX-512 flags 2017-12-24 22:02:41 +01:00
James Almer 522f877086 Merge commit 'e6bff23f1e11aefb16a2b5d6ee72bf7469c5a66e'
* commit 'e6bff23f1e11aefb16a2b5d6ee72bf7469c5a66e':
  cpu: add a function for querying maximum required data alignment

Adapted to work with the arbitrary runtime cpuflag changes av_force_cpu_flags()
can generate.

Merged-by: James Almer <jamrial@gmail.com>
2017-09-27 23:03:57 -03:00
Anton Khirnov e6bff23f1e cpu: add a function for querying maximum required data alignment 2017-02-11 11:37:45 +01:00
James Almer 2eab48177d Merge commit '7d7355aa92bb36ca0765c49a569a999bcb96f332'
* commit '7d7355aa92bb36ca0765c49a569a999bcb96f332':
  x86: Add SSSE3_SLOW CPU flag and related convenience macros

Merged-by: James Almer <jamrial@gmail.com>
2017-01-31 15:17:19 -03:00
Wan-Teh Chang fed50c4304 avutil: fix data race in av_get_cpu_flags()
Make the one-time initialization in av_get_cpu_flags() thread-safe. The
static variable |cpu_flags| in libavutil/cpu.c is read and written using
normal load and store operations. These are considered as data races.
The fix is to use atomic load and store operations.

The fix can be verified by running the libavutil/tests/cpu_init.c test
program under ThreadSanitizer:
    ./configure --toolchain=clang-tsan
    make libavutil/tests/cpu_init
    libavutil/tests/cpu_init

There should be no warnings from ThreadSanitizer.

Co-author: Dmitry Vyukov of Google, who suggested the data race fix.

Signed-off-by: Wan-Teh Chang <wtc@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-12-13 00:07:20 +01:00
Wan-Teh Chang 2170017a1c avutil: fix data race in av_get_cpu_flags()
Make the one-time initialization in av_get_cpu_flags() thread-safe. The
static variables |flags|, |cpuflags_mask|, and |checked| in
libavutil/cpu.c are read and written using normal load and store
operations. These are considered as data races. The fix is to use atomic
load and store operations.

Remove the |checked| variable because the invalid value of -1 for
|flags| can be used to indicate the same condition. Rename |flags| to
|cpu_flags| and move it to file scope.

The fix can be verified by running the libavutil/tests/cpu_init.c test
program under ThreadSanitizer:
    ./configure --toolchain=clang-tsan
    make libavutil/tests/cpu_init
    libavutil/tests/cpu_init

There should be no warnings from ThreadSanitizer.

Co-author: Dmitry Vyukov of Google, who suggested the data race fix.

Signed-off-by: Wan-Teh Chang <wtc@google.com>
2016-12-08 15:53:58 -05:00
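The pattern described in this commit message, a file-scope flags variable accessed with atomic loads and stores and using -1 as the "not yet detected" value, can be sketched with C11 atomics. The names below are hypothetical, the detection routine is a placeholder returning a fixed value, and the relaxed memory order is an assumption of this sketch rather than something the message states.

```c
#include <stdatomic.h>

/* -1 means "not detected yet", which replaces a separate |checked| flag. */
static atomic_int demo_cpu_flags = -1;

/* Placeholder for real CPU detection; it must return the same value on
 * every call so that concurrent first calls agree. */
static int demo_detect_cpu_flags(void)
{
    return 0x5;
}

/* Thread-safe lazy initialization. Two threads may both run detection on
 * the first call, but they store the same value, so there is no data race
 * and no inconsistent result. */
static int demo_get_cpu_flags(void)
{
    int flags = atomic_load_explicit(&demo_cpu_flags, memory_order_relaxed);

    if (flags == -1) {
        flags = demo_detect_cpu_flags();
        atomic_store_explicit(&demo_cpu_flags, flags, memory_order_relaxed);
    }
    return flags;
}

/* Override the detected flags, in the spirit of av_force_cpu_flags(). */
static void demo_force_cpu_flags(int flags)
{
    atomic_store_explicit(&demo_cpu_flags, flags, memory_order_relaxed);
}
```

No lock or once-primitive is needed because the stored value is a single integer and detection is idempotent. If detection had side effects, a stronger mechanism would be required.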
Diego Biurrun 7d7355aa92 x86: Add SSSE3_SLOW CPU flag and related convenience macros 2016-07-20 18:43:28 +02:00
Lou Logan 06eef96b69 fix some a/an typos
Signed-off-by: Lou Logan <lou@lrcd.com>
2016-03-28 14:13:17 -08:00
Hendrik Leppkes e754c8e8ca Merge commit 'e2710e790c09e49e86baa58c6063af0097cc8cb0'
* commit 'e2710e790c09e49e86baa58c6063af0097cc8cb0':
  arm: add a cpu flag for the VFPv2 vector mode

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-01-02 11:01:29 +01:00
Janne Grunau e2710e790c arm: add a cpu flag for the VFPv2 vector mode
The vector mode was deprecated in ARMv7-A/VFPv3 and various CPU
implementations do not support it in hardware. Vector mode code will,
depending on the OS, either be emulated in software or result in an
illegal instruction on CPUs which do not support it. This was not really
a problem in practice since NEON implementations of the same functions
are preferred. It will however become a problem for checkasm, which
tests every CPU flag separately.

Since this is a CPU feature which newer CPUs no longer support, the
behaviour of this flag differs from the other flags. It can only be
activated by runtime CPU feature selection.
2015-12-14 16:42:35 +01:00
Rodger Combs 1e477a970f lavu: add AESNI CPU flag 2015-10-28 04:23:14 -05:00
Hendrik Leppkes f1b02e6ca8 lavu/cpu: remove old cmov cruft 2015-09-05 17:23:28 +02:00
Vittorio Giovara bf7114b6ca lavu: Drop deprecated AV_CPU_FLAG_MMX2 symbol
Deprecated in 11/2012.
2015-08-28 16:04:27 +02:00
Michael Niedermayer 58a4204873 Merge commit '7d07ee5a9bd170a06d26fd967cf8de5d3b1ce331'
* commit '7d07ee5a9bd170a06d26fd967cf8de5d3b1ce331':
  ppc: cpu: Add support for VSX and POWER8 extensions

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2015-05-31 22:47:50 +02:00
Luca Barbato 7d07ee5a9b ppc: cpu: Add support for VSX and POWER8 extensions 2015-05-31 12:07:11 +02:00
James Almer f7cafb5d02 x86: add AV_CPU_FLAG_AVXSLOW flag
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2015-05-31 12:07:11 +02:00
James Almer c312bfac4c x86/cpu: add AV_CPU_FLAG_AVXSLOW flag
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-05-27 03:31:11 -03:00
Michael Niedermayer 1e519b9d40 avutil: turn arm setend into a cpuflag
This allows disabling and enabling it.
It also prevents crashes if vfpv3 and neon are disabled, which previously
would have enabled the flag.

And last but not least, one can enable setend on CPUs like the Cortex-A8,
where it is fast but disabled by default.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-13 14:50:15 +02:00
Michael Niedermayer 4c57c6a765 Merge commit '8675bcb0addb1c7fb0b04682d1f3f95d5b8dae14'
* commit '8675bcb0addb1c7fb0b04682d1f3f95d5b8dae14':
  aarch64: add armv8 CPU flag

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-07 02:15:18 +02:00
Janne Grunau 8675bcb0ad aarch64: add armv8 CPU flag 2014-04-06 21:18:49 +02:00
James Almer d59fcdaff3 x86: add detection for Bit Manipulation Instruction sets
Based on x264 code

Signed-off-by: James Almer <jamrial@gmail.com>
2014-02-23 15:29:36 +01:00
James Almer 1b932eb150 x86: add detection for FMA3 instruction set
Based on x264 code

Signed-off-by: James Almer <jamrial@gmail.com>
2014-02-23 15:29:36 +01:00
James Almer 0bc3de19ff x86: add detection for Bit Manipulation Instruction sets
Based on x264 code

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-22 17:26:00 +01:00
James Almer a2af8eddab x86: add detection for FMA3 instruction set
Based on x264 code

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-22 17:25:52 +01:00
Michael Niedermayer a665704402 Merge commit '4d6ee0725553a43ba88d6f8327ebcf8f1c5ae8d4'
* commit '4d6ee0725553a43ba88d6f8327ebcf8f1c5ae8d4':
  libavutil: x86: Add AVX2 capable CPU detection.

Conflicts:
	libavutil/cpu.c
	libavutil/cpu.h
	libavutil/x86/cpu.c

See: 865b70bc5d
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-26 02:36:36 +02:00
Kieran Kunhya 865b70bc5d Add AVX2 capable CPU detection. Patch based on x264's AVX2 detection
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-26 02:34:22 +02:00
Kieran Kunhya 4d6ee07255 libavutil: x86: Add AVX2 capable CPU detection.
Patch based on x264's AVX2 detection

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-25 19:36:55 +01:00
Michael Niedermayer 1e67800780 Merge commit '80fefbed623491b92fe59ead99225f99c0d0ca08'
* commit '80fefbed623491b92fe59ead99225f99c0d0ca08':
  x86: cpu: Restore some explanatory comments removed in 7160bb7

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-04 11:28:21 +02:00
Diego Biurrun 80fefbed62 x86: cpu: Restore some explanatory comments removed in 7160bb7 2013-10-03 23:00:09 +02:00
Michael Niedermayer c83d794936 Merge commit 'b78b10c4b78b696927f2801cf2d9f193b4eff28b'
* commit 'b78b10c4b78b696927f2801cf2d9f193b4eff28b':
  avutil: Move internal CPU detection function declarations to private header

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-29 14:05:15 +02:00
Diego Biurrun b78b10c4b7 avutil: Move internal CPU detection function declarations to private header 2013-08-28 23:54:14 +02:00
Michael Niedermayer fe40a9f98f Merge commit '2a6eaeaa85d17b27ee0dd449183ec197c35c9675'
* commit '2a6eaeaa85d17b27ee0dd449183ec197c35c9675':
  Move get_logical_cpus() from lavc/pthread to lavu/cpu.

Conflicts:
	doc/APIchanges
	libavcodec/pthread.c
	libavutil/cpu.c
	libavutil/cpu.h
	libavutil/version.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-05-24 13:24:28 +02:00
Anton Khirnov 2a6eaeaa85 Move get_logical_cpus() from lavc/pthread to lavu/cpu.
It will be useful in lavfi, and could conceivably be useful to the user
applications as well.
2013-05-24 09:28:00 +02:00
Janne Grunau 8f5587c3d0 cpu.h: define AV_CPU_FLAG_MMX2 for libavutil major 52 2012-11-16 15:04:48 +01:00
Michael Niedermayer 64604e2679 cpu: improve av_get_cpu_flags() doxy
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-10-01 16:41:37 +02:00
Michael Niedermayer 4b529edff8 deprecate av_parse_cpu_flags
This function is problematic in several ways; it is also quite
unpredictable which flags it ends up turning on.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-16 22:04:46 +02:00
Michael Niedermayer e776ee8f29 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  lavr: fix handling of custom mix matrices
  fate: force pix_fmt in lagarith-rgb32 test
  fate: add tests for lagarith lossless video codec.
  ARMv6: vp8: fix stack allocation with Apple's assembler
  ARM: vp56: allow inline asm to build with clang
  fft: 3dnow: fix register name typo in DECL_IMDCT macro
  x86: dct32: port to cpuflags
  x86: build: replace mmx2 by mmxext
  Revert "wmapro: prevent division by zero when sample rate is unspecified"
  wmapro: prevent division by zero when sample rate is unspecified
  lagarith: fix color plane inversion for YUY2 output.
  lagarith: pad RGB buffer by 1 byte.
  dsputil: make add_hfyu_left_prediction_sse4() support unaligned src.

Conflicts:
	doc/APIchanges
	libavcodec/lagarith.c
	libavfilter/x86/gradfun.c
	libavutil/cpu.h
	libavutil/version.h
	libswscale/utils.c
	libswscale/version.h
	libswscale/x86/yuv2rgb.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-04 23:51:43 +02:00