Commit Graph

85 Commits

Author SHA1 Message Date
Andreas Rheinhardt c00cd007e8 configure: Remove av_restrict
All versions of MSVC that support C11 (namely >= v19.27)
also support the restrict keyword, therefore av_restrict
is no longer necessary since 75697836b1.

Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-15 12:51:15 +01:00
Junxian Zhu 5ffe18bcea
mips: fix build fail on MIPS R6
Add macro define to avoid causing build fail with incompatible assembler code on MIPS R6.

Signed-off-by: Junxian Zhu <zhujunxian@oss.cipunited.com>
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-03-26 01:46:39 +01:00
Jiaxun Yang a1cd62883f avutil/mips: Use $at as MMI macro temporary register
Some function had exceed 30 inline assembly register oprands limiation
when using LOONGSON2 version of MMI macros. We can avoid that by take
$at, which is register reserved for assembler, as temporary register.

As none of instructions used in these macros is pseudo, it is safe to
utilize $at here.

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-07-28 23:31:48 +02:00
Jiaxun Yang b868272d7e avutil/mips: Use MMI_{L, S}QC1 macro in {SAVE, RECOVER}_REG
{SAVE,RECOVER}_REG will be available for Loongson2 again,
also comment about the magic.

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-07-28 23:31:48 +02:00
Jin Bo fd5fd48659 libavcodec/mips: Fix build errors reported by clang
Clang is more strict on the type of asm operands, float or double
type variable should use constraint 'f', integer variable should
use constraint 'r'.

Signed-off-by: Jin Bo <jinbo@loongson.cn>
Reviewed-by: yinshiyou-hf@loongson.cn
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-06-03 13:44:00 +02:00
Shiyou Yin ab04fedaaa mips: Fix potential illegal instruction error.
MSA2 optimizations are attached to MSA macros in generic_macros_msa.h.
It's difficult to do runtime check for them. Remove this part of code
can make it more robust. H264 1080p decoding: 5.13x==>5.12x.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-05-07 17:53:23 +02:00
Andreas Rheinhardt f3c197b129 Include attributes.h directly
Some files currently rely on libavutil/cpu.h to include it for them;
yet said file won't use include it any more after the currently
deprecated functions are removed, so include attributes.h directly.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-04-19 14:34:10 +02:00
Anton Khirnov c8c2dfbc37 lavu: move LOCAL_ALIGNED from internal.h to mem_internal.h
That is a more appropriate place for it.
2021-01-01 14:11:01 +01:00
Shiyou Yin 0e0a9ca048 avutil/mips/generic_macros_msa: Fix prob that 'ulw' and 'uld' unsupported by clang.
GCC support these two synthesized instruction, but clang does not yet.
Use machine instruction instead to adapt clang compiler.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2020-07-30 00:23:45 +02:00
Jiaxun Yang e387fcd01c libavutil: Detect MMI and MSA flags for MIPS
Add MMI & MSA runtime detection for MIPS.

Basically there are two code pathes. For systems that
natively support CPUCFG instruction or kernel emulated
that instruction, we'll sense this feature from HWCAP and
report the flags according to values grab from CPUCFG. For
systems that have no CPUCFG (or not export it in HWCAP),
we'll parse /proc/cpuinfo instead.

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2020-07-23 17:21:58 +02:00
Jiaxun Yang d5380f068d libavutils: Add parse_r helper for MIPS
That helper grab from kernel code can allow us to inline
newer instructions (not implemented by the assembler) in
a elegant manner.

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2020-07-23 17:17:05 +02:00
gxw 648b422e17 avcodec/mips: msa optimizations for vc1dsp
Performance of WMV3 decoding has speed up from 3.66x to 5.23x tested on 3A4000.

Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-10-30 18:09:00 +01:00
gxw 92fc0bfa54 avutil/mips: refactor msa SLDI_Bn_0 and SLDI_Bn macros.
Changing details as following:
1. The previous order of parameters are irregular and difficult to
   understand. Adjust the order of the parameters according to the
   rule: (RTYPE, input registers, input mask/input index/..., output registers).
   Most of the existing msa macros follow the rule.
2. Remove the redundant macro SLDI_Bn_0 and use SLDI_Bn instead.

Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-09-16 00:04:18 +02:00
Shiyou Yin e1039b09c4 avutil/mips: remove redundant code in TRANSPOSE16x8_UB_UB.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-08-15 01:26:21 +02:00
gxw a3e572d96f avutil/mips: refine msa macros CLIP_*.
Changing details as following:
1. Remove the local variable 'out_m' in 'CLIP_SH' and store the result in
   source vector.
2. Refine the implementation of macro 'CLIP_SH_0_255' and 'CLIP_SW_0_255'.
   Performance of VP8 decoding has speed up about 1.1%(from 7.03x to 7.11x).
   Performance of H264 decoding has speed up about 0.5%(from 4.35x to 4.37x).
   Performance of Theora decoding has speed up about 0.7%(from 5.79x to 5.83x).
3. Remove redundant macro 'CLIP_SH/Wn_0_255_MAX_SATU' and use 'CLIP_SH/Wn_0_255'
   instead, because there are no difference in the effect of this two macros.

Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-08-13 16:48:38 +02:00
Shiyou Yin 11f99a9a45 avutil/mips: Avoid instruction exception caused by gssqc1/gslqc1.
Ensure the address accesed by gssqc1/gslqc1 are 16-byte aligned.
2019-08-02 19:01:51 +02:00
Shiyou Yin 153c607525 avutil/mips: refactor msa load and store macros.
Replace STnxm_UB and LDnxm_SH with new macros ST_{H/W/D}{1/2/4/8}.
The old macros are difficult to use because they don't follow the same parameter passing rules.
Changing details as following:
1. remove LD4x4_SH.
2. replace ST2x4_UB with ST_H4.
3. replace ST4x2_UB with ST_W2.
4. replace ST4x4_UB with ST_W4.
5. replace ST4x8_UB with ST_W8.
6. replace ST6x4_UB with ST_W2 and ST_H2.
7. replace ST8x1_UB with ST_D1.
8. replace ST8x2_UB with ST_D2.
9. replace ST8x4_UB with ST_D4.
10. replace ST8x8_UB with ST_D8.
11. replace ST12x4_UB with ST_D4 and ST_W4.

Examples of new macro: ST_H4(in, idx0, idx1, idx2, idx3, pdst, stride)
ST_H4 store four half-word elements in vector 'in' to pdst with stride.
About the macro name:
1) 'ST' means store operation.
2) 'H/W/D' means type of vector element is 'half-word/word/double-word'.
3) Number '1/2/4/8' means how many elements will be stored.
About the macro parameter:
1) 'in0, in1...' 128-bits vector.
2) 'idx0, idx1...' elements index.
3) 'pdst' destination pointer to store to
4) 'stride' stride of each store operation.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-07-19 01:23:23 +02:00
Shiyou Yin a45e8ade2d avutil/mips: optimize UNPCK&SAD macros with MSA2.0 instruction.
Loongson 3A4000 and 2k1000 has supported MSA2.0.
This patch optimized SAD_UB2_UH,UNPCK_R_SH_SW,UNPCK_SB_SH and UNPCK_SH_SW with MSA2.0 instruction.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-07-10 12:54:57 +02:00
gxw 4571c7c05d avcodec/mips: [loongson] mmi optimizations for VP9 put and avg functions
VP9 decoding speed improved about 60.5%(from 38fps to 61fps, tested on loongson 3A3000).

Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-02-27 01:51:40 +01:00
Shiyou Yin 6d19164811 avcodec/mips: [loongson] optimize put_hevc_qpel_hv_8 with mmi.
Optimize put_hevc_qpel_hv_8 with mmi in the case width=4/8/12/16/24/32/48/64.
This optimization improved HEVC decoding performance 11%(1.81x to 2.01x, tested on loongson 3A3000).

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-01-22 00:46:36 +01:00
Shiyou Yin 5161f7bcfd avutil/mips: [loongson] simplify macro TRANSPOSE_4H and TRANSPOSE_8B
Simplify macro TRANSPOSE_4H in mmiutils.h and add TRANSPOSE_8B as a common macro.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2018-09-09 12:01:07 +02:00
gxw 090647da84 avcodec/mips: [loongson] optimize vp8 decoding in vp8dsp.
Optimize vp8 loop filter with mmi, four functions optimized:
1. ff_vp8_h_loop_filter8uv_mmi.
2. ff_vp8_v_loop_filter8uv_mmi.
3. ff_vp8_h_loop_filter16_mmi.
4. ff_vp8_v_loop_filter16_mmi.

Vp8 decoding speed improved about 50%(from 73fps to 110fps, Tested on loongson 3A3000).

Signed-off-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2018-09-09 12:01:07 +02:00
Shiyou Yin df13b75aa1 avcodec/mips: [loongson] reoptimize simple idct with mmi.
Performance of mpeg4 decoding improved about 23%(from 128fps to 158fps, tested on loongson 3A3000).
Reoptimized following functions with mmi.
1. ff_simple_idct_put_8_mmi
2. ff_simple_idct_add_8_mmi
3. ff_simple_idct_8_mmi

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2018-09-02 03:37:32 +02:00
Kaustubh Raste 736a48901f avcodec/mips: Improve hevc bi weighted hv mc msa functions
Use immediate unsigned saturation for clip to max saving one vector register.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-10-25 21:50:37 +02:00
Kaustubh Raste af9433b1d6 avcodec/mips: Improve avc bi-weighted mc msa functions
Replace generic with block size specific function.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-10-10 23:58:41 +02:00
Kaustubh Raste 10ab5534e0 avcodec/mips: Improve avc weighted mc msa functions
Replace generic with block size specific function.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-09-27 21:15:57 +02:00
Kaustubh Raste 7f8417f226 avcodec/mips: Improve hevc uni-w copy mc msa functions
Load the specific destination bytes instead of MSA load and pack.
Pack the data to half word before clipping.
Use immediate unsigned saturation for clip to max saving one vector register.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-09-24 02:33:48 +02:00
Kaustubh Raste 1a85fb7e1e avcodec/mips: Improve hevc sao band filter msa functions
Preload data in band filter 0-8 for better pipeline parallelization.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-09-15 22:36:42 +02:00
Kaustubh Raste 9b2c3c406f avcodec/mips: Improve vp9 mc msa functions
Load the specific destination bytes instead of MSA load and pack.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-09-08 13:48:40 +02:00
Kaustubh Raste a776cb2074 libavcodec/mips: Optimize avc idct 4x4 for msa
Removed memset call and improved performance.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-07-25 22:04:34 +02:00
Kaustubh Raste ef1b4bdf44 libavutil/mips: Updated msa generic macros
Reduced msa load-store code.
Removed inline asm of GP load-store for 64 bit.
Updated variable names in GP load-store macros for naming consistency.
Corrected macro descriptions.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-07-21 17:37:05 +02:00
Zhou Xiaoyong b9cd922660 avutil/mips: loongson add mmi utils header file
1.mmiutils.h defined MMI_ load/store macros for loongson2e/2f/3a
2.mmiutils.h defined some mmi assembly macors

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-10-23 03:23:09 +02:00
Shivraj Patil c1cc13cd2a avutil/mips/generic_macros_msa: rename macro variable which causes segfault for mips r6
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-10-05 23:44:03 +02:00
ZhouXiaoyong d680ab1c46 avutil/mips: header asmdefs.h add some PTR_ macros for loongson
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-05-14 04:46:52 +02:00
Vicente Olivert Riera ad16eff64b mips: add support for R6
Understanding the mips32r6 and mips64r6 ISAs in the configure script is
not enough. In order to have full support for MIPS R6 in FFmpeg we need
to be able to build it, and for that we need to make sure we don't use
incompatible assembler code which makes the build fail. Ifdefing the
offending code is sufficient to fix the problem.

Signed-off-by: Vicente Olivert Riera <Vincent.Riera@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-03-09 20:05:04 +01:00
Timothy Gu 180f9a0958 all: Make header guard names consistent 2016-01-31 15:44:11 -08:00
Shivraj Patil fd7eadd25c avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for VP9 lpf functions
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-23 16:52:18 +02:00
Shivraj Patil d12f76ffbb avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for idctdsp functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for idctdsp functions in new file idctdsp_msa.c and simple_idct_msa.c

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-07-07 14:35:15 +02:00
Shivraj Patil 709bb45c66 avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for me_cmp functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for me_cmp functions in new file me_cmp_msa.c

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-07-06 18:25:14 +02:00
Shivraj Patil 2f3f98af2b avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for mpegvideoencdsp functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for mpegvideoencdsp functions in new file mpegvideoencdsp_msa.c

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-07-06 18:25:01 +02:00
Shivraj Patil 2eb28e889d avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for mpegvideo functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for mpegvideo functions in new file mpegvideo_msa.c

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-07-01 17:32:47 +02:00
Shivraj Patil d9deae04a7 avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for pixblock functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for pixblock functions in new file pixblockdsp_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-06-29 12:03:43 +02:00
Shivraj Patil ee3ef5fda2 avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for hpel functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for hpel functions in new file hpeldsp_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-06-19 14:00:12 +02:00
Shivraj Patil 98eb1ac901 avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for qpel functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for qpel functions in new file qpeldsp_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-06-18 12:33:15 +02:00
Shivraj Patil 178ba1fd03 avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for AVC qpel functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for AVC qpel functions in new file h264qpel_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h

Added const to local static array.

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-06-13 02:21:55 +02:00
Shivraj Patil fb92f3ecb4 avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for AVC idct functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for AVC idct functions in new file h264idct_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-06-11 17:10:45 +02:00
Shivraj Patil 1d70b6fe1d avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for AVC intra prediction functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for AVC intra prediction functions in new file h264pred_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-06-11 17:10:41 +02:00
Shivraj Patil b87dc70c65 avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for AVC chroma mc functions
s patch adds MSA (MIPS-SIMD-Arch) optimizations for AVC chroma mc functions in new file h264chroma_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-06-11 12:24:02 +02:00
Shivraj Patil d6d98237ed avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for HEVC intra prediction functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for HEVC intra predition functions in new file hevcpred_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-06-10 13:53:03 +02:00
Shivraj Patil 271195f85b avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for HEVC loop filter and sao functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for HEVC loop filter and sao functions in new file hevc_lpf_sao_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h

In this patch, in comparision with previous patch, duplicated c functions are removed.

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-06-10 13:14:50 +02:00