Commit Graph

43 Commits

Author SHA1 Message Date
Junxian Zhu 5ffe18bcea
mips: fix build fail on MIPS R6
Add macro define to avoid causing build fail with incompatible assembler code on MIPS R6.

Signed-off-by: Junxian Zhu <zhujunxian@oss.cipunited.com>
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-03-26 01:46:39 +01:00
Shiyou Yin ab04fedaaa mips: Fix potential illegal instruction error.
MSA2 optimizations are attached to MSA macros in generic_macros_msa.h.
It's difficult to do runtime check for them. Remove this part of code
can make it more robust. H264 1080p decoding: 5.13x==>5.12x.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-05-07 17:53:23 +02:00
Shiyou Yin 0e0a9ca048 avutil/mips/generic_macros_msa: Fix prob that 'ulw' and 'uld' unsupported by clang.
GCC support these two synthesized instruction, but clang does not yet.
Use machine instruction instead to adapt clang compiler.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2020-07-30 00:23:45 +02:00
gxw 648b422e17 avcodec/mips: msa optimizations for vc1dsp
Performance of WMV3 decoding has speed up from 3.66x to 5.23x tested on 3A4000.

Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-10-30 18:09:00 +01:00
gxw 92fc0bfa54 avutil/mips: refactor msa SLDI_Bn_0 and SLDI_Bn macros.
Changing details as following:
1. The previous order of parameters are irregular and difficult to
   understand. Adjust the order of the parameters according to the
   rule: (RTYPE, input registers, input mask/input index/..., output registers).
   Most of the existing msa macros follow the rule.
2. Remove the redundant macro SLDI_Bn_0 and use SLDI_Bn instead.

Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-09-16 00:04:18 +02:00
Shiyou Yin e1039b09c4 avutil/mips: remove redundant code in TRANSPOSE16x8_UB_UB.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-08-15 01:26:21 +02:00
gxw a3e572d96f avutil/mips: refine msa macros CLIP_*.
Changing details as following:
1. Remove the local variable 'out_m' in 'CLIP_SH' and store the result in
   source vector.
2. Refine the implementation of macro 'CLIP_SH_0_255' and 'CLIP_SW_0_255'.
   Performance of VP8 decoding has speed up about 1.1%(from 7.03x to 7.11x).
   Performance of H264 decoding has speed up about 0.5%(from 4.35x to 4.37x).
   Performance of Theora decoding has speed up about 0.7%(from 5.79x to 5.83x).
3. Remove redundant macro 'CLIP_SH/Wn_0_255_MAX_SATU' and use 'CLIP_SH/Wn_0_255'
   instead, because there are no difference in the effect of this two macros.

Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-08-13 16:48:38 +02:00
Shiyou Yin 153c607525 avutil/mips: refactor msa load and store macros.
Replace STnxm_UB and LDnxm_SH with new macros ST_{H/W/D}{1/2/4/8}.
The old macros are difficult to use because they don't follow the same parameter passing rules.
Changing details as following:
1. remove LD4x4_SH.
2. replace ST2x4_UB with ST_H4.
3. replace ST4x2_UB with ST_W2.
4. replace ST4x4_UB with ST_W4.
5. replace ST4x8_UB with ST_W8.
6. replace ST6x4_UB with ST_W2 and ST_H2.
7. replace ST8x1_UB with ST_D1.
8. replace ST8x2_UB with ST_D2.
9. replace ST8x4_UB with ST_D4.
10. replace ST8x8_UB with ST_D8.
11. replace ST12x4_UB with ST_D4 and ST_W4.

Examples of new macro: ST_H4(in, idx0, idx1, idx2, idx3, pdst, stride)
ST_H4 store four half-word elements in vector 'in' to pdst with stride.
About the macro name:
1) 'ST' means store operation.
2) 'H/W/D' means type of vector element is 'half-word/word/double-word'.
3) Number '1/2/4/8' means how many elements will be stored.
About the macro parameter:
1) 'in0, in1...' 128-bits vector.
2) 'idx0, idx1...' elements index.
3) 'pdst' destination pointer to store to
4) 'stride' stride of each store operation.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-07-19 01:23:23 +02:00
Shiyou Yin a45e8ade2d avutil/mips: optimize UNPCK&SAD macros with MSA2.0 instruction.
Loongson 3A4000 and 2k1000 has supported MSA2.0.
This patch optimized SAD_UB2_UH,UNPCK_R_SH_SW,UNPCK_SB_SH and UNPCK_SH_SW with MSA2.0 instruction.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-07-10 12:54:57 +02:00
Kaustubh Raste 736a48901f avcodec/mips: Improve hevc bi weighted hv mc msa functions
Use immediate unsigned saturation for clip to max saving one vector register.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-10-25 21:50:37 +02:00
Kaustubh Raste af9433b1d6 avcodec/mips: Improve avc bi-weighted mc msa functions
Replace generic with block size specific function.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-10-10 23:58:41 +02:00
Kaustubh Raste 10ab5534e0 avcodec/mips: Improve avc weighted mc msa functions
Replace generic with block size specific function.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-09-27 21:15:57 +02:00
Kaustubh Raste 7f8417f226 avcodec/mips: Improve hevc uni-w copy mc msa functions
Load the specific destination bytes instead of MSA load and pack.
Pack the data to half word before clipping.
Use immediate unsigned saturation for clip to max saving one vector register.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-09-24 02:33:48 +02:00
Kaustubh Raste 1a85fb7e1e avcodec/mips: Improve hevc sao band filter msa functions
Preload data in band filter 0-8 for better pipeline parallelization.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-09-15 22:36:42 +02:00
Kaustubh Raste 9b2c3c406f avcodec/mips: Improve vp9 mc msa functions
Load the specific destination bytes instead of MSA load and pack.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-09-08 13:48:40 +02:00
Kaustubh Raste a776cb2074 libavcodec/mips: Optimize avc idct 4x4 for msa
Removed memset call and improved performance.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-07-25 22:04:34 +02:00
Kaustubh Raste ef1b4bdf44 libavutil/mips: Updated msa generic macros
Reduced msa load-store code.
Removed inline asm of GP load-store for 64 bit.
Updated variable names in GP load-store macros for naming consistency.
Corrected macro descriptions.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-07-21 17:37:05 +02:00
Shivraj Patil c1cc13cd2a avutil/mips/generic_macros_msa: rename macro variable which causes segfault for mips r6
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-10-05 23:44:03 +02:00
Shivraj Patil fd7eadd25c avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for VP9 lpf functions
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-23 16:52:18 +02:00
Shivraj Patil d12f76ffbb avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for idctdsp functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for idctdsp functions in new file idctdsp_msa.c and simple_idct_msa.c

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-07-07 14:35:15 +02:00
Shivraj Patil 709bb45c66 avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for me_cmp functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for me_cmp functions in new file me_cmp_msa.c

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-07-06 18:25:14 +02:00
Shivraj Patil 2f3f98af2b avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for mpegvideoencdsp functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for mpegvideoencdsp functions in new file mpegvideoencdsp_msa.c

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-07-06 18:25:01 +02:00
Shivraj Patil 2eb28e889d avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for mpegvideo functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for mpegvideo functions in new file mpegvideo_msa.c

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-07-01 17:32:47 +02:00
Shivraj Patil d9deae04a7 avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for pixblock functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for pixblock functions in new file pixblockdsp_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-06-29 12:03:43 +02:00
Shivraj Patil ee3ef5fda2 avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for hpel functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for hpel functions in new file hpeldsp_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-06-19 14:00:12 +02:00
Shivraj Patil 98eb1ac901 avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for qpel functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for qpel functions in new file qpeldsp_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-06-18 12:33:15 +02:00
Shivraj Patil 178ba1fd03 avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for AVC qpel functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for AVC qpel functions in new file h264qpel_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h

Added const to local static array.

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-06-13 02:21:55 +02:00
Shivraj Patil fb92f3ecb4 avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for AVC idct functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for AVC idct functions in new file h264idct_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-06-11 17:10:45 +02:00
Shivraj Patil 1d70b6fe1d avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for AVC intra prediction functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for AVC intra prediction functions in new file h264pred_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-06-11 17:10:41 +02:00
Shivraj Patil b87dc70c65 avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for AVC chroma mc functions
s patch adds MSA (MIPS-SIMD-Arch) optimizations for AVC chroma mc functions in new file h264chroma_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-06-11 12:24:02 +02:00
Shivraj Patil d6d98237ed avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for HEVC intra prediction functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for HEVC intra predition functions in new file hevcpred_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-06-10 13:53:03 +02:00
Shivraj Patil 271195f85b avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for HEVC loop filter and sao functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for HEVC loop filter and sao functions in new file hevc_lpf_sao_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h

In this patch, in comparision with previous patch, duplicated c functions are removed.

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-06-10 13:14:50 +02:00
Shivraj Patil a34d902325 avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for HEVC idct functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for HEVC idct functions in new file hevc_idct_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-06-04 18:39:53 +02:00
Shivraj Patil aef34ab950 avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for HEVC uni mc epel functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for HEVC uni mc epel functions.
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-06-03 21:30:21 +02:00
Shivraj Patil ce1761db19 avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for HEVC uniw mc functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for HEVC uniw mc functions (qpel as well as epel) in new file hevc_mc_uniw_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-06-03 13:46:47 +02:00
Shivraj Patil aede1a1a60 avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for HEVC bi mc functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for HEVC bi mc functions (qpel as well as epel) in new file hevc_mc_bi_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
Adds HEVC specific macros (needed for this patch) in libavcodec/mips/hevc_macros_msa.h

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-06-02 01:54:51 +02:00
Shivraj Patil 10b77fbf0d avcodec/mips: Split uni mc optimizations to new file
This patch moves HEVC code of uni mc cases to new file hevc_mc_uni_msa.c.
(There are total 5 sub-modules of HEVC mc functions, if we add all these modules in one single file, its size would be huge (~750k) & difficult to maintain, so splitting it in multiple files)
This patch also adds new HEVC header file libavcodec/mips/hevc_macros_msa.h

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-05-28 16:51:38 +02:00
Shivraj Patil 02a4991230 avutil/mips: Restructure of generic macros
This patch includes restructuring of existing macros and addition of more generic macros.

This change was necessary to avoid repeated review comments in remaining patches which we were about to submit.
Also this patch reduces number of code lines due to maximum use of generic macros, allows better code alignment & readability etc.

These modifications in commonly used .libavutil/mips/generic_macros_msa.h. impacts the already accepted code, hence re-submitting it in 2/4,3/4 & 4/4.
Overall, this patch set is just upgrading the code with styling changes and will bring it in sync with MIPS-SIMD optimized latest codebase at our end.

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-05-28 11:34:52 +02:00
Shivraj Patil 7174df44fe avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for HEVC uni copy, uni horizontal and uni vertical mc functions
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Reviewed-by: Nedeljko Babic <Nedeljko.Babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-05-07 16:47:34 +02:00
Shivraj Patil 02001ada5c avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for H264 lpf and weight/biweight functions
Reviewed-by: Nedeljko Babic <Nedeljko.Babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-05-01 04:19:18 +02:00
Shivraj Patil 97f074f134 avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for HEVC copy and hv mc functions
Incorporated review comment.
Removed "__" from volatile.

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Reviewed-by: Nedeljko Babic <Nedeljko.Babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-04-24 17:12:04 +02:00
Michael Niedermayer 42d6d249b0 avutil/mips/generic_macros_msa: volatile doesnt need __
Reviewed-by: Nedeljko Babic <Nedeljko.Babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-04-20 14:39:50 +02:00
Shivraj Patil 4efc0e6451 avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for HEVC horizontal and vertical mc functions
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Reviewed-by: Nedeljko Babic <Nedeljko.Babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-04-17 17:39:32 +02:00