Commit Graph

43 Commits

Author SHA1 Message Date
Michael Niedermayer 65e33d8e23 swresample/resample_template: Add filter values in parallel
This is faster 2871 -> 2189  cycles for int16 matrixbench -> 23456hz
Fixes a integer overflow in a artificial corner case
Fixes part of 668007-media

Found-by: Matt Wolenetz <wolenetz@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-12-10 02:44:21 +01:00
Michael Niedermayer 34db650784 swresample/resample_template: Reorder operations to avoid one addition
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-12-10 02:05:17 +01:00
Muhammad Faiz b8c6e5a661 swresample: add exact_rational option
give high quality resampling
as good as with linear_interp=on
as fast as without linear_interp=on
tested visually with ffplay
ffplay -f lavfi "aevalsrc='sin(10000*t*t)', aresample=osr=48000, showcqt=gamma=5"
ffplay -f lavfi "aevalsrc='sin(10000*t*t)', aresample=osr=48000:linear_interp=on, showcqt=gamma=5"
ffplay -f lavfi "aevalsrc='sin(10000*t*t)', aresample=osr=48000:exact_rational=on, showcqt=gamma=5"

slightly speed improvement
for fair comparison with -cpuflags 0
audio.wav is ~ 1 hour 44100 stereo 16bit wav file
ffmpeg -i audio.wav -af aresample=osr=48000 -f null -
        old         new
real    13.498s     13.121s
user    13.364s     12.987s
sys      0.131s      0.129s

linear_interp=on
        old         new
real    23.035s     23.050s
user    22.907s     22.917s
sys      0.119s     0.125s

exact_rational=on
real    12.418s
user    12.298s
sys      0.114s

possibility to decrease memory usage if soft compensation is ignored

Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>
2016-06-13 12:36:01 +07:00
James Almer 43482bd1a5 swr/resample: use av_clip functions
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-04-05 22:46:40 +02:00
Michael Niedermayer 0cb95f9082 swresample/resample_template: Add () to protect the arguments of the OUT() macro
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-02-17 00:36:35 +01:00
James Almer 857cd1f33b swr: initialize only the necessary resample dsp functions
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-04 01:37:41 +02:00
James Almer 23a9edf531 Partially revert "swr: add prototypes for resample dsp functions"
Prototypes are not needed anymore now that the x86 functions don't
include resample_template.c

The DO_RESAMPLE_ONE macro is removed for that same reason as well.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-02 03:14:36 +02:00
James Almer dd2c9034b1 x86/swr: convert resample_{common, linear}_double_sse2 to yasm
Signed-off-by: James Almer <jamrial@gmail.com>

312531 -> 311528 dezicycles

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-01 17:57:36 +02:00
Ronald S. Bultje 847bb638c0 swr: convert resample_common/linear_int16_mmx2/sse2 to yasm.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-30 20:11:50 +02:00
Michael Niedermayer 418e5768c6 swresample/resample_template: move division out of loop for float/double swri_resample_linear()
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-30 04:30:10 +02:00
Michael Niedermayer c5a405c4f0 swresample/resample_template: flip order of operations in swri_resample_linear() for 32bit
Fixes integer overflow

Found-by: BBB
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-29 22:19:57 +02:00
Ronald S. Bultje faa1471ffc swr: rewrite resample_common/linear_float_sse/avx in yasm.
Linear interpolation goes from 63 (llvm) or 58 (gcc) to 48 (yasm)
cycles/sample on 64bit, or from 66 (llvm/gcc) to 52 (yasm) cycles/
sample on 32bit. Bon-linear goes from 43 (llvm) or 38 (gcc) to
32 (yasm) cycles/sample on 64bit, or from 46 (llvm) or 44 (gcc) to
38 (yasm) cycles/sample on 32bit (all testing on OSX 10.9.2, llvm
5.1 and gcc 4.8/9).

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-28 17:06:47 +02:00
Ronald S. Bultje 0dae193d3e swr: remove another forgotten division in DSP function.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-22 05:20:22 +02:00
Ronald S. Bultje cbf21628a5 swr: remove div/mod from DSP functions.
Also fix a bug with resample_compensation resetting dst_incr.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-18 14:15:52 +02:00
Ronald S. Bultje edf930472b swr: reindent.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-16 01:33:32 +02:00
James Almer 7f4dfbd080 swr: add prototypes for resample dsp functions
Should fix compilation failures with MSVC and any other compiler
without inline asm support.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-15 01:33:17 +02:00
Ronald S. Bultje 7128a35f8c swr: split out DSP functions.
DSP bits of swri_resample go into their own mini-DSP functions; DSP
init goes from a per-call branch in multiple_resample to a proper
DSP init routine; x86 bits go into x86/; swri_resample() moves out of
resample_template.c into resample.c because it's independent of DSP
code or sample type; multiple_resample() is simplified.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-14 20:21:39 +02:00
Ronald S. Bultje b785c62681 swr: handle initial negative sample index outside DSP function.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-14 14:36:18 +02:00
Ronald S. Bultje 6b9685de3a swr: remove unnecessary assignment.
I don't see dst_incr/dst_incr_frac ever being changed from their
initial value (which is the inverse of this operation), so it seems
to me that this is a no-op.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-14 04:30:53 +02:00
Ronald S. Bultje f341340552 swr: handle 64bit overflow check in multiple_resample().
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-09 15:24:51 +02:00
Ronald S. Bultje cdfd9717ed swr: move compensation_distance handling to swri_resample caller.
I think there's an off-by-one in terms of the switchpoint where we
switch from dst_incr to ideal_dst_incr, I don't think that's a massive
issue, but just be aware of that. It's probably trivial to prevent but
I don't care.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

I could not reproduce any off by 1 error, results are bit exact (michael)
2014-06-02 15:06:24 +02:00
Michael Niedermayer 2c23f87c85 swr/resample_template: prevent end_index from overflowing and add check for delta_frac overflow
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-02 00:57:08 +02:00
Ronald S. Bultje 9b53853756 Rewrite main resampling loop (common and linear).
This removes a branch at a performance-sensitive point (in the middle
of the loop). In fate-swr-resample-s32p-8000-2626, this makes the code
about 10% faster. It also simplifies the loops, allowing us to rewrite
it in yasm at some later point.

The compensation_distance != 0 code and index < 0 code are still kind
of hairy. For compensation_distance != 0, this should likely be handled
in the caller, so that it calls swri_resample twice (once until the
dst_incr switch-point, and once with the remainder of the samples). For
index < 0, the code should probably be rewritten to break out of the
loop once sample_index >= 0, and then resume (e.g. as a tail-call) to
the common or linear resampling loops.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-02 00:47:54 +02:00
James Almer a9bf713d35 swresample: add swri_resample_float_avx
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-16 05:27:03 +02:00
James Almer cdac3ab59f swresample: add swri_resample_double_sse2
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-25 16:46:07 +02:00
Michael Niedermayer 2b58c9c945 swresample/resample_template: try to consider src_size more exactly
This should avoid slight differences in the output causes by input
size alignment differences between archs

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-15 06:35:35 +02:00
Michael Niedermayer 5e379cd3ee swresample/resample: simplify index/consumed calculation for the filter = 1 case
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-14 02:23:04 +02:00
Michael Niedermayer 6c8ee74af2 swresample/resample: Fix fractional part of index in the filter_size = 1 filters = 1 case
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-14 02:22:17 +02:00
James Almer 63dbba655e swresample/resample: sse float linear interpolation
About two times faster

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-24 02:34:02 +01:00
James Almer fa25c4c400 swresample/resample: mmx2/sse2 int16 linear interpolation
About three times faster

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-24 02:33:16 +01:00
James Almer 32291ba6ea swresample: add swri_resample_float_sse
At least two times faster than the C version.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-20 06:01:06 +01:00
James Almer 3d48cbc56c swresample: reuse COMMON_CORE asm where possible
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-18 16:08:34 +01:00
James Almer 7c8bf09edd swresample: change COMMON_CORE_INT16 asm from SSSE3 to SSE2
pshuf+paddd is slightly faster than phaddd.
The real gain is in pre-ssse3 processors like AMD K8 and K10, which get
a big boost in performance compared to the mmxext version

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-18 15:00:50 +01:00
Michael Niedermayer b8c55590d5 swr/resample: fix integer overflow, add missing cast
The effects of this are limited to numeric errors in the output

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-02-04 04:05:59 +01:00
Michael Niedermayer b6a7f66f93 resample: remove disabled debug code
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-06 02:51:26 +01:00
Clément Bœsch 8ea8833979 swr/resample: move templating parameters to template itself.
It has various benefits such as allowing some refactoring, clarifying
the code in the inclusion part, and making the template understandable
in standalone.

This commit is based on the templating method used by Justin Ruggles for
libavresample.
2012-11-15 21:24:49 +01:00
Michael Niedermayer d53f447130 swr: move if() block into the only branch where it can be true.
This should make the code a tiny tiny bit faster.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-15 12:33:40 +01:00
Michael Niedermayer 17da2d9eee swr: reorder/redesign operations to avoid integer overflow.
This fixes a out of array read.

Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-15 12:33:40 +01:00
Michael Niedermayer 4ccf6e3971 swr: MMX2 & SSSE3 int16 resample core
about 4 times faster

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-28 00:36:27 +02:00
Michael Niedermayer 0c142e4cda swr: introduce filter_alloc in preparation of SIMD resample optimisations
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-19 03:09:24 +02:00
Michael Niedermayer 80e857c967 swr/resample: optimize C code for the most common case
15% speedup

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-19 03:09:24 +02:00
Michael Niedermayer 6e6dd9995b resample_template: use av_assert
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-06 20:08:57 +02:00
Michael Niedermayer 7f1ae79d38 swr: support float & int32 in the resampler
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-04-10 13:18:49 +02:00