This fixes the same overflow as in the RGB48/16-bit YUV scaling;
some filters can overflow both negatively and positively (e.g.
spline/lanczos), so we bias a signed integer so it's "half signed"
and "half unsigned", and can cover overflows in both directions
while maintaining full 31-bit depth.
Signed-off-by: Mans Rullgard <mans@mansr.com>
We're shifting individual components (8-bit, unsigned) left by 24,
so making them unsigned should give the same results without the
overflow.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
For certain types of filters where the intermediate sum of coefficients
can go above the fixed-point equivalent of 1.0 in the middle of a filter,
the sum of a 31-bit calculation can overflow in both directions and can
thus not be represented in a 32-bit signed or unsigned integer. To work
around this, we subtract 0x40000000 from a signed integer base, so that
we're halfway signed/unsigned, which makes it fit even if it overflows.
After the filter finishes, we add the scaled bias back after a shift.
We use the same trick for 16-bit bpc YUV output routines.
Signed-off-by: Mans Rullgard <mans@mansr.com>
As old bits are shifted out of the accumulator, they cause signed
overflows when they reach the end. Making the variable unsigned fixes
this.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This allows using more specific implementations for chroma/luma, e.g.
we can make assumptions on filterSize being constant, thus avoiding
that test at runtime.
We operated on 31-bits, but with e.g. lanczos scaling, values can
add up to beyond 0x80000000, thus leading to output of zeroes. Drop
one bit of precision fixes this.
Remove unused variables "flags" and "dstFormat" in yuv2packed1,
merge source rows per plane for yuv2packed[12], and make every
source argument int16_t (some where invalidly set to uint16_t).
This prevents stack pollution and is part of the Great Evil Plan
to simplify swscale.
This is part of the Great Evil Plan to simplify swscale. Note that
you'll see some code duplication between the output functions for
different RGB variants, and even between packed-YUV and RGB
variants. This is intentional because it improves readability.
Inline functions are easier to read, maintain, modify and test,
which justifies the slightly increased source size. This patch
also adds support for non-native endianness RGB15/16 and fixes
isSupportedOutput() to no longer claim that we support writing
non-native RGB565/555/444.
Remove inline keyword from functions that are never inlined.
Use av_always_inline for functions that should be force-inlined
for performance reasons. Use av_cold for init functions.