Commit Graph

48 Commits

Author SHA1 Message Date
Wenbin Chen f4e0664fd1 libavfi/dnn: add LibTorch as one of DNN backend
PyTorch is an open source machine learning framework that accelerates
the path from research prototyping to production deployment. Official
website: https://pytorch.org/. We call the C++ library of PyTorch as
LibTorch, the same below.

To build FFmpeg with LibTorch, please take following steps as
reference:
1. download LibTorch C++ library in
 https://pytorch.org/get-started/locally/,
please select C++/Java for language, and other options as your need.
Please download cxx11 ABI version:
 (libtorch-cxx11-abi-shared-with-deps-*.zip).
2. unzip the file to your own dir, with command
unzip libtorch-shared-with-deps-latest.zip -d your_dir
3. export libtorch_root/libtorch/include and
libtorch_root/libtorch/include/torch/csrc/api/include to $PATH
export libtorch_root/libtorch/lib/ to $LD_LIBRARY_PATH
4. config FFmpeg with ../configure --enable-libtorch \
 --extra-cflag=-I/libtorch_root/libtorch/include \
 --extra-cflag=-I/libtorch_root/libtorch/include/torch/csrc/api/include \
 --extra-ldflags=-L/libtorch_root/libtorch/lib/
5. make

To run FFmpeg DNN inference with LibTorch backend:
./ffmpeg -i input.jpg -vf \
dnn_processing=dnn_backend=torch:model=LibTorch_model.pt -y output.jpg

The LibTorch_model.pt can be generated by Python with torch.jit.script()
api. https://pytorch.org/tutorials/advanced/cpp_export.html. This is
pytorch official guide about how to convert and load torchscript model.
Please note, torch.jit.trace() is not recommanded, since it does
not support ambiguous input size.

Signed-off-by: Ting Fu <ting.fu@intel.com>
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
2024-03-19 14:48:58 +08:00
Anton Khirnov 1e7d2007c3 all: use designated initializers for AVOption.unit
Makes it robust against adding fields before it, which will be useful in
following commits.

Majority of the patch generated by the following Coccinelle script:

@@
typedef AVOption;
identifier arr_name;
initializer list il;
initializer list[8] il1;
expression tail;
@@
AVOption arr_name[] = { il, { il1,
- tail
+ .unit = tail
}, ...  };

with some manual changes, as the script:
* has trouble with options defined inside macros
* sometimes does not handle options under an #else branch
* sometimes swallows whitespace
2024-02-14 14:53:41 +01:00
Wenbin Chen 3de38b9da5 libavfilter/dnn_interface: use dims to represent shapes
For detect and classify output, width and height make no sence, so
change width, height to dims to represent the shape of tensor. Use
layout and dims to get width, height and channel.

Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
2024-01-28 11:18:06 +08:00
Andreas Rheinhardt 19ffa2ff2d avfilter: Remove unnecessary formats.h inclusions
A filter needs formats.h iff it uses FILTER_QUERY_FUNC();
since lots of filters have been switched to use something
else than FILTER_QUERY_FUNC, they don't need it any more,
but removing this header has been forgotten.
This commit does this; files with formats.h inclusion went down
from 304 to 139 here (it were 449 before the preceding commit).

While just at it, also improve the other headers a bit.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-08-07 09:21:13 +02:00
Andreas Rheinhardt 6d15643173 avfilter/internal: Don't include video.h
internal.h does not depend on video.h (and should not depend on it)
and therefore should not include video.h at all; instead all users
of video.h should include it directly.

Doing so also avoids unnecessary video.h inclusions in files that
don't need it, like most audio filters.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-08-07 09:21:13 +02:00
Zhao Zhili e67af9e7f7 avfilter/vf_dnn_processing: replace magic number by enum value
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2023-06-08 10:50:23 +08:00
Ting Fu 78f95f1088 lavfi/dnn: Remove DNN native backend
According to discussion in
https://etherpad.mit.edu/p/FF_dev_meeting_20221202 and the proposal in
http://ffmpeg.org/pipermail/ffmpeg-devel/2022-December/304534.html,
the DNN native backend should be removed at first step.
All the DNN native backend related codes are deleted.

Signed-off-by: Ting Fu <ting.fu@intel.com>
2023-04-28 11:07:41 +08:00
Ting Fu a9fb141719 lavfi/dnn: Modified DNN native backend related tools and docs.
Will remove native backend, so change the default backend in filters,
and also remove the python scripts which generate native model file.

Signed-off-by: Ting Fu <ting.fu@intel.com>
2023-04-28 11:07:41 +08:00
Cosmin Stejerean 3b375a0c5c lavfi/vf_dnn_processing.c: Fix missing AV_PIX_FMT_GRAY8
Has been removed by mistake in 2003e32f62, readd it to the switch cases.

Signed-off-by: Thilo Borgmann <thilo.borgmann@mail.de>
2022-11-15 13:42:58 +01:00
Shubhanshu Saxena d0a999a0ab libavfilter: Remove DNNReturnType from DNN Module
This patch removes all occurences of DNNReturnType from the DNN module.
This commit replaces DNN_SUCCESS by 0 (essentially the same), so the
functions with DNNReturnType now return 0 in case of success, the negative
values otherwise.

Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
2022-03-12 15:10:28 +08:00
Shubhanshu Saxena e5ce6a6070 libavfilter: Prepare to handle specific error codes in DNN Filters
This commit prepares the filter side to handle specific error codes
from the DNN backends instead of current DNN_ERROR.

Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
2022-03-12 15:10:28 +08:00
Andreas Rheinhardt 31a373ce71 avfilter: Reindentation after query_formats changes
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-10-05 18:58:29 +02:00
Andreas Rheinhardt a341c85c84 avfilter/vf_dnn_processing: Use formats list instead of query function
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-10-05 18:58:28 +02:00
Andreas Rheinhardt b4f5201967 avfilter: Replace query_formats callback with union of list and callback
If one looks at the many query_formats callbacks in existence,
one will immediately recognize that there is one type of default
callback for video and a slightly different default callback for
audio: It is "return ff_set_common_formats_from_list(ctx, pix_fmts);"
for video with a filter-specific pix_fmts list. For audio, it is
the same with a filter-specific sample_fmts list together with
ff_set_common_all_samplerates() and ff_set_common_all_channel_counts().

This commit allows to remove the boilerplate query_formats callbacks
by replacing said callback with a union consisting the old callback
and pointers for pixel and sample format arrays. For the not uncommon
case in which these lists only contain a single entry (besides the
sentinel) enum AVPixelFormat and enum AVSampleFormat fields are also
added to the union to store them directly in the AVFilter,
thereby avoiding a relocation.

The state of said union will be contained in a new, dedicated AVFilter
field (the nb_inputs and nb_outputs fields have been shrunk to uint8_t
in order to create a hole for this new field; this is no problem, as
the maximum of all the nb_inputs is four; for nb_outputs it is only
two).

The state's default value coincides with the earlier default of
query_formats being unset, namely that the filter accepts all formats
(and also sample rates and channel counts/layouts for audio)
provided that these properties agree coincide for all inputs and
outputs.

By using different union members for audio and video filters
the type-unsafety of using the same functions for audio and video
lists will furthermore be more confined to formats.c than before.

When the new fields are used, they will also avoid allocations:
Currently something nearly equivalent to ff_default_query_formats()
is called after every successful call to a query_formats callback;
yet in the common case that the newly allocated AVFilterFormats
are not used at all (namely if there are no free links) these newly
allocated AVFilterFormats are freed again without ever being used.
Filters no longer using the callback will not exhibit this any more.

Reviewed-by: Paul B Mahol <onemda@gmail.com>
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-10-05 17:48:25 +02:00
Shubhanshu Saxena 70b4dca054 libavfilter: Remove synchronous functions from DNN filters
This commit removes the unused sync mode specific code from the DNN
filters since the sync and async mode are now unified from the
filters' perspective.

Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
2021-08-28 16:19:07 +08:00
Shubhanshu Saxena 60b4d07cf6 libavfilter: Unify Execution Modes in DNN Filters
This commit unifies the async and sync mode from the DNN filters'
perspective. As of this commit, the Native backend only supports
synchronous execution mode.

Now the user can switch between async and sync mode by using the
'async' option in the backend_configs. The values can be 1 for
async and 0 for sync mode of execution.

This commit affects the following filters:
1. vf_dnn_classify
2. vf_dnn_detect
3. vf_dnn_processing
4. vf_sr
5. vf_derain

This commit also updates the filters vf_dnn_detect and vf_dnn_classify
to send only the input frame and send NULL as output frame instead of
input frame to the DNN backends.

Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
2021-08-28 16:19:07 +08:00
Andreas Rheinhardt 8be701d9f7 avfilter/avfilter: Add numbers of (in|out)pads directly to AVFilter
Up until now, an AVFilter's lists of input and output AVFilterPads
were terminated by a sentinel and the only way to get the length
of these lists was by using avfilter_pad_count(). This has two
drawbacks: first, sizeof(AVFilterPad) is not negligible
(i.e. 64B on 64bit systems); second, getting the size involves
a function call instead of just reading the data.

This commit therefore changes this. The sentinels are removed and new
private fields nb_inputs and nb_outputs are added to AVFilter that
contain the number of elements of the respective AVFilterPad array.

Given that AVFilter.(in|out)puts are the only arrays of zero-terminated
AVFilterPads an API user has access to (AVFilterContext.(in|out)put_pads
are not zero-terminated and they already have a size field) the argument
to avfilter_pad_count() is always one of these lists, so it just has to
find the filter the list belongs to and read said number. This is slower
than before, but a replacement function that just reads the internal numbers
that users are expected to switch to will be added soon; and furthermore,
avfilter_pad_count() is probably never called in hot loops anyway.

This saves about 49KiB from the binary; notice that these sentinels are
not in .bss despite being zeroed: they are in .data.rel.ro due to the
non-sentinels.

Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-08-20 12:53:58 +02:00
Andreas Rheinhardt 18ec426a86 avfilter/formats: Factor common function combinations out
Several combinations of functions happen quite often in query_format
functions; e.g. ff_set_common_formats(ctx, ff_make_format_list(sample_fmts))
is very common. This commit therefore adds functions that are equivalent
to commonly used function combinations in order to reduce code
duplication.

Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-08-13 17:36:22 +02:00
Guo, Yejun 4718d74c58 lavfi/vf_dnn_processing.c: fix CID 1460603
CID 1460603 (#1 of 1): Improper use of negative value (NEGATIVE_RETURNS)
2021-05-18 09:20:08 +08:00
Andreas Rheinhardt a04ad248a0 avfilter: Constify all AVFilters
This is possible now that the next-API is gone.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2021-04-27 11:48:05 -03:00
Guo, Yejun 76fc6879e2 dnn: add function type for model
So the backend knows the usage of model is for frame processing,
detect, classify, etc. Each function type has different behavior
in backend when handling the input/output data of the model.

Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
2021-02-18 09:59:37 +08:00
Guo, Yejun bdce636100 dnn: extract common functions used by different filters
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
2021-02-18 09:59:37 +08:00
Guo, Yejun 2d6af4a501 libavfilter/dnn: use avpriv_report_missing_feature for unsupported features
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
2021-01-22 08:28:13 +08:00
Guo, Yejun 64ea15f050 libavfilter/dnn: add batch mode for async execution
the default number of batch_size is 1

Signed-off-by: Xie, Lin <lin.xie@intel.com>
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
2021-01-15 08:59:54 +08:00
Guo, Yejun 97f520b700 dnn: fix issue when pthread is not supported
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
2020-12-31 08:31:17 +08:00
Guo, Yejun c720286ee3 vf_dnn_processing.c: add async support
Signed-off-by: Xie, Lin <lin.xie@intel.com>
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
2020-12-29 09:31:06 +08:00
Guo, Yejun 5024286465 dnn_interface: change from 'void *userdata' to 'AVFilterContext *filter_ctx'
'void *' is too flexible, since we can derive info from
AVFilterContext*, so we just unify the interface with this data
structure.

Signed-off-by: Xie, Lin <lin.xie@intel.com>
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
2020-12-29 09:31:06 +08:00
Guo, Yejun 1b64954e42 vf_dnn_processing.c: replace filter_frame with activate func
with this change, dnn_processing can use DNN async interface later.

Signed-off-by: Xie, Lin <lin.xie@intel.com>
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
2020-12-29 09:31:06 +08:00
Ting Fu 5dbabb020e dnn: add NV12 pixel format support
Signed-off-by: Ting Fu <ting.fu@intel.com>
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
2020-12-22 10:53:35 +08:00
Guo, Yejun e71d73b096 dnn: add a new interface DNNModel.get_output
for some cases (for example, super resolution), the DNN model changes
the frame size which impacts the filter behavior, so the filter needs
to know the out frame size at very beginning.

Currently, the filter reuses DNNModule.execute_model to query the
out frame size, it is not clear from interface perspective, so add
a new explict interface DNNModel.get_output for such query.
2020-09-21 21:26:56 +08:00
Guo, Yejun fce3e3e137 dnn: put DNNModel.set_input and DNNModule.execute_model together
suppose we have a detect and classify filter in the future, the
detect filter generates some bounding boxes (BBox) as AVFrame sidedata,
and the classify filter executes DNN model for each BBox. For each
BBox, we need to crop the AVFrame, copy data to DNN model input and do
the model execution. So we have to save the in_frame at DNNModel.set_input
and use it at DNNModule.execute_model, such saving is not feasible
when we support async execute_model.

This patch sets the in_frame as execution_model parameter, and so
all the information are put together within the same function for
each inference. It also makes easy to support BBox async inference.
2020-09-21 21:26:56 +08:00
Guo, Yejun 2003e32f62 dnn: change dnn interface to replace DNNData* with AVFrame*
Currently, every filter needs to provide code to transfer data from
AVFrame* to model input (DNNData*), and also from model output
(DNNData*) to AVFrame*. Actually, such transfer can be implemented
within DNN module, and so filter can focus on its own business logic.

DNN module also exports the function pointer pre_proc and post_proc
in struct DNNModel, just in case that a filter has its special logic
to transfer data between AVFrame* and DNNData*. The default implementation
within DNN module is used if the filter does not set pre/post_proc.
2020-09-21 21:26:56 +08:00
Guo, Yejun 6918e240d7 dnn: add userdata for load model parameter
the userdata will be used for the interaction between AVFrame and DNNData
2020-09-21 21:26:56 +08:00
Guo, Yejun 0f7a99e37a dnn: move output name from DNNModel.set_input_output to DNNModule.execute_model
currently, output is set both at DNNModel.set_input_output and
DNNModule.execute_model, it makes sense that the output name is
provided at model inference time so all the output info is set
at a single place.

and so DNNModel.set_input_output is renamed to DNNModel.set_input

Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
2020-08-25 09:02:59 +08:00
Guo, Yejun 0a51abe8ab dnn: add backend options when load the model
different backend might need different options for a better performance,
so, add the parameter into dnn interface, as a preparation.

Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
2020-08-12 15:43:40 +08:00
Guo, Yejun 9bcf2aa477 vf_dnn_processing.c: add dnn backend openvino
We can try with the srcnn model from sr filter.
1) get srcnn.pb model file, see filter sr
2) convert srcnn.pb into openvino model with command:
python mo_tf.py --input_model srcnn.pb --data_type=FP32 --input_shape [1,960,1440,1] --keep_shape_ops

See the script at https://github.com/openvinotoolkit/openvino/tree/master/model-optimizer
We'll see srcnn.xml and srcnn.bin at current path, copy them to the
directory where ffmpeg is.

I have also uploaded the model files at https://github.com/guoyejun/dnn_processing/tree/master/models

3) run with openvino backend:
ffmpeg -i input.jpg -vf format=yuv420p,scale=w=iw*2:h=ih*2,dnn_processing=dnn_backend=openvino:model=srcnn.xml:input=x:output=srcnn/Maximum -y srcnn.ov.jpg
(The input.jpg resolution is 720*480)

Also copy the logs on my skylake machine (4 cpus) locally with openvino backend
and tensorflow backend. just for your information.

$ time ./ffmpeg -i 480p.mp4 -vf format=yuv420p,scale=w=iw*2:h=ih*2,dnn_processing=dnn_backend=tensorflow:model=srcnn.pb:input=x:output=y -y srcnn.tf.mp4
…
frame=  343 fps=2.1 q=31.0 Lsize=    2172kB time=00:00:11.76 bitrate=1511.9kbits/s speed=0.0706x
video:1973kB audio:187kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.517637%
[aac @ 0x2f5db80] Qavg: 454.353
real    2m46.781s
user    9m48.590s
sys     0m55.290s

$ time ./ffmpeg -i 480p.mp4 -vf format=yuv420p,scale=w=iw*2:h=ih*2,dnn_processing=dnn_backend=openvino:model=srcnn.xml:input=x:output=srcnn/Maximum -y srcnn.ov.mp4
…
frame=  343 fps=4.0 q=31.0 Lsize=    2172kB time=00:00:11.76 bitrate=1511.9kbits/s speed=0.137x
video:1973kB audio:187kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.517640%
[aac @ 0x31a9040] Qavg: 454.353
real    1m25.882s
user    5m27.004s
sys     0m0.640s

Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
2020-07-02 09:56:55 +08:00
Guo, Yejun 2114c42418 avfilter/vf_dnn_processing.c: fix typo for the linesize of dnn data
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
2020-04-07 11:03:25 +08:00
Linjie Fu acc6f632b4 lavfi/vf_dnn_processing: Fix compile warning of mixed declarations and code
Signed-off-by: Linjie Fu <linjie.fu@intel.com>
Reviewed-by: Guo, Yejun <yejun.guo@intel.com>
2020-03-19 14:27:23 +08:00
Guo, Yejun e35f966853 avfilter/vf_dnn_processing.c: add frame size change support for planar yuv format
The Y channel is handled by dnn, and also resized by dnn. The UV channels
are resized with swscale.

The command to use espcn.pb (see vf_sr) looks like:
./ffmpeg -i 480p.jpg -vf format=yuv420p,dnn_processing=dnn_backend=tensorflow:model=espcn.pb:input=x:output=y -y tmp.espcn.jpg

Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Reviewed-by: Pedro Arthur <bygrandao@gmail.com>
2020-03-12 18:22:51 +08:00
Guo, Yejun bd50453894 avfilter/vf_dnn_processing.c: add planar yuv format support
Only the Y channel is handled by dnn, the UV channels are copied
without changes.

The command to use srcnn.pb (see vf_sr) looks like:
./ffmpeg -i 480p.jpg -vf format=yuv420p,scale=w=iw*2:h=ih*2,dnn_processing=dnn_backend=tensorflow:model=srcnn.pb:input=x:output=y -y srcnn.jpg

Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Reviewed-by: Pedro Arthur <bygrandao@gmail.com>
2020-03-12 18:22:39 +08:00
Guo, Yejun d86a8c056b avfilter/vf_dnn_processing.c: use swscale for uint8<->float32 convert
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Reviewed-by: Pedro Arthur <bygrandao@gmail.com>
2020-03-12 18:22:18 +08:00
Guo, Yejun 4e1ae43b17 lavfi/dnn_processing: refine code to use function av_image_copy_plane for data copy
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
2020-01-14 11:29:43 -03:00
Guo, Yejun 37d24a6c8f vf_dnn_processing: add support for more formats gray8 and grayf32
The following is a python script to halve the value of the gray
image. It demos how to setup and execute dnn model with python+tensorflow.
It also generates .pb file which will be used by ffmpeg.

import tensorflow as tf
import numpy as np
from skimage import color
from skimage import io
in_img = io.imread('input.jpg')
in_img = color.rgb2gray(in_img)
io.imsave('ori_gray.jpg', np.squeeze(in_img))
in_data = np.expand_dims(in_img, axis=0)
in_data = np.expand_dims(in_data, axis=3)
filter_data = np.array([0.5]).reshape(1,1,1,1).astype(np.float32)
filter = tf.Variable(filter_data)
x = tf.placeholder(tf.float32, shape=[1, None, None, 1], name='dnn_in')
y = tf.nn.conv2d(x, filter, strides=[1, 1, 1, 1], padding='VALID', name='dnn_out')
sess=tf.Session()
sess.run(tf.global_variables_initializer())
graph_def = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, ['dnn_out'])
tf.train.write_graph(graph_def, '.', 'halve_gray_float.pb', as_text=False)
print("halve_gray_float.pb generated, please use \
path_to_ffmpeg/tools/python/convert.py to generate halve_gray_float.model\n")
output = sess.run(y, feed_dict={x: in_data})
output = output * 255.0
output = output.astype(np.uint8)
io.imsave("out.jpg", np.squeeze(output))

To do the same thing with ffmpeg:
- generate halve_gray_float.pb with the above script
- generate halve_gray_float.model with tools/python/convert.py
- try with following commands
  ./ffmpeg -i input.jpg -vf format=grayf32,dnn_processing=model=halve_gray_float.model:input=dnn_in:output=dnn_out:dnn_backend=native out.native.png
  ./ffmpeg -i input.jpg -vf format=grayf32,dnn_processing=model=halve_gray_float.pb:input=dnn_in:output=dnn_out:dnn_backend=tensorflow out.tf.png

Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
2020-01-07 10:51:38 -03:00
Guo, Yejun 04e6f8a143 vf_dnn_processing: remove parameter 'fmt'
do not request AVFrame's format in vf_ddn_processing with 'fmt',
but to add another filter for the format.

command examples:
./ffmpeg -i input.jpg -vf format=bgr24,dnn_processing=model=halve_first_channel.model:input=dnn_in:output=dnn_out:dnn_backend=native -y out.native.png
./ffmpeg -i input.jpg -vf format=rgb24,dnn_processing=model=halve_first_channel.model:input=dnn_in:output=dnn_out:dnn_backend=native -y out.native.png

Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
2020-01-07 10:35:59 -03:00
Guo, Yejun ed9fc2e3c5 avfilter/vf_dnn_processing: refine code for better naming
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
2019-12-13 11:41:10 -03:00
leozhang c79307b7de avfilter/vf_dnn_processing: correct duplicate statement
Signed-off-by: leozhang <leozhang@qiyi.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-11-08 14:57:01 +01:00
Guo, Yejun f6e942251c avfilter/vf_dnn_processing: fix fate-source
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-11-08 14:56:38 +01:00
Guo, Yejun 4d980a8ceb avfilter/vf_dnn_processing: add a generic filter for image proccessing with dnn networks
This filter accepts all the dnn networks which do image processing.
Currently, frame with formats rgb24 and bgr24 are supported. Other
formats such as gray and YUV will be supported next. The dnn network
can accept data in float32 or uint8 format. And the dnn network can
change frame size.

The following is a python script to halve the value of the first
channel of the pixel. It demos how to setup and execute dnn model
with python+tensorflow. It also generates .pb file which will be
used by ffmpeg.

import tensorflow as tf
import numpy as np
import imageio
in_img = imageio.imread('in.bmp')
in_img = in_img.astype(np.float32)/255.0
in_data = in_img[np.newaxis, :]
filter_data = np.array([0.5, 0, 0, 0, 1., 0, 0, 0, 1.]).reshape(1,1,3,3).astype(np.float32)
filter = tf.Variable(filter_data)
x = tf.placeholder(tf.float32, shape=[1, None, None, 3], name='dnn_in')
y = tf.nn.conv2d(x, filter, strides=[1, 1, 1, 1], padding='VALID', name='dnn_out')
sess=tf.Session()
sess.run(tf.global_variables_initializer())
output = sess.run(y, feed_dict={x: in_data})
graph_def = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, ['dnn_out'])
tf.train.write_graph(graph_def, '.', 'halve_first_channel.pb', as_text=False)
output = output * 255.0
output = output.astype(np.uint8)
imageio.imsave("out.bmp", np.squeeze(output))

To do the same thing with ffmpeg:
- generate halve_first_channel.pb with the above script
- generate halve_first_channel.model with tools/python/convert.py
- try with following commands
  ./ffmpeg -i input.jpg -vf dnn_processing=model=halve_first_channel.model:input=dnn_in:output=dnn_out:fmt=rgb24:dnn_backend=native -y out.native.png
  ./ffmpeg -i input.jpg -vf dnn_processing=model=halve_first_channel.pb:input=dnn_in:output=dnn_out:fmt=rgb24:dnn_backend=tensorflow -y out.tf.png

Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
2019-11-07 15:46:00 -03:00