Discussion:
[PATCH 0/5] VAAPI support infrastructure, encoders
(too old to reply)
Mark Thompson
2016-01-17 17:33:31 UTC
Permalink
Hi all,

This patch series contains the following:

* Some VAAPI infrastructure in libavutil. Initialisation code, along
with routines to handle mapping and copying to/from VA surfaces.
Doesn't really fit as a public API, but has to go there because it's
used by the parts in both libavfilter and libavcodec.

* VAAPI decode hwaccel helper. This allows use of the VAAPI hwaccel
with ffmpeg itself, rather than only through third-party use of libavcodec.

* VAAPI H.264 encoder. This is mostly a token effort to see things
working - I moved on to the H.265 encoder once it worked at all. The
other patch offered on this list recently (the one that looks
suspiciously like an unacknowleged copy of libva
test/encode/h264encode.c) would certainly offer better quality for now,
though it lacks the integration to use the video processing at the same
time.

* VAAPI H.265 encoder. This is a simple working encoder: it only
supports IDR and P frames at fixed QP so far, so the output is not very
good. I am intending to work on this to make it better, see below.

* VAAPI conversion filter. It exists to allow sensible transforms
(scale, colour-convert from RGB) using hardware in front of the encoder,
and is currently useless for anything else because it can only output
VAAPI surfaces. Adding more outputs would be straightforward, though
not very useful without being able to feed it VAAPI surfaces as input
from a decoder (see below). Another improvement would be to support
more available VPP filters - deinterlacing, other colour transforms, etc.

All of this code was written by me alone. Some inspiration but no code
was taken from libva examples and from libyami (mostly around what codec
options I am allowed to set).

Tested with Intel Quick Sync gen 7 (Haswell/4500U), gen 8
(Braswell/N3700) and gen 9 (Skylake/6300), all with libva 1.6.2 and
Linux 4.3.3 (current Debian stretch/testing).


Problems:

* The bits in libavutil don't really make a very good public API. Needs
documentation if it does become one. "AVVAAPI" is a terrible namespace
prefix to write everywhere. There should probably be a common way to
select the default VAAPI connection (top-level option on ffmpeg itself?).

* The configure tests need to make a more intelligent decision of when
to enable things. (H.265 encode is not present before libva 1.6.0, say.)

* There may well be memory leaks. I don't fully understand how the
reference counting and auxiliary buffers in AVFrames work, so it would
be very helpful if someone familiar with them could have a look at it
(and suggest a better method which I can implement, if appropriate).

* Timestamps do something funny - I pass the pts values through in what
I think is the correct way, but without '-vsync 0' it perpetually
complains that they are somehow wrong ("Past duration ... too large").
Maybe some of the clocks aren't running at the same rate? Someone
familiar with this can probably instantly explain what I've done wrong.

* The colour conversion isn't quite working as I would hope - it seems
to saturate some colours? This is probably fixable by being clearer
about the colour spaces involved.

* The encoders aren't very good.


Things I intend to work on:

* Clean this up to get it integrated.

* Improve the H.265 encoder: B frames and some sort of GOP
configuration, along with better bitrate control. Not sure what would
be useful to do after that; suggestions welcome.

* Investigate the various copies to/from VAAPI surfaces. It should be
possible to improve this, at least in some cases.


Examples:

./ffmpeg -hwaccel vaapi -i in.mp4 -f null /dev/null

Decode input using the vaapi hwaccel, throw the output away.

./ffmpeg -vsync 0 -f x11grab -s 1920x1080 -i :0.0+0,0 -an -c:v
vaapi_hevc -qp 18 -idr_interval 120 out.mp4

Capture the X11 screen and encode it to H.265 at low QP. s/hevc/h264/
to get H.264 output instead, which will work on non-Skylake. (Remove
'-vsync 0' to see timestamp problem mentioned above.)

./ffmpeg -vsync 0 -f x11grab -s 1920x1080 -i :0.0+0,0 -an -vf vaapi_conv
-c:v vaapi_hevc -qp 18 -idr_interval 120 out.mp4

Same as previous, but with the colour-conversion from RGB to NV12 now
happening in hardware using the vaapi_conv filter (VAAPI surfaces are
passed between the filter and the encoder).

./ffmpeg -vsync 0 -hwaccel vaapi -i in.mp4 -vf vaapi_conv=size=1280x720
-an -c:v vaapi_hevc -qp 30 -idr_interval 120 out.mp4

Decode input using the vaapi hwaccel, scale to 1280x720 with vaapi_conv
and encode with H.265.

This works, but unfortunately not entirely in hardware. The hwaccel
setup very much wants a normal output from the hwaccel helper, so you
end up copying the surface into a separate buffer, then back into a
different surface on the filter input. I'm not sure how to fix this
without hacking the decoder side horribly (it needs to select the VAAPI
output format very early, before the filter is set up).


Other random notes:

H.265/HEVC naming: what should I be using here? Existing practice isn't
very clear: it seems to be mostly H.264 (ITU naming) and HEVC (ISO/MPEG
naming), but with some instances of the other one in each case. I
mildly prefer H.265, but I've used hevc in all variable names to try to
stay consistent with (some of the) existing practice. It would be easy
to change if there is consensus the other way.

A warning to anyone playing with the H.265 encoder on Skylake: the
drivers are not robust to errors! It is not difficult to hang your GPU
if you fiddle with the low-level parameters. It did always recover for
me (the kernel driver spots the hang and resets everything after a few
seconds), but that sounds like undefined behaviour so YMMV. (Easy
reproduction case if anyone feels like debugging some drivers: set
log2_diff_max_min_luma_coding_block_size to 3 rather than 2.)


All comments welcome.

Thanks,

- Mark
Mark Thompson
2016-01-17 17:34:55 UTC
Permalink
From 2442c1aca8778167c2e60a34d03ed452737f1366 Mon Sep 17 00:00:00 2001
From: Mark Thompson <***@jkqxz.net>
Date: Sun, 17 Jan 2016 15:48:54 +0000
Subject: [PATCH 1/5] libavutil: Some VAAPI infrastructure

---
configure | 4 +
libavutil/Makefile | 1 +
libavutil/vaapi.c | 732
+++++++++++++++++++++++++++++++++++++++++++++++++++++
libavutil/vaapi.h | 116 +++++++++
4 files changed, 853 insertions(+)
create mode 100644 libavutil/vaapi.c
create mode 100644 libavutil/vaapi.h

diff --git a/configure b/configure
index 7cef6f5..1c77015 100755
--- a/configure
+++ b/configure
@@ -5739,6 +5739,10 @@ enabled vaapi && enabled xlib &&
check_lib2 "va/va.h va/va_x11.h" vaGetDisplay -lva -lva-x11 &&
enable vaapi_x11

+enabled vaapi &&
+ check_lib2 "va/va.h va/va_drm.h" vaGetDisplayDRM -lva -lva-drm &&
+ enable vaapi_drm
+
enabled vdpau &&
check_cpp_condition vdpau/vdpau.h "defined
VDP_DECODER_PROFILE_MPEG4_PART2_ASP" ||
disable vdpau
diff --git a/libavutil/Makefile b/libavutil/Makefile
index bf8c713..8025f9f 100644
--- a/libavutil/Makefile
+++ b/libavutil/Makefile
@@ -146,6 +146,7 @@ OBJS-$(!HAVE_ATOMICS_NATIVE) += atomic.o
\

OBJS-$(CONFIG_LZO) += lzo.o
OBJS-$(CONFIG_OPENCL) += opencl.o opencl_internal.o
+OBJS-$(CONFIG_VAAPI) += vaapi.o

OBJS += $(COMPAT_OBJS:%=../compat/%)

diff --git a/libavutil/vaapi.c b/libavutil/vaapi.c
new file mode 100644
index 0000000..cf0c790
--- /dev/null
+++ b/libavutil/vaapi.c
@@ -0,0 +1,732 @@
+/*
+ * VAAPI helper functions.
+ *
+ * Copyright (C) 2016 Mark Thompson <***@jkqxz.net>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
02110-1301 USA
+ */
+
+#include <string.h>
+
+#include <unistd.h>
+#include <fcntl.h>
+
+#include "vaapi.h"
+
+#include <va/va_x11.h>
+#include <va/va_drm.h>
+
+#include "avassert.h"
+#include "imgutils.h"
+#include "pixfmt.h"
+
+
+static const AVClass vaapi_connection_class = {
+ .class_name = "VAAPI/connection",
+ .item_name = av_default_item_name,
+ .version = LIBAVUTIL_VERSION_INT,
+};
+
+static const AVClass vaapi_pipeline_class = {
+ .class_name = "VAAPI/pipeline",
+ .item_name = av_default_item_name,
+ .version = LIBAVUTIL_VERSION_INT,
+};
+
+typedef struct AVVAAPIConnection {
+ const AVClass *class;
+
+ char *device_string;
+ int refcount;
+ struct AVVAAPIInstance *next;
+
+ VADisplay display;
+ int initialised;
+ int version_major, version_minor;
+
+ enum {
+ AV_VAAPI_CONNECTION_NONE = 0,
+ AV_VAAPI_CONNECTION_DRM,
+ AV_VAAPI_CONNECTION_X11,
+ /* ?
+ AV_VAAPI_CONNECTION_GLX,
+ AV_VAAPI_CONNECTION_WAYLAND,
+ */
+ } connection_type;
+ union {
+ void *x11_display;
+ int drm_fd;
+ };
+} AVVAAPIConnection;
+
+static int av_vaapi_connection_uninit(AVVAAPIConnection *ctx)
+{
+ if(ctx->initialised) {
+ vaTerminate(ctx->display);
+ ctx->display = 0;
+ ctx->initialised = 0;
+ }
+
+ switch(ctx->connection_type) {
+
+ case AV_VAAPI_CONNECTION_DRM:
+ if(ctx->drm_fd >= 0) {
+ close(ctx->drm_fd);
+ ctx->drm_fd = -1;
+ }
+ break;
+
+ case AV_VAAPI_CONNECTION_X11:
+ if(ctx->x11_display) {
+ XCloseDisplay(ctx->x11_display);
+ ctx->x11_display = 0;
+ }
+ break;
+
+ }
+
+ return 0;
+}
+
+static int av_vaapi_connection_init(AVVAAPIConnection *ctx, const char
*device)
+{
+ VAStatus vas;
+ int err;
+
+ ctx->class = &vaapi_connection_class;
+ if(device)
+ ctx->device_string = av_strdup(device);
+
+ // If the device name is not provided at all, assume we are in X
and can
+ // connect to the display in DISPLAY. If we do get a device name
and it
+ // begins with a type indicator, use that. Otherwise, try to guess the
+ // answer from the content of the name.
+ if(!device) {
+ ctx->connection_type = AV_VAAPI_CONNECTION_X11;
+ } else if(!strncmp(device, "drm:", 4)) {
+ ctx->connection_type = AV_VAAPI_CONNECTION_DRM;
+ device += 4;
+ } else if(!strncmp(device, "x11:", 4)) {
+ ctx->connection_type = AV_VAAPI_CONNECTION_X11;
+ device += 4;
+ } else {
+ if(strchr(device, '/')) {
+ ctx->connection_type = AV_VAAPI_CONNECTION_DRM;
+ } else if(strchr(device, ':')) {
+ ctx->connection_type = AV_VAAPI_CONNECTION_X11;
+ } else {
+ // No idea, just give up.
+ return AVERROR(EINVAL);
+ }
+ }
+
+ switch(ctx->connection_type) {
+
+ case AV_VAAPI_CONNECTION_DRM:
+ ctx->drm_fd = open(device, O_RDWR);
+ if(ctx->drm_fd < 0) {
+ av_log(ctx, AV_LOG_ERROR, "Cannot open DRM device %s.\n",
+ device);
+ err = AVERROR(errno);
+ goto fail;
+ }
+ ctx->display = vaGetDisplayDRM(ctx->drm_fd);
+ if(!ctx->display) {
+ av_log(ctx, AV_LOG_ERROR, "Cannot open the VA display (from
DRM "
+ "device %s).\n", device);
+ err = AVERROR(EINVAL);
+ goto fail;
+ }
+ break;
+
+ case AV_VAAPI_CONNECTION_X11:
+ ctx->x11_display = XOpenDisplay(device); // device might be NULL.
+ if(!ctx->x11_display) {
+ av_log(ctx, AV_LOG_ERROR, "Cannot open X11 display %s.\n",
+ XDisplayName(device));
+ err = AVERROR(ENOENT);
+ goto fail;
+ }
+ ctx->display = vaGetDisplay(ctx->x11_display);
+ if(!ctx->display) {
+ av_log(ctx, AV_LOG_ERROR, "Cannot open the VA display (from
X11 "
+ "display %s).\n", XDisplayName(device));
+ err = AVERROR(EINVAL);
+ goto fail;
+ }
+ break;
+
+ default:
+ av_assert0(0);
+ }
+
+ vas = vaInitialize(ctx->display,
+ &ctx->version_major, &ctx->version_minor);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to initialise VAAPI: %d (%s).\n",
+ vas, vaErrorStr(vas));
+ err = AVERROR(EINVAL);
+ goto fail;
+ }
+ ctx->initialised = 1;
+
+ av_log(ctx, AV_LOG_INFO, "Initialised VAAPI connection: version
%d.%d\n",
+ ctx->version_major, ctx->version_minor);
+
+ return 0;
+
+ fail:
+ av_vaapi_connection_uninit(ctx);
+ return err;
+}
+
+static AVVAAPIConnection *av_vaapi_connection_list;
+
+int av_vaapi_instance_init(AVVAAPIInstance *instance, const char *device)
+{
+ AVVAAPIConnection *ctx;
+ int err;
+
+ for(ctx = av_vaapi_connection_list; ctx; ctx = ctx->next) {
+ if((device == 0 && ctx->device_string == 0) ||
+ (device && ctx->device_string &&
+ !strcmp(device, ctx->device_string)))
+ break;
+ }
+
+ if(ctx) {
+ av_log(ctx, AV_LOG_INFO, "New VAAPI instance connected to
existing "
+ "instance (%s).\n", device ? device : "default");
+ ++ctx->refcount;
+ instance->connection = ctx;
+ instance->display = ctx->display;
+ return 0;
+ }
+
+ ctx = av_mallocz(sizeof(AVVAAPIConnection));
+ if(!ctx)
+ return AVERROR(ENOMEM);
+
+ err = av_vaapi_connection_init(ctx, device);
+ if(err)
+ return err;
+
+ instance->display = ctx->display;
+ ctx->next = av_vaapi_connection_list;
+ av_vaapi_connection_list = ctx;
+
+ av_log(ctx, AV_LOG_INFO, "New VAAPI instance (%s).\n",
+ device ? device : "default");
+
+ return 0;
+}
+
+int av_vaapi_instance_uninit(AVVAAPIInstance *instance)
+{
+ AVVAAPIConnection *ctx = instance->connection;
+
+ if(!ctx)
+ return AVERROR(EINVAL);
+
+ if(ctx->refcount <= 0) {
+ av_log(ctx, AV_LOG_ERROR, "Tried to uninit VAAPI connection with "
+ "refcount = %d < 0.\n", ctx->refcount);
+ return AVERROR(EINVAL);
+ }
+
+ --ctx->refcount;
+
+ if(ctx->refcount == 0) {
+ AVVAAPIConnection *iter, *prev;
+ prev = 0;
+ for(iter = av_vaapi_connection_list; iter;
+ prev = iter, iter = iter->next) {
+ if(iter == ctx) {
+ if(prev)
+ prev->next = ctx->next;
+ else
+ av_vaapi_connection_list = ctx->next;
+ break;
+ }
+ }
+ if(!iter) {
+ av_log(ctx, AV_LOG_WARNING, "Tried to uninit VAAPI connection "
+ "not in connection list?\n");
+ // Not fatal.
+ }
+
+ av_vaapi_connection_uninit(ctx);
+ av_free(ctx);
+ memset(instance, 0, sizeof(*instance));
+ }
+
+ return 0;
+}
+
+
+static int vaapi_create_surfaces(AVVAAPIInstance *instance,
+ AVVAAPISurfaceConfig *config,
+ AVVAAPISurface *surfaces,
+ VASurfaceID *ids)
+{
+ VAStatus vas;
+ int i;
+
+ vas = vaCreateSurfaces(instance->display, config->rt_format,
+ config->width, config->height, ids,
config->count,
+ config->attributes, config->attribute_count);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance->connection, AV_LOG_ERROR, "Failed to create "
+ "surfaces: %d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR(EINVAL);
+ }
+
+ for(i = 0; i < config->count; i++) {
+ surfaces[i].id = ids[i];
+ surfaces[i].refcount = 0;
+ surfaces[i].instance = instance;
+ surfaces[i].config = config;
+ av_log(instance->connection, AV_LOG_TRACE, "Created VA surface "
+ "%d: %#x.\n", i, surfaces[i].id);
+ }
+
+ return 0;
+}
+
+static void vaapi_destroy_surfaces(AVVAAPIInstance *instance,
+ AVVAAPISurfaceConfig *config,
+ AVVAAPISurface *surfaces,
+ VASurfaceID *ids)
+{
+ VAStatus vas;
+ int i;
+
+ for(i = 0; i < config->count; i++) {
+ av_assert0(surfaces[i].id == ids[i]);
+ if(surfaces[i].refcount > 0)
+ av_log(instance->connection, AV_LOG_WARNING, "Destroying "
+ "surface %#x which is still in use.\n", surfaces[i].id);
+ av_assert0(surfaces[i].instance == instance);
+ av_assert0(surfaces[i].config == config);
+ }
+
+ vas = vaDestroySurfaces(instance->display, ids, config->count);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to destroy surfaces: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ }
+}
+
+int av_vaapi_pipeline_init(AVVAAPIPipelineContext *ctx,
+ AVVAAPIInstance *instance,
+ AVVAAPIPipelineConfig *config,
+ AVVAAPISurfaceConfig *input,
+ AVVAAPISurfaceConfig *output)
+{
+ VAStatus vas;
+ int err;
+
+ // Currently this only supports a pipeline which actually creates
+ // output surfaces. An intra-only encoder (e.g. JPEG) won't, so
+ // some modification would be required to make that work.
+ if(!output)
+ return AVERROR(EINVAL);
+
+ memset(ctx, 0, sizeof(*ctx));
+ ctx->class = &vaapi_pipeline_class;
+
+ ctx->instance = instance;
+ ctx->config = config;
+
+ vas = vaCreateConfig(instance->display, config->profile,
+ config->entrypoint, config->attributes,
+ config->attribute_count, &ctx->config_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create pipeline
configuration: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ err = AVERROR(EINVAL);
+ goto fail_config;
+ }
+
+ if(input) {
+ ctx->input_surfaces = av_calloc(input->count,
sizeof(AVVAAPISurface));
+ if(!ctx->input_surfaces) {
+ err = AVERROR(ENOMEM);
+ goto fail_alloc_input_surfaces;
+ }
+
+ err = vaapi_create_surfaces(instance, input, ctx->input_surfaces,
+ ctx->input_surface_ids);
+ if(err)
+ goto fail_create_input_surfaces;
+ ctx->input = input;
+ } else {
+ av_log(ctx, AV_LOG_INFO, "No input surfaces.\n");
+ ctx->input = 0;
+ }
+
+ if(output) {
+ ctx->output_surfaces = av_calloc(output->count,
sizeof(AVVAAPISurface));
+ if(!ctx->output_surfaces) {
+ err = AVERROR(ENOMEM);
+ goto fail_alloc_output_surfaces;
+ }
+
+ err = vaapi_create_surfaces(instance, output, ctx->output_surfaces,
+ ctx->output_surface_ids);
+ if(err)
+ goto fail_create_output_surfaces;
+ ctx->output = output;
+ } else {
+ av_log(ctx, AV_LOG_INFO, "No output surfaces.\n");
+ ctx->output = 0;
+ }
+
+ vas = vaCreateContext(instance->display, ctx->config_id,
+ output->width, output->height,
+ VA_PROGRESSIVE,
+ ctx->output_surface_ids, output->count,
+ &ctx->context_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create pipeline context: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ err = AVERROR(EINVAL);
+ goto fail_context;
+ }
+
+ av_log(ctx, AV_LOG_INFO, "VAAPI pipeline initialised: config %#x "
+ "context %#x.\n", ctx->config_id, ctx->context_id);
+ if(input)
+ av_log(ctx, AV_LOG_INFO, " Input: %u surfaces of %ux%u.\n",
+ input->count, input->width, input->height);
+ if(output)
+ av_log(ctx, AV_LOG_INFO, " Output: %u surfaces of %ux%u.\n",
+ output->count, output->width, output->height);
+
+ return 0;
+
+ fail_context:
+ vaapi_destroy_surfaces(instance, output, ctx->output_surfaces,
+ ctx->output_surface_ids);
+ fail_create_output_surfaces:
+ av_freep(&ctx->output_surfaces);
+ fail_alloc_output_surfaces:
+ vaapi_destroy_surfaces(instance, input, ctx->input_surfaces,
+ ctx->input_surface_ids);
+ fail_create_input_surfaces:
+ av_freep(&ctx->input_surfaces);
+ fail_alloc_input_surfaces:
+ vaDestroyConfig(instance->display, ctx->config_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to destroy pipeline "
+ "configuration: %d (%s).\n", vas, vaErrorStr(vas));
+ }
+ fail_config:
+ return err;
+}
+
+int av_vaapi_pipeline_uninit(AVVAAPIPipelineContext *ctx)
+{
+ VAStatus vas;
+
+ av_assert0(ctx->instance);
+ av_assert0(ctx->config);
+
+ vas = vaDestroyContext(ctx->instance->display, ctx->context_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to destroy pipeline context: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ }
+
+ if(ctx->output) {
+ vaapi_destroy_surfaces(ctx->instance, ctx->output,
+ ctx->output_surfaces,
+ ctx->output_surface_ids);
+ av_freep(&ctx->output_surfaces);
+ }
+
+ if(ctx->input) {
+ vaapi_destroy_surfaces(ctx->instance, ctx->input,
+ ctx->input_surfaces,
+ ctx->input_surface_ids);
+ av_freep(&ctx->input_surfaces);
+ }
+
+ vaDestroyConfig(ctx->instance->display, ctx->config_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to destroy pipeline
configuration: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ }
+
+ return 0;
+}
+
+static void vaapi_codec_release_surface(void *opaque, uint8_t *data)
+{
+ AVVAAPISurface *surface = opaque;
+
+ av_assert0(surface->refcount > 0);
+ --surface->refcount;
+}
+
+static int vaapi_get_surface(AVVAAPIPipelineContext *ctx,
+ AVVAAPISurfaceConfig *config,
+ AVVAAPISurface *surfaces, AVFrame *frame)
+{
+ AVVAAPISurface *surface;
+ int i;
+
+ for(i = 0; i < config->count; i++) {
+ if(surfaces[i].refcount == 0)
+ break;
+ }
+ if(i >= config->count) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to allocate surface "
+ "(%d in use).\n", config->count);
+ return AVERROR(ENOMEM);
+ }
+ surface = &surfaces[i];
+
+ ++surface->refcount;
+ frame->data[3] = (uint8_t*)(uintptr_t)surface->id;
+ frame->buf[0] = av_buffer_create((uint8_t*)surface, 0,
+ &vaapi_codec_release_surface,
+ surface, AV_BUFFER_FLAG_READONLY);
+ if(!frame->buf[0]) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to allocate dummy buffer "
+ "for surface %#x.\n", surface->id);
+ return AVERROR(ENOMEM);
+ }
+
+ frame->format = AV_PIX_FMT_VAAPI;
+ frame->width = config->width;
+ frame->height = config->height;
+
+ return 0;
+}
+
+int av_vaapi_get_input_surface(AVVAAPIPipelineContext *ctx, AVFrame *frame)
+{
+ return vaapi_get_surface(ctx, ctx->input, ctx->input_surfaces, frame);
+}
+
+int av_vaapi_get_output_surface(AVVAAPIPipelineContext *ctx, AVFrame
*frame)
+{
+ return vaapi_get_surface(ctx, ctx->output, ctx->output_surfaces,
frame);
+}
+
+
+int av_vaapi_map_surface(AVVAAPISurface *surface, int get)
+{
+ AVVAAPIInstance *instance = surface->instance;
+ AVVAAPISurfaceConfig *config = surface->config;
+ VAStatus vas;
+ int err;
+ void *address;
+ // On current Intel drivers, derive gives you memory which is very slow
+ // to read (uncached?). It can be better for write-only cases, but for
+ // now play it safe and never use derive.
+ int derive = 0;
+
+ vas = vaSyncSurface(instance->display, surface->id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to sync surface "
+ "%#x: %d (%s).\n", surface->id, vas, vaErrorStr(vas));
+ err = AVERROR(EINVAL);
+ goto fail;
+ }
+
+ if(derive) {
+ vas = vaDeriveImage(instance->display,
+ surface->id, &surface->image);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to derive image from
surface "
+ "%#x: %d (%s).\n", surface->id, vas, vaErrorStr(vas));
+ derive = 0;
+ }
+ }
+ if(!derive) {
+ vas = vaCreateImage(instance->display,
+ &config->image_format,
+ config->width, config->height,
+ &surface->image);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to create image for
surface "
+ "%#x: %d (%s).\n", surface->id, vas, vaErrorStr(vas));
+ err = AVERROR(EINVAL);
+ goto fail;
+ }
+
+ if(get) {
+ vas = vaGetImage(instance->display,
+ surface->id, 0, 0,
+ config->width, config->height,
+ surface->image.image_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to get image for
surface "
+ "%#x: %d (%s).\n", surface->id, vas,
vaErrorStr(vas));
+ err = AVERROR(EINVAL);
+ goto fail_image;
+ }
+ }
+ }
+
+ av_assert0(surface->image.format.fourcc ==
config->image_format.fourcc);
+
+ vas = vaMapBuffer(instance->display,
+ surface->image.buf, &address);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to map image from surface "
+ "%#x: %d (%s).\n", surface->id, vas, vaErrorStr(vas));
+ err = AVERROR(EINVAL);
+ goto fail_image;
+ }
+
+ surface->mapped_address = address;
+
+ return 0;
+
+ fail_image:
+ vas = vaDestroyImage(instance->display, surface->image.image_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to destroy image for
surface "
+ "%#x: %d (%s).\n", surface->id, vas, vaErrorStr(vas));
+ }
+ fail:
+ return err;
+}
+
+int av_vaapi_unmap_surface(AVVAAPISurface *surface, int put)
+{
+ AVVAAPIInstance *instance = surface->instance;
+ AVVAAPISurfaceConfig *config = surface->config;
+ VAStatus vas;
+ int derive = 0;
+
+ surface->mapped_address = 0;
+
+ vas = vaUnmapBuffer(instance->display,
+ surface->image.buf);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to unmap image from
surface "
+ "%#x: %d (%s).\n", surface->id, vas, vaErrorStr(vas));
+ }
+
+ if(!derive && put) {
+ vas = vaPutImage(instance->display, surface->id,
+ surface->image.image_id,
+ 0, 0, config->width, config->height,
+ 0, 0, config->width, config->height);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to put image for
surface "
+ "%#x: %d (%s).\n", surface->id, vas, vaErrorStr(vas));
+ }
+ }
+
+ vas = vaDestroyImage(instance->display,
+ surface->image.image_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to destroy image for
surface "
+ "%#x: %d (%s).\n", surface->id, vas, vaErrorStr(vas));
+ }
+
+ return 0;
+}
+
+int av_vaapi_copy_to_surface(const AVFrame *f, AVVAAPISurface *surface)
+{
+ VAImage *image = &surface->image;
+ char *data = surface->mapped_address;
+ av_assert0(data);
+
+ switch(f->format) {
+
+ case AV_PIX_FMT_YUV420P:
+ av_assert0(image->format.fourcc == VA_FOURCC_YV12);
+ av_image_copy_plane(data + image->offsets[0], image->pitches[0],
+ f->data[0], f->linesize[0],
+ f->width, f->height);
+ av_image_copy_plane(data + image->offsets[1], image->pitches[1],
+ f->data[2], f->linesize[2],
+ f->width / 2, f->height / 2);
+ av_image_copy_plane(data + image->offsets[2], image->pitches[2],
+ f->data[1], f->linesize[1],
+ f->width / 2, f->height / 2);
+ break;
+
+ case AV_PIX_FMT_NV12:
+ av_assert0(image->format.fourcc == VA_FOURCC_NV12);
+ av_image_copy_plane(data + image->offsets[0], image->pitches[0],
+ f->data[0], f->linesize[0],
+ f->width, f->height);
+ av_image_copy_plane(data + image->offsets[1], image->pitches[1],
+ f->data[1], f->linesize[1],
+ f->width, f->height / 2);
+ break;
+
+ case AV_PIX_FMT_BGR0:
+ av_assert0(image->format.fourcc == VA_FOURCC_BGRX);
+ av_image_copy_plane(data + image->offsets[0], image->pitches[0],
+ f->data[0], f->linesize[0],
+ f->width * 4, f->height);
+ break;
+
+ default:
+ return AVERROR(EINVAL);
+ }
+
+ return 0;
+}
+
+int av_vaapi_copy_from_surface(AVFrame *f, AVVAAPISurface *surface)
+{
+ VAImage *image = &surface->image;
+ char *data = surface->mapped_address;
+ av_assert0(data);
+
+ switch(f->format) {
+
+ case AV_PIX_FMT_YUV420P:
+ av_assert0(image->format.fourcc == VA_FOURCC_YV12);
+ av_image_copy_plane(f->data[0], f->linesize[0],
+ data + image->offsets[0], image->pitches[0],
+ f->width, f->height);
+ // Um, apparently these are not the same way round...
+ av_image_copy_plane(f->data[2], f->linesize[2],
+ data + image->offsets[1], image->pitches[1],
+ f->width / 2, f->height / 2);
+ av_image_copy_plane(f->data[1], f->linesize[1],
+ data + image->offsets[2], image->pitches[2],
+ f->width / 2, f->height / 2);
+ break;
+
+ case AV_PIX_FMT_NV12:
+ av_assert0(image->format.fourcc == VA_FOURCC_NV12);
+ av_image_copy_plane(f->data[0], f->linesize[0],
+ data + image->offsets[0], image->pitches[0],
+ f->width, f->height);
+ av_image_copy_plane(f->data[1], f->linesize[1],
+ data + image->offsets[1], image->pitches[1],
+ f->width, f->height / 2);
+ break;
+
+ default:
+ return AVERROR(EINVAL);
+ }
+
+ return 0;
+}
diff --git a/libavutil/vaapi.h b/libavutil/vaapi.h
new file mode 100644
index 0000000..18e15ce
--- /dev/null
+++ b/libavutil/vaapi.h
@@ -0,0 +1,116 @@
+/*
+ * VAAPI helper functions.
+ *
+ * Copyright (C) 2016 Mark Thompson <***@jkqxz.net>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
02110-1301 USA
+ */
+
+#ifndef LIBAVUTIL_VAAPI_H_
+#define LIBAVUTIL_VAAPI_H_
+
+#include <va/va.h>
+
+#include "pixfmt.h"
+#include "frame.h"
+
+
+typedef struct AVVAAPIInstance {
+ VADisplay display;
+
+ void *connection;
+} AVVAAPIInstance;
+
+
+int av_vaapi_instance_init(AVVAAPIInstance *ctx, const char *device);
+int av_vaapi_instance_uninit(AVVAAPIInstance *ctx);
+
+
+#define AV_VAAPI_MAX_SURFACES 64
+
+
+typedef struct AVVAAPISurfaceConfig {
+ enum AVPixelFormat av_format;
+ unsigned int rt_format;
+ VAImageFormat image_format;
+
+ unsigned int count;
+ unsigned int width;
+ unsigned int height;
+
+ unsigned int attribute_count;
+ VASurfaceAttrib *attributes;
+} AVVAAPISurfaceConfig;
+
+typedef struct AVVAAPISurface {
+ VASurfaceID id;
+ int refcount;
+
+ VAImage image;
+ void *mapped_address;
+
+ AVVAAPIInstance *instance;
+ AVVAAPISurfaceConfig *config;
+} AVVAAPISurface;
+
+
+typedef struct AVVAAPIPipelineConfig {
+ VAProfile profile;
+ VAEntrypoint entrypoint;
+
+ unsigned int attribute_count;
+ VAConfigAttrib *attributes;
+} AVVAAPIPipelineConfig;
+
+typedef struct AVVAAPIPipelineContext {
+ const AVClass *class;
+
+ AVVAAPIInstance *instance;
+ AVVAAPIPipelineConfig *config;
+ AVVAAPISurfaceConfig *input;
+ AVVAAPISurfaceConfig *output;
+
+ VAConfigID config_id;
+ VAContextID context_id;
+
+ AVVAAPISurface *input_surfaces;
+ VASurfaceID input_surface_ids[AV_VAAPI_MAX_SURFACES];
+
+ AVVAAPISurface *output_surfaces;
+ VASurfaceID output_surface_ids[AV_VAAPI_MAX_SURFACES];
+} AVVAAPIPipelineContext;
+
+
+int av_vaapi_pipeline_init(AVVAAPIPipelineContext *ctx,
+ AVVAAPIInstance *instance,
+ AVVAAPIPipelineConfig *config,
+ AVVAAPISurfaceConfig *input,
+ AVVAAPISurfaceConfig *output);
+int av_vaapi_pipeline_uninit(AVVAAPIPipelineContext *ctx);
+
+int av_vaapi_get_input_surface(AVVAAPIPipelineContext *ctx, AVFrame
*frame);
+int av_vaapi_get_output_surface(AVVAAPIPipelineContext *ctx, AVFrame
*frame);
+
+int av_vaapi_map_surface(AVVAAPISurface *surface, int get);
+int av_vaapi_unmap_surface(AVVAAPISurface *surface, int put);
+
+
+int av_vaapi_copy_to_surface(const AVFrame *f, AVVAAPISurface *surface);
+int av_vaapi_copy_from_surface(AVFrame *f, AVVAAPISurface *surface);
+
+
+#endif /* LIBAVUTIL_VAAPI_H_ */
--
2.6.4
wm4
2016-01-17 17:53:34 UTC
Permalink
On Sun, 17 Jan 2016 17:34:55 +0000
Post by Mark Thompson
From 2442c1aca8778167c2e60a34d03ed452737f1366 Mon Sep 17 00:00:00 2001
Date: Sun, 17 Jan 2016 15:48:54 +0000
Subject: [PATCH 1/5] libavutil: Some VAAPI infrastructure
+
+static AVVAAPIConnection *av_vaapi_connection_list;
+
+int av_vaapi_instance_init(AVVAAPIInstance *instance, const char *device)
+{
+ AVVAAPIConnection *ctx;
+ int err;
+
+ for(ctx = av_vaapi_connection_list; ctx; ctx = ctx->next) {
+ if((device == 0 && ctx->device_string == 0) ||
+ (device && ctx->device_string &&
+ !strcmp(device, ctx->device_string)))
+ break;
+ }
This won't work. Neither vaapi nor your patch are thread-safe, yet you
have them as very central global mutable state.
Mark Thompson
2016-01-17 18:13:50 UTC
Permalink
Post by wm4
On Sun, 17 Jan 2016 17:34:55 +0000
Post by Mark Thompson
From 2442c1aca8778167c2e60a34d03ed452737f1366 Mon Sep 17 00:00:00 2001
Date: Sun, 17 Jan 2016 15:48:54 +0000
Subject: [PATCH 1/5] libavutil: Some VAAPI infrastructure
+
+static AVVAAPIConnection *av_vaapi_connection_list;
+
+int av_vaapi_instance_init(AVVAAPIInstance *instance, const char *device)
+{
+ AVVAAPIConnection *ctx;
+ int err;
+
+ for(ctx = av_vaapi_connection_list; ctx; ctx = ctx->next) {
+ if((device == 0 && ctx->device_string == 0) ||
+ (device && ctx->device_string &&
+ !strcmp(device, ctx->device_string)))
+ break;
+ }
This won't work. Neither vaapi nor your patch are thread-safe, yet you
have them as very central global mutable state.
True. That setup is all pretty nasty, and everything currently assumes
it happens on the same thread. Since multiple instances have to use a
common connection to libva (because they have to be able to pass
surfaces between them), this is unfortunately pretty much required.

If multithreaded use is desirable immediately then we could just have a
big lock which anything VAAPI-related must take when it wants to do
anything? (This would require changes to all existing VAAPI decoders as
well.)

- Mark
Hendrik Leppkes
2016-01-17 18:23:41 UTC
Permalink
Post by wm4
On Sun, 17 Jan 2016 17:34:55 +0000
Post by Mark Thompson
From 2442c1aca8778167c2e60a34d03ed452737f1366 Mon Sep 17 00:00:00 2001
Date: Sun, 17 Jan 2016 15:48:54 +0000
Subject: [PATCH 1/5] libavutil: Some VAAPI infrastructure
+
+static AVVAAPIConnection *av_vaapi_connection_list;
+
+int av_vaapi_instance_init(AVVAAPIInstance *instance, const char *device)
+{
+ AVVAAPIConnection *ctx;
+ int err;
+
+ for(ctx = av_vaapi_connection_list; ctx; ctx = ctx->next) {
+ if((device == 0 && ctx->device_string == 0) ||
+ (device && ctx->device_string &&
+ !strcmp(device, ctx->device_string)))
+ break;
+ }
This won't work. Neither vaapi nor your patch are thread-safe, yet you
have them as very central global mutable state.
True. That setup is all pretty nasty, and everything currently assumes it
happens on the same thread. Since multiple instances have to use a common
connection to libva (because they have to be able to pass surfaces between
them), this is unfortunately pretty much required.
If multithreaded use is desirable immediately then we could just have a big
lock which anything VAAPI-related must take when it wants to do anything?
(This would require changes to all existing VAAPI decoders as well.)
static variables (ie. global state) are undesirable as a concept entirely.
Applications that want to setup a chain with pass through should
manage the needed connection and make it available to each component
needing access to it.

- Hendrik
wm4
2016-01-17 18:46:11 UTC
Permalink
On Sun, 17 Jan 2016 18:13:50 +0000
Post by Mark Thompson
Post by wm4
On Sun, 17 Jan 2016 17:34:55 +0000
Post by Mark Thompson
From 2442c1aca8778167c2e60a34d03ed452737f1366 Mon Sep 17 00:00:00 2001
Date: Sun, 17 Jan 2016 15:48:54 +0000
Subject: [PATCH 1/5] libavutil: Some VAAPI infrastructure
+
+static AVVAAPIConnection *av_vaapi_connection_list;
+
+int av_vaapi_instance_init(AVVAAPIInstance *instance, const char *device)
+{
+ AVVAAPIConnection *ctx;
+ int err;
+
+ for(ctx = av_vaapi_connection_list; ctx; ctx = ctx->next) {
+ if((device == 0 && ctx->device_string == 0) ||
+ (device && ctx->device_string &&
+ !strcmp(device, ctx->device_string)))
+ break;
+ }
This won't work. Neither vaapi nor your patch are thread-safe, yet you
have them as very central global mutable state.
True. That setup is all pretty nasty, and everything currently assumes
it happens on the same thread. Since multiple instances have to use a
common connection to libva (because they have to be able to pass
surfaces between them), this is unfortunately pretty much required.
If multithreaded use is desirable immediately then we could just have a
big lock which anything VAAPI-related must take when it wants to do
anything? (This would require changes to all existing VAAPI decoders as
well.)
There are two issues:
1. global state in libav* which is not synchronized
2. thread-safety within

1. is is completely unacceptable, because it can trigger undefined
behavior if there is more than 1 libav* user in the same process. I'm
not really convinced that a "device string" is really reliably unique
enough that it won't be a problem across library users. (For example,
it's entirely possible enough to open 2 X11 Displays to the same X
server using the same display name.)

With 2. it's a bit more complicated. There should probably indeed be
something like a big lock around all uses of the same VADisplay, as
long as libva exhibits this problem.
Mark Thompson
2016-01-17 19:46:43 UTC
Permalink
Post by wm4
1. global state in libav* which is not synchronized
2. thread-safety within
1. is is completely unacceptable, because it can trigger undefined
behavior if there is more than 1 libav* user in the same process. I'm
not really convinced that a "device string" is really reliably unique
enough that it won't be a problem across library users. (For example,
it's entirely possible enough to open 2 X11 Displays to the same X
server using the same display name.)
Ok, I'm happy with the first part of that (and that it is fixable by a
simple lock around the connection initialisation, assuming this code
stays in libavutil).

Can you offer an example where the device strings actually create a problem?

Multiple users within the same process /must/ be given the same
connection if they ask for the same device, because we have no way to
distinguish different sets of instances which want to be able to work
together. Equally, two connections to the same device under different
names are acceptably different, because they won't have come from the
same instance set.
Post by wm4
With 2. it's a bit more complicated. There should probably indeed be
something like a big lock around all uses of the same VADisplay, as
long as libva exhibits this problem.
This is straightforward to do, if tedious.

Can you explain the ABI and API constraints on changes to existing
structures?

For the existing decoders (and their users) to work, it will require either:
(a) a global list of connections somewhere to map VADisplay to lock
or
(b) an additional member in struct vaapi_context to point to the lock.

If ABI and API compatibility is required for existing users then (b) is
out, and we have to have the global list (suitably locked).

If we can break both then the right answer is probably to pass
hwaccel_context to encoders as well, and add a similar field to
AVFilterContext to use there too.

If ABI compatibility is required but an API break is allowed then we
could do horrible things to hack (b) into working. For example, replace
the VADisplay pointer in the first member of struct vaapi_context to
instead point at a new structure which contains some magic bytes at the
start. If the magic bytes are where that pointer goes then we are using
the new API and can lock using that, and if they are not found then it
was a user-provided VADisplay and no locking is required.

- Mark


PS: I have no attachment to this piece of code (around connection
initialisation) at all; it was just required to make everything else
work. If you want to suggest a better and completely different approach
then I am happy to throw it all away and start again.
Mark Thompson
2016-01-17 19:52:12 UTC
Permalink
Post by Mark Thompson
Post by wm4
1. global state in libav* which is not synchronized
2. thread-safety within
1. is is completely unacceptable, because it can trigger undefined
behavior if there is more than 1 libav* user in the same process. I'm
not really convinced that a "device string" is really reliably unique
enough that it won't be a problem across library users. (For example,
it's entirely possible enough to open 2 X11 Displays to the same X
server using the same display name.)
Ok, I'm happy with the first part of that (and that it is fixable by a
simple lock around the connection initialisation, assuming this code
stays in libavutil).
Can you offer an example where the device strings actually create a problem?
Multiple users within the same process /must/ be given the same
connection if they ask for the same device, because we have no way to
distinguish different sets of instances which want to be able to work
together. Equally, two connections to the same device under different
names are acceptably different, because they won't have come from the
same instance set.
Right, I see the problem. The user will want to do something with the
surface they get back under the same X11 display handle. We can't call
XOpenDisplay() in that case: the user has to be able to pass their own
handle in. So we need some other way to register that connection.
Post by Mark Thompson
Post by wm4
With 2. it's a bit more complicated. There should probably indeed be
something like a big lock around all uses of the same VADisplay, as
long as libva exhibits this problem.
This is straightforward to do, if tedious.
Can you explain the ABI and API constraints on changes to existing
structures?
(a) a global list of connections somewhere to map VADisplay to lock
or
(b) an additional member in struct vaapi_context to point to the lock.
If ABI and API compatibility is required for existing users then (b) is
out, and we have to have the global list (suitably locked).
If we can break both then the right answer is probably to pass
hwaccel_context to encoders as well, and add a similar field to
AVFilterContext to use there too.
If ABI compatibility is required but an API break is allowed then we
could do horrible things to hack (b) into working. For example, replace
the VADisplay pointer in the first member of struct vaapi_context to
instead point at a new structure which contains some magic bytes at the
start. If the magic bytes are where that pointer goes then we are using
the new API and can lock using that, and if they are not found then it
was a user-provided VADisplay and no locking is required.
- Mark
PS: I have no attachment to this piece of code (around connection
initialisation) at all; it was just required to make everything else
work. If you want to suggest a better and completely different approach
then I am happy to throw it all away and start again.
Xiaolei Yu
2016-01-18 08:53:40 UTC
Permalink
Post by Mark Thompson
Post by wm4
1. global state in libav* which is not synchronized
2. thread-safety within
1. is is completely unacceptable, because it can trigger undefined
behavior if there is more than 1 libav* user in the same process. I'm
not really convinced that a "device string" is really reliably unique
enough that it won't be a problem across library users. (For example,
it's entirely possible enough to open 2 X11 Displays to the same X
server using the same display name.)
Ok, I'm happy with the first part of that (and that it is fixable by a
simple lock around the connection initialisation, assuming this code
stays in libavutil).
Can you offer an example where the device strings actually create a problem?
Multiple users within the same process /must/ be given the same
connection if they ask for the same device, because we have no way to
distinguish different sets of instances which want to be able to work
together. Equally, two connections to the same device under different
names are acceptably different, because they won't have come from the
same instance set.
Right, I see the problem. The user will want to do something with the surface they get back under the same X11 display handle. We can't call XOpenDisplay() in that case: the user has to be able to pass their own handle in. So we need some other way to register that connection.
Post by Mark Thompson
Post by wm4
With 2. it's a bit more complicated. There should probably indeed be
something like a big lock around all uses of the same VADisplay, as
long as libva exhibits this problem.
This is straightforward to do, if tedious.
Can you explain the ABI and API constraints on changes to existing
structures?
(a) a global list of connections somewhere to map VADisplay to lock
or
(b) an additional member in struct vaapi_context to point to the lock.
If ABI and API compatibility is required for existing users then (b) is
out, and we have to have the global list (suitably locked).
If we can break both then the right answer is probably to pass
hwaccel_context to encoders as well, and add a similar field to
AVFilterContext to use there too.
If ABI compatibility is required but an API break is allowed then we
could do horrible things to hack (b) into working. For example, replace
the VADisplay pointer in the first member of struct vaapi_context to
instead point at a new structure which contains some magic bytes at the
start. If the magic bytes are where that pointer goes then we are using
the new API and can lock using that, and if they are not found then it
was a user-provided VADisplay and no locking is required.
- Mark
PS: I have no attachment to this piece of code (around connection
initialisation) at all; it was just required to make everything else
work. If you want to suggest a better and completely different approach
then I am happy to throw it all away and start again.
I think you can supply VADisplay to AVCodecContext through av_opt_ptr and
leave its initialization to user.
_______________________________________________
ffmpeg-devel mailing list
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Mark Thompson
2016-01-17 17:35:36 UTC
Permalink
From 390d4fdacbc2954489f6baa44190e7bf2f2621cc Mon Sep 17 00:00:00 2001
From: Mark Thompson <***@jkqxz.net>
Date: Sun, 17 Jan 2016 15:55:32 +0000
Subject: [PATCH 2/5] ffmpeg: hwaccel helper for VAAPI decode

---
Makefile | 1 +
ffmpeg.h | 2 +
ffmpeg_opt.c | 3 +
ffmpeg_vaapi.c | 523
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 529 insertions(+)
create mode 100644 ffmpeg_vaapi.c

diff --git a/Makefile b/Makefile
index 7836a20..be1d2ca 100644
--- a/Makefile
+++ b/Makefile
@@ -36,6 +36,7 @@ OBJS-ffmpeg-$(CONFIG_VDA) += ffmpeg_videotoolbox.o
endif
OBJS-ffmpeg-$(CONFIG_VIDEOTOOLBOX) += ffmpeg_videotoolbox.o
OBJS-ffmpeg-$(CONFIG_LIBMFX) += ffmpeg_qsv.o
+OBJS-ffmpeg-$(CONFIG_VAAPI) += ffmpeg_vaapi.o
OBJS-ffserver += ffserver_config.o

TESTTOOLS = audiogen videogen rotozoom tiny_psnr tiny_ssim base64
diff --git a/ffmpeg.h b/ffmpeg.h
index 20322b0..d7313c3 100644
--- a/ffmpeg.h
+++ b/ffmpeg.h
@@ -65,6 +65,7 @@ enum HWAccelID {
HWACCEL_VDA,
HWACCEL_VIDEOTOOLBOX,
HWACCEL_QSV,
+ HWACCEL_VAAPI,
};

typedef struct HWAccel {
@@ -577,5 +578,6 @@ int vda_init(AVCodecContext *s);
int videotoolbox_init(AVCodecContext *s);
int qsv_init(AVCodecContext *s);
int qsv_transcode_init(OutputStream *ost);
+int vaapi_decode_init(AVCodecContext *s);

#endif /* FFMPEG_H */
diff --git a/ffmpeg_opt.c b/ffmpeg_opt.c
index 9b341cf..394f2cb 100644
--- a/ffmpeg_opt.c
+++ b/ffmpeg_opt.c
@@ -82,6 +82,9 @@ const HWAccel hwaccels[] = {
#if CONFIG_LIBMFX
{ "qsv", qsv_init, HWACCEL_QSV, AV_PIX_FMT_QSV },
#endif
+#if CONFIG_VAAPI
+ { "vaapi", vaapi_decode_init, HWACCEL_VAAPI, AV_PIX_FMT_VAAPI },
+#endif
{ 0 },
};

diff --git a/ffmpeg_vaapi.c b/ffmpeg_vaapi.c
new file mode 100644
index 0000000..5feb704
--- /dev/null
+++ b/ffmpeg_vaapi.c
@@ -0,0 +1,523 @@
+/*
+ * VAAPI helper for hardware-accelerated decoding.
+ *
+ * Copyright (C) 2016 Mark Thompson <***@jkqxz.net>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
02110-1301 USA
+ */
+
+#include <va/va.h>
+
+#include "ffmpeg.h"
+
+#include "libavutil/avassert.h"
+#include "libavutil/avconfig.h"
+#include "libavutil/buffer.h"
+#include "libavutil/frame.h"
+#include "libavutil/imgutils.h"
+#include "libavutil/pixfmt.h"
+#include "libavutil/vaapi.h"
+
+#include "libavcodec/vaapi.h"
+
+
+static const AVClass vaapi_class = {
+ .class_name = "VAAPI",
+ .item_name = av_default_item_name,
+ .version = LIBAVUTIL_VERSION_INT,
+};
+
+
+#define DEFAULT_SURFACES 20
+
+typedef struct VAAPIDecoderContext {
+ const AVClass *class;
+
+ AVVAAPIInstance va_instance;
+ AVVAAPIPipelineConfig config;
+ AVVAAPIPipelineContext codec;
+ AVVAAPISurfaceConfig output;
+
+ int codec_initialised;
+
+ AVFrame output_frame;
+
+ struct vaapi_context hwaccel_context;
+} VAAPIDecoderContext;
+
+
+static int vaapi_get_buffer(AVCodecContext *s, AVFrame *frame, int flags)
+{
+ InputStream *ist = s->opaque;
+ VAAPIDecoderContext *ctx = ist->hwaccel_ctx;
+
+ av_assert0(frame->format == AV_PIX_FMT_VAAPI);
+
+ return av_vaapi_get_output_surface(&ctx->codec, frame);
+}
+
+static int vaapi_retrieve_data(AVCodecContext *avctx, AVFrame *input_frame)
+{
+ InputStream *ist = avctx->opaque;
+ VAAPIDecoderContext *ctx = ist->hwaccel_ctx;
+ AVVAAPISurfaceConfig *output = &ctx->output;
+ AVVAAPISurface *surface;
+ AVFrame *output_frame;
+ int err, copying;
+
+ surface = (AVVAAPISurface*)input_frame->buf[0]->data;
+ av_log(ctx, AV_LOG_DEBUG, "Retrieve data from surface %#x (format
%#x).\n",
+ surface->id, output->av_format);
+
+ if(output->av_format == AV_PIX_FMT_VAAPI) {
+ copying = 0;
+ av_log(ctx, AV_LOG_VERBOSE, "Surface %#x retrieved without
copy.\n",
+ surface->id);
+
+ } else {
+ err = av_vaapi_map_surface(surface, 1);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to map surface %#x.",
+ surface->id);
+ goto fail;
+ }
+
+ copying = 1;
+ av_log(ctx, AV_LOG_VERBOSE, "Surface %#x mapped: image %#x data
%p.\n",
+ surface->id, surface->image.image_id,
surface->mapped_address);
+ }
+
+ // The actual frame need not fill the surface.
+ av_assert0(input_frame->width <= output->width);
+ av_assert0(input_frame->height <= output->height);
+
+ output_frame = &ctx->output_frame;
+ output_frame->width = input_frame->width;
+ output_frame->height = input_frame->height;
+ output_frame->format = output->av_format;
+
+ if(copying) {
+ err = av_frame_get_buffer(output_frame, 32);
+ if(err < 0) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to get output buffer: %d
(%s).\n",
+ err, av_err2str(err));
+ goto fail_unmap;
+ }
+
+ err = av_vaapi_copy_from_surface(output_frame, surface);
+ if(err < 0) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to copy frame data: %d
(%s).\n",
+ err, av_err2str(err));
+ goto fail_unmap;
+ }
+
+ } else {
+ // Just copy the hidden ID field.
+ output_frame->data[3] = input_frame->data[3];
+ }
+
+ err = av_frame_copy_props(output_frame, input_frame);
+ if(err < 0) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to copy frame props: %d (%s).\n",
+ err, av_err2str(err));
+ goto fail_unmap;
+ }
+
+ av_frame_unref(input_frame);
+ av_frame_move_ref(input_frame, output_frame);
+
+ fail_unmap:
+ if(copying)
+ av_vaapi_unmap_surface(surface, 0);
+ fail:
+ return err;
+}
+
+static const struct {
+ VAProfile from;
+ VAProfile to;
+} vaapi_profile_compatibility[] = {
+#define FT(f, t) { VAProfile ## f, VAProfile ## t }
+ FT(MPEG2Simple, MPEG2Main ),
+ FT(H263Baseline, MPEG4AdvancedSimple),
+ FT(MPEG4Simple, MPEG4AdvancedSimple),
+ FT(MPEG4AdvancedSimple, MPEG4Main ),
+ FT(H264ConstrainedBaseline, H264Baseline),
+ FT(H264Baseline, H264Main ), // (Not quite true.)
+ FT(H264Main, H264High ),
+ FT(VC1Simple, VC1Main ),
+ FT(VC1Main, VC1Advanced ),
+#undef FT
+};
+
+static int vaapi_find_next_compatible(VAProfile profile)
+{
+ int i;
+ for(i = 0; i < FF_ARRAY_ELEMS(vaapi_profile_compatibility); i++) {
+ if(vaapi_profile_compatibility[i].from == profile)
+ return vaapi_profile_compatibility[i].to;
+ }
+ return VAProfileNone;
+}
+
+static const struct {
+ enum AVCodecID codec_id;
+ int codec_profile;
+ VAProfile va_profile;
+} vaapi_profile_map[] = {
+#define MAP(c, p, v) { AV_CODEC_ID_ ## c, FF_PROFILE_ ## p, VAProfile
## v }
+ MAP(MPEG2VIDEO, MPEG2_SIMPLE, MPEG2Simple ),
+ MAP(MPEG2VIDEO, MPEG2_MAIN, MPEG2Main ),
+ MAP(H263, UNKNOWN, H263Baseline),
+ MAP(MPEG4, MPEG4_SIMPLE, MPEG4Simple ),
+ MAP(MPEG4, MPEG4_ADVANCED_SIMPLE,
+ MPEG4AdvancedSimple),
+ MAP(MPEG4, MPEG4_MAIN, MPEG4Main ),
+ MAP(H264, H264_CONSTRAINED_BASELINE,
+ H264ConstrainedBaseline),
+ MAP(H264, H264_BASELINE, H264Baseline),
+ MAP(H264, H264_MAIN, H264Main ),
+ MAP(H264, H264_HIGH, H264High ),
+ MAP(HEVC, HEVC_MAIN, HEVCMain ),
+ MAP(WMV3, VC1_SIMPLE, VC1Simple ),
+ MAP(WMV3, VC1_MAIN, VC1Main ),
+ MAP(WMV3, VC1_COMPLEX, VC1Advanced ),
+ MAP(WMV3, VC1_ADVANCED, VC1Advanced ),
+ MAP(VC1, VC1_SIMPLE, VC1Simple ),
+ MAP(VC1, VC1_MAIN, VC1Main ),
+ MAP(VC1, VC1_COMPLEX, VC1Advanced ),
+ MAP(VC1, VC1_ADVANCED, VC1Advanced ),
+ MAP(MJPEG, UNKNOWN, JPEGBaseline),
+ MAP(VP8, UNKNOWN, VP8Version0_3),
+ MAP(VP9, VP9_0, VP9Profile0 ),
+#undef MAP
+};
+
+static VAProfile vaapi_find_profile(const AVCodecContext *avctx)
+{
+ VAProfile result = VAProfileNone;
+ int i;
+ for(i = 0; i < FF_ARRAY_ELEMS(vaapi_profile_map); i++) {
+ if(avctx->codec_id != vaapi_profile_map[i].codec_id)
+ continue;
+ result = vaapi_profile_map[i].va_profile;
+ if(avctx->profile == vaapi_profile_map[i].codec_profile)
+ break;
+ // If there isn't an exact match, we will choose the last (highest)
+ // profile in the mapping table.
+ }
+ return result;
+}
+
+static const struct {
+ enum AVPixelFormat pix_fmt;
+ unsigned int fourcc;
+} vaapi_image_formats[] = {
+ { AV_PIX_FMT_NV12, VA_FOURCC_NV12 },
+ { AV_PIX_FMT_YUV420P, VA_FOURCC_YV12 },
+};
+
+static int vaapi_get_pix_fmt(unsigned int fourcc)
+{
+ int i;
+ for(i = 0; i < FF_ARRAY_ELEMS(vaapi_image_formats); i++)
+ if(vaapi_image_formats[i].fourcc == fourcc)
+ return vaapi_image_formats[i].pix_fmt;
+ return 0;
+}
+
+static int vaapi_build_decoder_config(VAAPIDecoderContext *ctx,
+ AVVAAPIPipelineConfig *config,
+ AVVAAPISurfaceConfig *output,
+ AVCodecContext *avctx)
+{
+ VAStatus vas;
+ int i;
+
+ memset(config, 0, sizeof(*config));
+
+ // Pick codec profile to use.
+ {
+ VAProfile best_profile, profile;
+ int profile_count;
+ VAProfile *profile_list;
+
+ best_profile = vaapi_find_profile(avctx);
+ if(best_profile == VAProfileNone) {
+ av_log(ctx, AV_LOG_ERROR, "VAAPI does not support codec %s.\n",
+ avcodec_get_name(avctx->codec_id));
+ return AVERROR(EINVAL);
+ }
+
+ profile_count = vaMaxNumProfiles(ctx->va_instance.display);
+ profile_list = av_calloc(profile_count, sizeof(VAProfile));
+
+ vas = vaQueryConfigProfiles(ctx->va_instance.display,
+ profile_list, &profile_count);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to query profiles: %d
(%s).\n",
+ vas, vaErrorStr(vas));
+ av_free(profile_list);
+ return AVERROR(EINVAL);
+ }
+
+ profile = best_profile;
+ while(profile != VAProfileNone) {
+ for(i = 0; i < profile_count; i++) {
+ if(profile_list[i] == profile)
+ break;
+ }
+ if(i < profile_count)
+ break;
+
+ av_log(ctx, AV_LOG_DEBUG, "Hardware does not support codec "
+ "profile: %s / %d -> VAProfile %d.\n",
+ avcodec_get_name(avctx->codec_id), avctx->profile,
+ profile);
+ profile = vaapi_find_next_compatible(profile);
+ }
+
+ av_free(profile_list);
+
+ if(profile == VAProfileNone) {
+ av_log(ctx, AV_LOG_ERROR, "Hardware does not support codec: "
+ "%s / %d.\n", avcodec_get_name(avctx->codec_id),
+ avctx->profile);
+ return AVERROR(EINVAL);
+ } else if(profile == best_profile) {
+ av_log(ctx, AV_LOG_INFO, "Hardware supports exact codec: "
+ "%s / %d -> VAProfile %d.\n",
+ avcodec_get_name(avctx->codec_id), avctx->profile,
+ profile);
+ } else {
+ av_log(ctx, AV_LOG_INFO, "Hardware supports compatible codec: "
+ "%s / %d -> VAProfile %d.\n",
+ avcodec_get_name(avctx->codec_id), avctx->profile,
+ profile);
+ }
+
+ config->profile = profile;
+ config->entrypoint = VAEntrypointVLD;
+ }
+
+ // Decide on the internal chroma format.
+ {
+ VAConfigAttrib attr;
+
+ // Currently the software only supports YUV420, so just make sure
+ // that the hardware we have does too.
+
+ memset(&attr, 0, sizeof(attr));
+ attr.type = VAConfigAttribRTFormat;
+ vas = vaGetConfigAttributes(ctx->va_instance.display,
config->profile,
+ VAEntrypointVLD, &attr, 1);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to fetch config attributes: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR(EINVAL);
+ }
+ if(!(attr.value & VA_RT_FORMAT_YUV420)) {
+ av_log(ctx, AV_LOG_ERROR, "Hardware does not support required "
+ "chroma format (%#x).\n", attr.value);
+ return AVERROR(EINVAL);
+ }
+
+ output->rt_format = VA_RT_FORMAT_YUV420;
+ }
+
+ // Decide on the image format.
+ if(avctx->pix_fmt == AV_PIX_FMT_VAAPI) {
+ // We are going to be passing through a VAAPI surface directly:
+ // they will stay as whatever opaque internal format for that time,
+ // and we never need to make VAImages from them.
+
+ av_log(ctx, AV_LOG_INFO, "Using VAAPI opaque output format.\n");
+
+ output->av_format = AV_PIX_FMT_VAAPI;
+ memset(&output->image_format, 0, sizeof(output->image_format));
+
+ } else {
+ int image_format_count;
+ VAImageFormat *image_format_list;
+ int pix_fmt;
+
+ // We might want to force a change to the output format here
+ // if we are intending to use VADeriveImage?
+
+ image_format_count =
vaMaxNumImageFormats(ctx->va_instance.display);
+ image_format_list = av_calloc(image_format_count,
+ sizeof(VAImageFormat));
+
+ vas = vaQueryImageFormats(ctx->va_instance.display,
image_format_list,
+ &image_format_count);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to query image formats: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR(EINVAL);
+ }
+
+ for(i = 0; i < image_format_count; i++) {
+ pix_fmt = vaapi_get_pix_fmt(image_format_list[i].fourcc);
+ if(pix_fmt == AV_PIX_FMT_NONE)
+ continue;
+ if(pix_fmt == avctx->pix_fmt)
+ break;
+ }
+ if(i < image_format_count) {
+ av_log(ctx, AV_LOG_INFO, "Using desired output format %s "
+ "(%#x).\n", av_get_pix_fmt_name(pix_fmt),
+ image_format_list[i].fourcc);
+ } else {
+ for(i = 0; i < image_format_count; i++) {
+ pix_fmt = vaapi_get_pix_fmt(image_format_list[i].fourcc);
+ if(pix_fmt != AV_PIX_FMT_NONE)
+ break;
+ }
+ if(i >= image_format_count) {
+ av_log(ctx, AV_LOG_ERROR, "No supported output format
found.\n");
+ av_free(image_format_list);
+ return AVERROR(EINVAL);
+ }
+ av_log(ctx, AV_LOG_INFO, "Using alternate output format %s "
+ "(%#x).\n", av_get_pix_fmt_name(pix_fmt),
+ image_format_list[i].fourcc);
+ }
+
+ output->av_format = pix_fmt;
+ memcpy(&output->image_format, &image_format_list[i],
+ sizeof(VAImageFormat));
+
+ av_free(image_format_list);
+ }
+
+ // Decide how many reference frames we need.
+ {
+ // We should be able to do this in a more sensible way by looking
+ // at how many reference frames the input stream requires.
+ output->count = DEFAULT_SURFACES;
+ }
+
+ // Test whether the width and height are within allowable limits.
+ {
+ // Unfortunately, we need an active codec pipeline to do this
properly
+ // using vaQuerySurfaceAttributes(). For now, just assume the
values
+ // we got passed are ok.
+ output->width = avctx->coded_width;
+ output->height = avctx->coded_height;
+ }
+
+ return 0;
+}
+
+static int vaapi_alloc_decoder_context(VAAPIDecoderContext **ctx_ptr,
const char *device)
+{
+ VAAPIDecoderContext *ctx;
+ int err;
+
+ ctx = av_mallocz(sizeof(*ctx));
+ if(!ctx)
+ return AVERROR(ENOMEM);
+
+ ctx->class = &vaapi_class;
+
+ err = av_vaapi_instance_init(&ctx->va_instance, device);
+ if(err)
+ return err;
+
+ *ctx_ptr = ctx;
+ return 0;
+}
+
+static void vaapi_decode_uninit(AVCodecContext *avctx)
+{
+ InputStream *ist = avctx->opaque;
+ VAAPIDecoderContext *ctx = ist->hwaccel_ctx;
+
+ if(ctx->codec_initialised) {
+ av_vaapi_pipeline_uninit(&ctx->codec);
+ ctx->codec_initialised = 0;
+ }
+
+ av_free(ctx);
+
+ ist->hwaccel_ctx = 0;
+ ist->hwaccel_uninit = 0;
+ ist->hwaccel_get_buffer = 0;
+ ist->hwaccel_retrieve_data = 0;
+
+ av_vaapi_instance_uninit(&ctx->va_instance);
+}
+
+int vaapi_decode_init(AVCodecContext *avctx)
+{
+ InputStream *ist = avctx->opaque;
+ VAAPIDecoderContext *ctx;
+ int err;
+
+ if(ist->hwaccel_id != HWACCEL_VAAPI)
+ return AVERROR(EINVAL);
+
+ avctx->hwaccel_context = 0;
+
+ if(ist->hwaccel_ctx) {
+ ctx = ist->hwaccel_ctx;
+ err = av_vaapi_pipeline_uninit(&ctx->codec);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Unable to reinit; failed to uninit "
+ "old codec context: %d (%s).\n", err, av_err2str(err));
+ return err;
+ }
+
+ } else {
+ err = vaapi_alloc_decoder_context(&ctx, ist->hwaccel_device);
+ if(err)
+ return err;
+ }
+
+ err = vaapi_build_decoder_config(ctx, &ctx->config, &ctx->output,
avctx);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "No supported configuration for this
codec.");
+ goto fail;
+ }
+
+ err = av_vaapi_pipeline_init(&ctx->codec, &ctx->va_instance,
+ &ctx->config, 0, &ctx->output);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to initialise codec context: "
+ "%d (%s).\n", err, av_err2str(err));
+ goto fail;
+ }
+ ctx->codec_initialised = 1;
+
+ av_log(ctx, AV_LOG_DEBUG, "VAAPI decoder (re)init complete.\n");
+
+ ist->hwaccel_ctx = ctx;
+ ist->hwaccel_uninit = vaapi_decode_uninit;
+ ist->hwaccel_get_buffer = vaapi_get_buffer;
+ ist->hwaccel_retrieve_data = vaapi_retrieve_data;
+
+ ctx->hwaccel_context.display = ctx->va_instance.display;
+ ctx->hwaccel_context.config_id = ctx->codec.config_id;
+ ctx->hwaccel_context.context_id = ctx->codec.context_id;
+ avctx->hwaccel_context = &ctx->hwaccel_context;
+
+ return 0;
+
+ fail:
+ vaapi_decode_uninit(avctx);
+ return err;
+}
--
2.6.4
Michael Niedermayer
2016-01-17 20:45:59 UTC
Permalink
On Sun, Jan 17, 2016 at 05:35:36PM +0000, Mark Thompson wrote:
[...]
Post by Mark Thompson
+static int vaapi_retrieve_data(AVCodecContext *avctx, AVFrame *input_frame)
+{
+ InputStream *ist = avctx->opaque;
+ VAAPIDecoderContext *ctx = ist->hwaccel_ctx;
+ AVVAAPISurfaceConfig *output = &ctx->output;
+ AVVAAPISurface *surface;
+ AVFrame *output_frame;
+ int err, copying;
+
+ surface = (AVVAAPISurface*)input_frame->buf[0]->data;
+ av_log(ctx, AV_LOG_DEBUG, "Retrieve data from surface %#x
(format %#x).\n",
+ surface->id, output->av_format);
+
+ if(output->av_format == AV_PIX_FMT_VAAPI) {
+ copying = 0;
+ av_log(ctx, AV_LOG_VERBOSE, "Surface %#x retrieved without
copy.\n",
+ surface->id);
+
+ } else {
+ err = av_vaapi_map_surface(surface, 1);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to map surface %#x.",
+ surface->id);
+ goto fail;
+ }
+
+ copying = 1;
+ av_log(ctx, AV_LOG_VERBOSE, "Surface %#x mapped: image %#x
data %p.\n",
+ surface->id, surface->image.image_id,
surface->mapped_address);
+ }
this patch is corrupted by some automatic word/line wrap, newlines

[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

During times of universal deceit, telling the truth becomes a
revolutionary act. -- George Orwell
Mark Thompson
2016-01-17 21:01:42 UTC
Permalink
Post by Michael Niedermayer
this patch is corrupted by some automatic word/line wrap, newlines
How annoying. My mail client renders it as expected, but apparently is
lying.

Raw patches attached instead.

Thanks,

- Mark
Mark Thompson
2016-01-17 17:36:25 UTC
Permalink
From 3a3c668ad55746e7313e6cf2b121a984ac5ca942 Mon Sep 17 00:00:00 2001
From: Mark Thompson <***@jkqxz.net>
Date: Sun, 17 Jan 2016 15:57:55 +0000
Subject: [PATCH 3/5] libavcodec: add VAAPI H.264 encoder

---
configure | 1 +
libavcodec/Makefile | 1 +
libavcodec/allcodecs.c | 1 +
libavcodec/vaapi_enc_h264.c | 944
++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 947 insertions(+)
create mode 100644 libavcodec/vaapi_enc_h264.c

diff --git a/configure b/configure
index 1c77015..a31d65e 100755
--- a/configure
+++ b/configure
@@ -2499,6 +2499,7 @@ h264_mmal_encoder_deps="mmal"
h264_qsv_hwaccel_deps="libmfx"
h264_vaapi_hwaccel_deps="vaapi"
h264_vaapi_hwaccel_select="h264_decoder"
+h264_vaapi_encoder_deps="vaapi"
h264_vda_decoder_deps="vda"
h264_vda_decoder_select="h264_decoder"
h264_vda_hwaccel_deps="vda"
diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index b9ffdb9..06b3c48 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -303,6 +303,7 @@ OBJS-$(CONFIG_H264_MMAL_DECODER) += mmaldec.o
OBJS-$(CONFIG_H264_VDA_DECODER) += vda_h264_dec.o
OBJS-$(CONFIG_H264_QSV_DECODER) += qsvdec_h2645.o
OBJS-$(CONFIG_H264_QSV_ENCODER) += qsvenc_h264.o
+OBJS-$(CONFIG_H264_VAAPI_ENCODER) += vaapi_enc_h264.o
OBJS-$(CONFIG_HAP_DECODER) += hapdec.o hap.o
OBJS-$(CONFIG_HAP_ENCODER) += hapenc.o hap.o
OBJS-$(CONFIG_HEVC_DECODER) += hevc.o hevc_mvs.o hevc_ps.o
hevc_sei.o \
diff --git a/libavcodec/allcodecs.c b/libavcodec/allcodecs.c
index 2128546..0d07087 100644
--- a/libavcodec/allcodecs.c
+++ b/libavcodec/allcodecs.c
@@ -199,6 +199,7 @@ void avcodec_register_all(void)
#if FF_API_VDPAU
REGISTER_DECODER(H264_VDPAU, h264_vdpau);
#endif
+ REGISTER_ENCODER(H264_VAAPI, h264_vaapi);
REGISTER_ENCDEC (HAP, hap);
REGISTER_DECODER(HEVC, hevc);
REGISTER_DECODER(HEVC_QSV, hevc_qsv);
diff --git a/libavcodec/vaapi_enc_h264.c b/libavcodec/vaapi_enc_h264.c
new file mode 100644
index 0000000..39c7236
--- /dev/null
+++ b/libavcodec/vaapi_enc_h264.c
@@ -0,0 +1,944 @@
+/*
+ * VAAPI H.264 encoder.
+ *
+ * Copyright (C) 2016 Mark Thompson <***@jkqxz.net>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
02110-1301 USA
+ */
+
+#include "avcodec.h"
+#include "golomb.h"
+#include "put_bits.h"
+
+#include "h264.h"
+
+#include "libavutil/opt.h"
+#include "libavutil/pixdesc.h"
+#include "libavutil/vaapi.h"
+
+#define DPB_FRAMES 16
+#define INPUT_FRAMES 2
+
+typedef struct VAAPIH264EncodeFrame {
+ AVFrame avframe;
+ VASurfaceID surface_id;
+
+ int frame_num;
+ enum {
+ FRAME_TYPE_I,
+ FRAME_TYPE_P,
+ FRAME_TYPE_B,
+ } type;
+
+ VAPictureH264 pic;
+ VAEncSliceParameterBufferH264 params;
+ VABufferID params_id;
+
+ VABufferID coded_data_id;
+
+ struct VAAPIH264EncodeFrame *refp, *refb;
+} VAAPIH264EncodeFrame;
+
+typedef struct VAAPIH264EncodeContext {
+ const AVClass *class;
+
+ AVVAAPIInstance va_instance;
+ AVVAAPIPipelineConfig va_config;
+ AVVAAPIPipelineContext va_codec;
+
+ AVVAAPISurfaceConfig input_config;
+ AVVAAPISurfaceConfig output_config;
+
+ VAProfile va_profile;
+ int level;
+ int rc_mode;
+ int width;
+ int height;
+
+ VAEncSequenceParameterBufferH264 seq_params;
+ VABufferID seq_params_id;
+
+ VAEncMiscParameterRateControl rc_params;
+ VAEncMiscParameterBuffer rc_params_buffer;
+ VABufferID rc_params_id;
+
+ VAEncPictureParameterBufferH264 pic_params;
+ VABufferID pic_params_id;
+
+ int frame_num;
+
+ VAAPIH264EncodeFrame dpb[DPB_FRAMES];
+ int current_frame;
+ int previous_frame;
+
+ struct {
+ const char *profile;
+ const char *level;
+ int qp;
+ int idr_interval;
+ } options;
+
+} VAAPIH264EncodeContext;
+
+
+static int vaapi_h264_render_packed_header(VAAPIH264EncodeContext *ctx,
int type,
+ char *data, size_t bit_len)
+{
+ VAStatus vas;
+ VABufferID id_list[2];
+ VAEncPackedHeaderParameterBuffer buffer = {
+ .type = type,
+ .bit_length = bit_len,
+ .has_emulation_bytes = 0,
+ };
+
+ vas = vaCreateBuffer(ctx->va_instance.display,
ctx->va_codec.context_id,
+ VAEncPackedHeaderParameterBufferType,
+ sizeof(&buffer), 1, &buffer, &id_list[0]);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create parameter buffer
for packed "
+ "header (type %d): %d (%s).\n", type, vas, vaErrorStr(vas));
+ return -1;
+ }
+
+ vas = vaCreateBuffer(ctx->va_instance.display,
ctx->va_codec.context_id,
+ VAEncPackedHeaderDataBufferType,
+ (bit_len + 7) / 8, 1, data, &id_list[1]);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create data buffer for
packed "
+ "header (type %d): %d (%s).\n", type, vas, vaErrorStr(vas));
+ return -1;
+ }
+
+ vas = vaRenderPicture(ctx->va_instance.display,
ctx->va_codec.context_id,
+ id_list, 2);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to render packed "
+ "header (type %d): %d (%s).\n", type, vas, vaErrorStr(vas));
+ return -1;
+ }
+
+ return 0;
+}
+
+static void vaapi_h264_write_nal_header(PutBitContext *b, int ref, int
type)
+{
+ // zero_byte
+ put_bits(b, 8, 0);
+ // start_code_prefix_one_3bytes
+ put_bits(b, 24, 1);
+ // forbidden_zero_bit
+ put_bits(b, 1, 0);
+ // nal_ref_idc
+ put_bits(b, 2, ref);
+ // nal_unit_type
+ put_bits(b, 5, type);
+}
+
+static void vaapi_h264_write_trailing_rbsp(PutBitContext *b)
+{
+ // rbsp_stop_one_bit
+ put_bits(b, 1, 1);
+ while(put_bits_count(b) & 7) {
+ // rbsp_alignment_zero_bit
+ put_bits(b, 1, 0);
+ }
+}
+
+static int vaapi_h264_render_packed_sps(VAAPIH264EncodeContext *ctx)
+{
+ PutBitContext b;
+ char tmp[256];
+ size_t len;
+
+ init_put_bits(&b, tmp, sizeof(tmp));
+
+ vaapi_h264_write_nal_header(&b, 3, NAL_SPS);
+
+ // profile_idc
+ put_bits(&b, 8, 66);
+ // constraint_set0_flag
+ put_bits(&b, 1, 0);
+ // constraint_set1_flag
+ put_bits(&b, 1, ctx->va_profile == VAProfileH264ConstrainedBaseline);
+ // constraint_set2_flag
+ put_bits(&b, 1, 0);
+ // constraint_set3_flag
+ put_bits(&b, 1, 0);
+ // constraint_set4_flag
+ put_bits(&b, 1, 0);
+ // constraint_set5_flag
+ put_bits(&b, 1, 0);
+ // reserved_zero_2bits
+ put_bits(&b, 2, 0);
+ // level_idc
+ put_bits(&b, 8, 52);
+ // seq_parameter_set_id
+ set_ue_golomb(&b, 0);
+
+ if(0) {
+ // chroma_format_idc
+ set_ue_golomb(&b, 1);
+ // bit_depth_luma_minus8
+ set_ue_golomb(&b, 0);
+ // bit_depth_chroma_minus8
+ set_ue_golomb(&b, 0);
+ // qpprime_y_zero_transform_bypass_flag
+ put_bits(&b, 1, 0);
+ // seq_scaling_matrix_present_flag
+ put_bits(&b, 1, 0);
+ }
+
+ // log2_max_frame_num_minus4
+ set_ue_golomb(&b, 4);
+ // pic_order_cnt_type
+ set_ue_golomb(&b, 2);
+
+ // max_num_ref_frames
+ set_ue_golomb(&b, 1);
+ // gaps_in_frame_num_value_allowed_flag
+ put_bits(&b, 1, 0);
+ // pic_width_in_mbs_minus1
+ set_ue_golomb(&b, (ctx->width + 15) / 16 - 1);
+ // pic_height_in_map_units_minus1
+ set_ue_golomb(&b, (ctx->height + 15) / 16 - 1);
+ // frame_mbs_oly_flag
+ put_bits(&b, 1, 1);
+
+ // direct_8x8_inference_flag
+ put_bits(&b, 1, 1);
+ // frame_cropping_flag
+ put_bits(&b, 1, 0);
+
+ // vui_parameters_present_flag
+ put_bits(&b, 1, 0);
+
+ vaapi_h264_write_trailing_rbsp(&b);
+
+ len = put_bits_count(&b);
+ flush_put_bits(&b);
+
+ return vaapi_h264_render_packed_header(ctx, VAEncPackedHeaderSequence,
+ tmp, len);
+}
+
+static int vaapi_h264_render_packed_pps(VAAPIH264EncodeContext *ctx)
+{
+ PutBitContext b;
+ char tmp[256];
+ size_t len;
+
+ init_put_bits(&b, tmp, sizeof(tmp));
+
+ vaapi_h264_write_nal_header(&b, 3, NAL_PPS);
+
+ // seq_parameter_set_id
+ set_ue_golomb(&b, 0);
+ // pic_parameter_set_id
+ set_ue_golomb(&b, 0);
+ // entropy_coding_mode_flag
+ put_bits(&b, 1, 1);
+ // bottom_field_pic_order_in_frame_present_flag
+ put_bits(&b, 1, 0);
+ // num_slice_groups_minus1
+ set_ue_golomb(&b, 0);
+
+ // num_ref_idx_l0_default_active_minus1
+ set_ue_golomb(&b, 0);
+ // num_ref_idx_l1_default_active_minus1
+ set_ue_golomb(&b, 0);
+ // weighted_pred_flag
+ put_bits(&b, 1, 0);
+ // weighted_bipred_idc
+ put_bits(&b, 2, 0);
+ // pic_init_qp_minus26
+ set_se_golomb(&b, ctx->options.qp - 26);
+ // pic_init_qs_minus26
+ set_se_golomb(&b, 0);
+ // chroma_qp_index_offset
+ set_se_golomb(&b, 0);
+ // deblocking_filter_control_present_flag
+ put_bits(&b, 1, 1);
+ // constrained_intra_pred_flag
+ put_bits(&b, 1, 0);
+ // redundant_pic_cnt_present_flag
+ put_bits(&b, 1, 0);
+
+ // transform_8x8_mode_flag
+ put_bits(&b, 1, 0);
+ // pic_scaling_matrix_present_flag
+ put_bits(&b, 1, 0);
+ // second_chroma_qp_index_offset
+ set_se_golomb(&b, 0);
+
+ vaapi_h264_write_trailing_rbsp(&b);
+
+ len = put_bits_count(&b);
+ flush_put_bits(&b);
+
+ return vaapi_h264_render_packed_header(ctx, VAEncPackedHeaderPicture,
+ tmp, len);
+}
+
+static int vaapi_h264_render_packed_slice(VAAPIH264EncodeContext *ctx,
+ VAAPIH264EncodeFrame *current)
+{
+ PutBitContext b;
+ char tmp[256];
+ size_t len;
+
+ init_put_bits(&b, tmp, sizeof(tmp));
+
+ if(current->type == FRAME_TYPE_I)
+ vaapi_h264_write_nal_header(&b, 3, NAL_IDR_SLICE);
+ else
+ vaapi_h264_write_nal_header(&b, 3, NAL_SLICE);
+
+ // first_mb_in_slice
+ set_ue_golomb(&b, 0);
+ // slice_type
+ set_ue_golomb(&b, (current->type == FRAME_TYPE_I ? 2 :
+ current->type == FRAME_TYPE_P ? 0 : 1));
+ // pic_parameter_set_id
+ set_ue_golomb(&b, 0);
+
+ // frame_num
+ put_bits(&b, 8, current->frame_num);
+
+ if(current->type == FRAME_TYPE_I) {
+ // idr_pic_id
+ set_ue_golomb(&b, 0);
+ }
+
+ // pic_order_cnt stuff
+
+ if(current->type == FRAME_TYPE_B) {
+ // direct_spatial_mv_pred_flag
+ put_bits(&b, 1, 1);
+ }
+
+ if(current->type == FRAME_TYPE_P || current->type == FRAME_TYPE_B) {
+ // num_ref_idx_active_override_flag
+ put_bits(&b, 1, 0);
+ if(0) {
+ // num_ref_idx_l0_active_minus1
+ if(current->type == FRAME_TYPE_B) {
+ // num_ref_idx_l1_active_minus1
+ }
+ }
+
+ // ref_pic_list_modification_flag_l0
+ put_bits(&b, 1, 0);
+
+ if(current->type == FRAME_TYPE_B) {
+ // ref_pic_list_modification_flag_l1
+ put_bits(&b, 1, 0);
+ }
+ }
+
+ if(1) {
+ // dec_ref_pic_marking
+ if(current->type == FRAME_TYPE_I) {
+ // no_output_of_prior_pics_flag
+ put_bits(&b, 1, 0);
+ // long_term_reference_flag
+ put_bits(&b, 1, 0);
+ } else {
+ // adaptive_pic_ref_marking_mode_flag
+ put_bits(&b, 1, 0);
+ }
+ }
+
+ if(current->type != FRAME_TYPE_I) {
+ // cabac_init_idc
+ set_ue_golomb(&b, 0);
+ }
+
+ // slice_qp_delta
+ set_se_golomb(&b, 0);
+
+ if(1) {
+ // disable_deblocking_filter_idc
+ set_ue_golomb(&b, 0);
+ // slice_alpha_c0_offset_div2
+ set_se_golomb(&b, 0);
+ // slice_beta_offset_div2
+ set_se_golomb(&b, 0);
+ }
+
+ len = put_bits_count(&b);
+ flush_put_bits(&b);
+
+ return vaapi_h264_render_packed_header(ctx, VAEncPackedHeaderSlice,
+ tmp, len);
+}
+
+static int vaapi_h264_render_sequence(VAAPIH264EncodeContext *ctx)
+{
+ VAStatus vas;
+ VAEncSequenceParameterBufferH264 *seq = &ctx->seq_params;
+
+ {
+ memset(seq, 0, sizeof(*seq));
+
+ seq->level_idc = 52;
+ seq->picture_width_in_mbs = (ctx->width + 15) / 16;
+ seq->picture_height_in_mbs = (ctx->height + 15) / 16;
+
+ seq->intra_period = 0;
+ seq->intra_idr_period = 0;
+ seq->ip_period = 1;
+
+ seq->max_num_ref_frames = 2;
+ seq->time_scale = 900;
+ seq->num_units_in_tick = 15;
+ seq->seq_fields.bits.log2_max_pic_order_cnt_lsb_minus4 = 4;
+ seq->seq_fields.bits.log2_max_frame_num_minus4 = 4;
+ seq->seq_fields.bits.frame_mbs_only_flag = 1;
+ seq->seq_fields.bits.chroma_format_idc = 1;
+ seq->seq_fields.bits.direct_8x8_inference_flag = 1;
+ seq->seq_fields.bits.pic_order_cnt_type = 2;
+
+ seq->frame_cropping_flag = 1;
+ seq->frame_crop_left_offset = 0;
+ seq->frame_crop_right_offset = 0;
+ seq->frame_crop_top_offset = 0;
+ seq->frame_crop_bottom_offset = 8;
+ }
+
+ vas = vaCreateBuffer(ctx->va_instance.display,
ctx->va_codec.context_id,
+ VAEncSequenceParameterBufferType,
+ sizeof(*seq), 1, seq, &ctx->seq_params_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create buffer for sequence "
+ "parameters: %d (%s).\n", vas, vaErrorStr(vas));
+ return -1;
+ }
+ av_log(ctx, AV_LOG_DEBUG, "Sequence parameter buffer is %#x.\n",
+ ctx->seq_params_id);
+
+ vas = vaRenderPicture(ctx->va_instance.display,
ctx->va_codec.context_id,
+ &ctx->seq_params_id, 1);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to send sequence parameters: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return -1;
+ }
+
+ return 0;
+}
+
+static int vaapi_h264_render_picture(VAAPIH264EncodeContext *ctx,
+ VAAPIH264EncodeFrame *current)
+{
+ VAStatus vas;
+ VAEncPictureParameterBufferH264 *pic = &ctx->pic_params;
+ int i;
+
+ memset(pic, 0, sizeof(*pic));
+ memcpy(&pic->CurrPic, &current->pic, sizeof(VAPictureH264));
+ for(i = 0; i < FF_ARRAY_ELEMS(pic->ReferenceFrames); i++) {
+ pic->ReferenceFrames[i].picture_id = VA_INVALID_ID;
+ pic->ReferenceFrames[i].flags = VA_PICTURE_H264_INVALID;
+ }
+ if(current->type == FRAME_TYPE_P || current->type == FRAME_TYPE_B)
+ memcpy(&pic->ReferenceFrames[0], &current->refp->pic,
+ sizeof(VAPictureH264));
+ if(current->type == FRAME_TYPE_B)
+ memcpy(&pic->ReferenceFrames[1], &current->refb->pic,
+ sizeof(VAPictureH264));
+
+ pic->pic_fields.bits.idr_pic_flag = (current->type == FRAME_TYPE_I);
+ pic->pic_fields.bits.reference_pic_flag = 1;
+ pic->pic_fields.bits.entropy_coding_mode_flag = 1;
+ pic->pic_fields.bits.deblocking_filter_control_present_flag = 1;
+
+ pic->frame_num = current->frame_num;
+ pic->last_picture = 0;
+ pic->pic_init_qp = ctx->options.qp;
+
+ pic->coded_buf = current->coded_data_id;
+
+ vas = vaCreateBuffer(ctx->va_instance.display,
ctx->va_codec.context_id,
+ VAEncPictureParameterBufferType,
+ sizeof(*pic), 1, pic, &ctx->pic_params_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create buffer for picture "
+ "parameters: %d (%s).\n", vas, vaErrorStr(vas));
+ return -1;
+ }
+ av_log(ctx, AV_LOG_DEBUG, "Picture parameter buffer is %#x.\n",
+ ctx->pic_params_id);
+
+ vas = vaRenderPicture(ctx->va_instance.display,
ctx->va_codec.context_id,
+ &ctx->pic_params_id, 1);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to send picture parameters: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return -1;
+ }
+
+ return 0;
+}
+
+static int vaapi_h264_render_slice(VAAPIH264EncodeContext *ctx,
+ VAAPIH264EncodeFrame *current)
+{
+ VAStatus vas;
+ VAEncSliceParameterBufferH264 *slice = &current->params;
+ int i;
+
+ {
+ memset(slice, 0, sizeof(*slice));
+
+ slice->slice_type = (current->type == FRAME_TYPE_I ? 2 :
+ current->type == FRAME_TYPE_P ? 0 : 1);
+ slice->idr_pic_id = 0;
+
+ slice->macroblock_address = 0;
+ slice->num_macroblocks = (ctx->seq_params.picture_width_in_mbs *
+ ctx->seq_params.picture_height_in_mbs);
+ slice->macroblock_info = VA_INVALID_ID;
+
+ for(i = 0; i < FF_ARRAY_ELEMS(slice->RefPicList0); i++) {
+ slice->RefPicList0[i].picture_id = VA_INVALID_SURFACE;
+ slice->RefPicList0[i].flags = VA_PICTURE_H264_INVALID;
+ }
+ for(i = 0; i < FF_ARRAY_ELEMS(slice->RefPicList1); i++) {
+ slice->RefPicList1[i].picture_id = VA_INVALID_SURFACE;
+ slice->RefPicList1[i].flags = VA_PICTURE_H264_INVALID;
+ }
+
+ if(current->refp) {
+ av_log(ctx, AV_LOG_DEBUG, "Using %#x as first reference
frame.\n",
+ current->refp->pic.picture_id);
+ slice->RefPicList0[0].picture_id =
current->refp->pic.picture_id;
+ slice->RefPicList0[0].flags =
VA_PICTURE_H264_SHORT_TERM_REFERENCE;
+ }
+ if(current->refb) {
+ av_log(ctx, AV_LOG_DEBUG, "Using %#x as second reference
frame.\n",
+ current->refb->pic.picture_id);
+ slice->RefPicList0[1].picture_id =
current->refb->pic.picture_id;
+ slice->RefPicList0[1].flags =
VA_PICTURE_H264_SHORT_TERM_REFERENCE;
+ }
+
+ slice->slice_qp_delta = 0;
+ slice->slice_alpha_c0_offset_div2 = 0;
+ slice->slice_beta_offset_div2 = 0;
+ slice->direct_spatial_mv_pred_flag = 1;
+ }
+
+ vas = vaCreateBuffer(ctx->va_instance.display,
ctx->va_codec.context_id,
+ VAEncSliceParameterBufferType,
+ sizeof(*slice), 1, slice, &current->params_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create buffer for slice "
+ "parameters: %d (%s).\n", vas, vaErrorStr(vas));
+ return -1;
+ }
+ av_log(ctx, AV_LOG_DEBUG, "Slice buffer is %#x.\n",
current->params_id);
+
+ vas = vaRenderPicture(ctx->va_instance.display,
ctx->va_codec.context_id,
+ &current->params_id, 1);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to send slice parameters: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return -1;
+ }
+
+ return 0;
+}
+
+static int vaapi_h264_encode_picture(AVCodecContext *avctx, AVPacket *pkt,
+ const AVFrame *pic, int *got_packet)
+{
+ VAAPIH264EncodeContext *ctx = avctx->priv_data;
+ AVVAAPISurface *input, *recon;
+ VAAPIH264EncodeFrame *current;
+ AVFrame *input_image, *recon_image;
+ VACodedBufferSegment *buf_list, *buf;
+ VAStatus vas;
+ int err;
+
+ av_log(ctx, AV_LOG_DEBUG, "New frame: format %s, size %ux%u.\n",
+ av_get_pix_fmt_name(pic->format), pic->width, pic->height);
+
+ if(pic->format == AV_PIX_FMT_VAAPI) {
+ input_image = 0;
+ input = (AVVAAPISurface*)pic->buf[0]->data;
+
+ } else {
+ input_image = av_frame_alloc();
+
+ err = av_vaapi_get_input_surface(&ctx->va_codec, input_image);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to allocate surface to "
+ "copy input frame: %d (%s).\n", err, av_err2str(err));
+ return -1;
+ }
+
+ input = (AVVAAPISurface*)input_image->buf[0]->data;
+
+ err = av_vaapi_map_surface(input, 0);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to map input surface: "
+ "%d (%s).\n", err, av_err2str(err));
+ return -1;
+ }
+
+ err = av_vaapi_copy_to_surface(pic, input);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to copy to input surface: "
+ "%d (%s).\n", err, av_err2str(err));
+ return -1;
+ }
+
+ err = av_vaapi_unmap_surface(input, 1);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to unmap input surface: "
+ "%d (%s).\n", err, av_err2str(err));
+ return -1;
+ }
+ }
+ av_log(ctx, AV_LOG_DEBUG, "Using surface %#x for input image.\n",
+ input->id);
+
+ recon_image = av_frame_alloc();
+
+ err = av_vaapi_get_output_surface(&ctx->va_codec, recon_image);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to allocate surface for "
+ "reconstructed frame: %d (%s).\n", err, av_err2str(err));
+ return -1;
+ }
+ recon = (AVVAAPISurface*)recon_image->buf[0]->data;
+ av_log(ctx, AV_LOG_DEBUG, "Using surface %#x for reconstructed
image.\n",
+ recon->id);
+
+ if(ctx->previous_frame != ctx->current_frame) {
+ av_frame_unref(&ctx->dpb[ctx->previous_frame].avframe);
+ }
+
+ ctx->previous_frame = ctx->current_frame;
+ ctx->current_frame = (ctx->current_frame + 1) % DPB_FRAMES;
+ {
+ current = &ctx->dpb[ctx->current_frame];
+
+ if(ctx->frame_num < 0 ||
+ ctx->frame_num == ctx->options.idr_interval)
+ current->type = FRAME_TYPE_I;
+ else
+ current->type = FRAME_TYPE_P;
+
+ if(current->type == FRAME_TYPE_I)
+ ctx->frame_num = 0;
+ else
+ ++ctx->frame_num;
+ current->frame_num = ctx->frame_num;
+
+ if(current->type == FRAME_TYPE_I) {
+ current->refp = 0;
+ current->refb = 0;
+ } else if(current->type == FRAME_TYPE_P) {
+ current->refp = &ctx->dpb[ctx->previous_frame];
+ current->refb = 0;
+ } else {
+ av_assert0(0);
+ }
+
+ memset(&current->pic, 0, sizeof(VAPictureH264));
+ current->pic.picture_id = recon->id;
+ current->pic.frame_idx = ctx->frame_num;
+
+ memcpy(&current->avframe, recon_image, sizeof(AVFrame));
+ }
+ av_log(ctx, AV_LOG_DEBUG, "Encoding as frame as %s (%d).\n",
+ current->type == FRAME_TYPE_I ? "I" :
+ current->type == FRAME_TYPE_P ? "P" : "B", ctx->frame_num);
+
+ vas = vaBeginPicture(ctx->va_instance.display,
ctx->va_codec.context_id,
+ input->id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to attach new picture: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return -1;
+ }
+
+ if(current->type == FRAME_TYPE_I) {
+ err = vaapi_h264_render_sequence(ctx);
+ if(err) return err;
+ }
+
+ err = vaapi_h264_render_picture(ctx, current);
+ if(err) return err;
+
+ if(current->type == FRAME_TYPE_I) {
+ err = vaapi_h264_render_packed_sps(ctx);
+ if(err) return err;
+
+ err = vaapi_h264_render_packed_pps(ctx);
+ if(err) return err;
+ }
+
+ err = vaapi_h264_render_packed_slice(ctx, current);
+ if(err) return err;
+
+ err = vaapi_h264_render_slice(ctx, current);
+ if(err) return err;
+
+ vas = vaEndPicture(ctx->va_instance.display, ctx->va_codec.context_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to start picture processing: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return -1;
+ }
+
+ vas = vaSyncSurface(ctx->va_instance.display, input->id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to sync to picture completion: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return -1;
+ }
+
+ buf_list = 0;
+ vas = vaMapBuffer(ctx->va_instance.display, current->coded_data_id,
+ (void**)&buf_list);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to map output buffers: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return -1;
+ }
+
+ for(buf = buf_list; buf; buf = buf->next) {
+ av_log(ctx, AV_LOG_DEBUG, "Output buffer: %u bytes.\n", buf->size);
+ err = av_new_packet(pkt, buf->size);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to make output buffer "
+ "(%u bytes).\n", buf->size);
+ return err;
+ }
+
+ memcpy(pkt->data, buf->buf, buf->size);
+
+ if(current->type == FRAME_TYPE_I)
+ pkt->flags |= AV_PKT_FLAG_KEY;
+
+ *got_packet = 1;
+ }
+
+ vas = vaUnmapBuffer(ctx->va_instance.display, current->coded_data_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to unmap output buffers: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return -1;
+ }
+
+ if(pic->format != AV_PIX_FMT_VAAPI)
+ av_frame_free(&input_image);
+
+ return 0;
+}
+
+static VAConfigAttrib config_attributes[] = {
+ { .type = VAConfigAttribRTFormat,
+ .value = VA_RT_FORMAT_YUV420 },
+ { .type = VAConfigAttribRateControl,
+ .value = VA_RC_CQP },
+ { .type = VAConfigAttribEncPackedHeaders,
+ .value = 0 },
+};
+
+static av_cold int vaapi_h264_encode_init(AVCodecContext *avctx)
+{
+ VAAPIH264EncodeContext *ctx = avctx->priv_data;
+ VAStatus vas;
+ int i, err;
+
+ if(strcmp(ctx->options.profile, "constrained_baseline"))
+ ctx->va_profile = VAProfileH264ConstrainedBaseline;
+ else if(strcmp(ctx->options.profile, "baseline"))
+ ctx->va_profile = VAProfileH264Baseline;
+ else if(strcmp(ctx->options.profile, "main"))
+ ctx->va_profile = VAProfileH264Main;
+ else if(strcmp(ctx->options.profile, "high"))
+ ctx->va_profile = VAProfileH264High;
+ else {
+ av_log(ctx, AV_LOG_ERROR, "Invalid profile '%s'.\n",
+ ctx->options.profile);
+ return AVERROR(EINVAL);
+ }
+
+ ctx->level = -1;
+ if(sscanf(ctx->options.level, "%d", &ctx->level) <= 0 ||
+ ctx->level < 0 || ctx->level > 52) {
+ av_log(ctx, AV_LOG_ERROR, "Invaid level '%s'.\n",
ctx->options.level);
+ return AVERROR(EINVAL);
+ }
+
+ if(ctx->options.qp >= 0) {
+ ctx->rc_mode = VA_RC_CQP;
+ } else {
+ // Default to CQP 26.
+ ctx->rc_mode = VA_RC_CQP;
+ ctx->options.qp = 26;
+ }
+ av_log(ctx, AV_LOG_INFO, "Using constant-QP mode at %d.\n",
+ ctx->options.qp);
+
+ err = av_vaapi_instance_init(&ctx->va_instance, 0);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "No VAAPI instance.\n");
+ return err;
+ }
+
+ ctx->width = avctx->width;
+ ctx->height = avctx->height;
+
+ ctx->frame_num = -1;
+
+ {
+ AVVAAPIPipelineConfig *config = &ctx->va_config;
+
+ config->profile = ctx->va_profile;
+ config->entrypoint = VAEntrypointEncSlice;
+
+ config->attribute_count = FF_ARRAY_ELEMS(config_attributes);
+ config->attributes = config_attributes;
+ }
+
+ {
+ AVVAAPISurfaceConfig *config = &ctx->output_config;
+
+ config->rt_format = VA_RT_FORMAT_YUV420;
+ config->av_format = AV_PIX_FMT_VAAPI;
+
+ config->image_format.fourcc = VA_FOURCC_NV12;
+ config->image_format.bits_per_pixel = 12;
+
+ config->count = DPB_FRAMES;
+ config->width = ctx->width;
+ config->height = ctx->height;
+
+ config->attribute_count = 0;
+ }
+
+ {
+ AVVAAPISurfaceConfig *config = &ctx->input_config;
+
+ config->rt_format = VA_RT_FORMAT_YUV420;
+ config->rt_format = VA_RT_FORMAT_YUV420;
+ config->av_format = AV_PIX_FMT_VAAPI;
+
+ config->image_format.fourcc = VA_FOURCC_NV12;
+ config->image_format.bits_per_pixel = 12;
+
+ config->count = INPUT_FRAMES;
+ config->width = ctx->width;
+ config->height = ctx->height;
+
+ config->attribute_count = 0;
+ }
+
+ err = av_vaapi_pipeline_init(&ctx->va_codec, &ctx->va_instance,
+ &ctx->va_config,
+ &ctx->input_config, &ctx->output_config);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create codec: %d (%s).\n",
+ err, av_err2str(err));
+ return err;
+ }
+
+ for(i = 0; i < DPB_FRAMES; i++) {
+ vas = vaCreateBuffer(ctx->va_instance.display,
+ ctx->va_codec.context_id,
+ VAEncCodedBufferType,
+ 1048576, 1, 0, &ctx->dpb[i].coded_data_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create buffer for "
+ "coded data: %d (%s).\n", vas, vaErrorStr(vas));
+ break;
+ }
+ av_log(ctx, AV_LOG_TRACE, "Coded data buffer %d is %#x.\n",
+ i, ctx->dpb[i].coded_data_id);
+ }
+
+ av_log(ctx, AV_LOG_INFO, "Started VAAPI H.264 encoder.\n");
+ return 0;
+}
+
+static av_cold int vaapi_h264_encode_close(AVCodecContext *avctx)
+{
+ VAAPIH264EncodeContext *ctx = avctx->priv_data;
+ int err;
+
+ err = av_vaapi_pipeline_uninit(&ctx->va_codec);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to destroy codec: %d (%s).\n",
+ err, av_err2str(err));
+ }
+
+ err = av_vaapi_instance_uninit(&ctx->va_instance);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to uninitialised VAAPI "
+ "instance: %d (%s).\n",
+ err, av_err2str(err));
+ }
+
+ return 0;
+}
+
+#define OFFSET(member) offsetof(VAAPIH264EncodeContext, options.member)
+#define FLAGS (AV_OPT_FLAG_VIDEO_PARAM | AV_OPT_FLAG_ENCODING_PARAM)
+static const AVOption vaapi_h264_options[] = {
+ { "profile", "Set H.264 profile",
+ OFFSET(profile), AV_OPT_TYPE_STRING,
+ { .str = "baseline" }, 0, 0, FLAGS },
+ { "level", "Set H.264 level",
+ OFFSET(level), AV_OPT_TYPE_STRING,
+ { .str = "52" }, 0, 0, FLAGS },
+ { "qp", "Use constant quantisation parameter",
+ OFFSET(qp), AV_OPT_TYPE_INT,
+ { .i64 = -1 }, -1, 52, FLAGS },
+ { "idr_interval", "Number of frames between IDR frames (0 = all
intra)",
+ OFFSET(idr_interval), AV_OPT_TYPE_INT,
+ { .i64 = -1 }, -1, INT_MAX, FLAGS },
+ { 0 }
+};
+
+static const AVClass vaapi_h264_class = {
+ .class_name = "VAAPI/H.264",
+ .item_name = av_default_item_name,
+ .option = vaapi_h264_options,
+ .version = LIBAVUTIL_VERSION_INT,
+};
+
+AVCodec ff_h264_vaapi_encoder = {
+ .name = "vaapi_h264",
+ .long_name = NULL_IF_CONFIG_SMALL("H.264 (VAAPI)"),
+ .type = AVMEDIA_TYPE_VIDEO,
+ .id = AV_CODEC_ID_H264,
+ .priv_data_size = sizeof(VAAPIH264EncodeContext),
+ .init = &vaapi_h264_encode_init,
+ .encode2 = &vaapi_h264_encode_picture,
+ .close = &vaapi_h264_encode_close,
+ .priv_class = &vaapi_h264_class,
+ .pix_fmts = (const enum AVPixelFormat[]) {
+ AV_PIX_FMT_VAAPI,
+ AV_PIX_FMT_NV12,
+ AV_PIX_FMT_NONE,
+ },
+};
--
2.6.4
Mark Thompson
2016-01-17 17:37:13 UTC
Permalink
From 5ea0c6f4e9a0086a1972bb142e9bd9b6b779a6c1 Mon Sep 17 00:00:00 2001
From: Mark Thompson <***@jkqxz.net>
Date: Sun, 17 Jan 2016 15:59:01 +0000
Subject: [PATCH 4/5] libavcodec: add VAAPI H.265 encoder

---
configure | 1 +
libavcodec/Makefile | 1 +
libavcodec/allcodecs.c | 1 +
libavcodec/vaapi_enc_hevc.c | 1603
+++++++++++++++++++++++++++++++++++++++++++
4 files changed, 1606 insertions(+)
create mode 100644 libavcodec/vaapi_enc_hevc.c

diff --git a/configure b/configure
index a31d65e..9da8e8b 100755
--- a/configure
+++ b/configure
@@ -2519,6 +2519,7 @@ hevc_dxva2_hwaccel_select="hevc_decoder"
hevc_qsv_hwaccel_deps="libmfx"
hevc_vaapi_hwaccel_deps="vaapi VAPictureParameterBufferHEVC"
hevc_vaapi_hwaccel_select="hevc_decoder"
+hevc_vaapi_encoder_deps="vaapi"
hevc_vdpau_hwaccel_deps="vdpau VdpPictureInfoHEVC"
hevc_vdpau_hwaccel_select="hevc_decoder"
mpeg_vdpau_decoder_deps="vdpau"
diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index 06b3c48..a5e1cab 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -311,6 +311,7 @@ OBJS-$(CONFIG_HEVC_DECODER) += hevc.o
hevc_mvs.o hevc_ps.o hevc_sei.o
hevcdsp.o hevc_filter.o
hevc_parse.o hevc_data.o
OBJS-$(CONFIG_HEVC_QSV_DECODER) += qsvdec_h2645.o
OBJS-$(CONFIG_HEVC_QSV_ENCODER) += qsvenc_hevc.o hevc_ps_enc.o
hevc_parse.o
+OBJS-$(CONFIG_HEVC_VAAPI_ENCODER) += vaapi_enc_hevc.o
OBJS-$(CONFIG_HNM4_VIDEO_DECODER) += hnm4video.o
OBJS-$(CONFIG_HQ_HQA_DECODER) += hq_hqa.o hq_hqadata.o
hq_hqadsp.o \
canopus.o
diff --git a/libavcodec/allcodecs.c b/libavcodec/allcodecs.c
index 0d07087..a25da5b 100644
--- a/libavcodec/allcodecs.c
+++ b/libavcodec/allcodecs.c
@@ -203,6 +203,7 @@ void avcodec_register_all(void)
REGISTER_ENCDEC (HAP, hap);
REGISTER_DECODER(HEVC, hevc);
REGISTER_DECODER(HEVC_QSV, hevc_qsv);
+ REGISTER_ENCODER(HEVC_VAAPI, hevc_vaapi);
REGISTER_DECODER(HNM4_VIDEO, hnm4_video);
REGISTER_DECODER(HQ_HQA, hq_hqa);
REGISTER_DECODER(HQX, hqx);
diff --git a/libavcodec/vaapi_enc_hevc.c b/libavcodec/vaapi_enc_hevc.c
new file mode 100644
index 0000000..06704ad
--- /dev/null
+++ b/libavcodec/vaapi_enc_hevc.c
@@ -0,1 +1,1603 @@
+/*
+ * VAAPI H.265 encoder.
+ *
+ * Copyright (C) 2016 Mark Thompson <***@jkqxz.net>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
02110-1301 USA
+ */
+
+#include "avcodec.h"
+#include "golomb.h"
+#include "put_bits.h"
+
+#include "hevc.h"
+
+#include "libavutil/opt.h"
+#include "libavutil/pixdesc.h"
+#include "libavutil/vaapi.h"
+
+#define MAX_DPB_PICS 16
+#define INPUT_PICS 2
+
+#define bool unsigned char
+#define MAX_ST_REF_PIC_SETS 32
+#define MAX_LAYERS 1
+
+
+// This structure contains all possibly-useful per-sequence syntax elements
+// which are not already contained in the various VAAPI structures.
+typedef struct VAAPIHEVCEncodeMiscSequenceParams {
+
+ // Parameter set IDs.
+ unsigned int video_parameter_set_id;
+ unsigned int seq_parameter_set_id;
+
+ // Layering.
+ unsigned int vps_max_layers_minus1;
+ unsigned int vps_max_sub_layers_minus1;
+ bool vps_temporal_id_nesting_flag;
+ unsigned int vps_max_layer_id;
+ unsigned int vps_num_layer_sets_minus1;
+ unsigned int sps_max_sub_layers_minus1;
+ bool sps_temporal_id_nesting_flag;
+ bool layer_id_included_flag[MAX_LAYERS][64];
+
+ // Profile/tier/level parameters.
+ bool general_profile_compatibility_flag[32];
+ bool general_progressive_source_flag;
+ bool general_interlaced_source_flag;
+ bool general_non_packed_constraint_flag;
+ bool general_frame_only_constraint_flag;
+ bool general_inbld_flag;
+
+ // Decode/display ordering parameters.
+ unsigned int log2_max_pic_order_cnt_lsb_minus4;
+ bool vps_sub_layer_ordering_info_present_flag;
+ unsigned int vps_max_dec_pic_buffering_minus1[MAX_LAYERS];
+ unsigned int vps_max_num_reorder_pics[MAX_LAYERS];
+ unsigned int vps_max_latency_increase_plus1[MAX_LAYERS];
+ bool sps_sub_layer_ordering_info_present_flag;
+ unsigned int sps_max_dec_pic_buffering_minus1[MAX_LAYERS];
+ unsigned int sps_max_num_reorder_pics[MAX_LAYERS];
+ unsigned int sps_max_latency_increase_plus1[MAX_LAYERS];
+
+ // Timing information.
+ bool vps_timing_info_present_flag;
+ unsigned int vps_num_units_in_tick;
+ unsigned int vps_time_scale;
+ bool vps_poc_proportional_to_timing_flag;
+ unsigned int vps_num_ticks_poc_diff_minus1;
+
+ // Cropping information.
+ bool conformance_window_flag;
+ unsigned int conf_win_left_offset;
+ unsigned int conf_win_right_offset;
+ unsigned int conf_win_top_offset;
+ unsigned int conf_win_bottom_offset;
+
+ // Short-term reference picture sets.
+ unsigned int num_short_term_ref_pic_sets;
+ struct {
+ unsigned int num_negative_pics;
+ unsigned int num_positive_pics;
+
+ unsigned int delta_poc_s0_minus1[MAX_DPB_PICS];
+ bool used_by_curr_pic_s0_flag[MAX_DPB_PICS];
+
+ unsigned int delta_poc_s1_minus1[MAX_DPB_PICS];
+ bool used_by_curr_pic_s1_flag[MAX_DPB_PICS];
+ } st_ref_pic_set[MAX_ST_REF_PIC_SETS];
+
+ // Long-term reference pictures.
+ bool long_term_ref_pics_present_flag;
+ unsigned int num_long_term_ref_pics_sps;
+ struct {
+ unsigned int lt_ref_pic_poc_lsb_sps;
+ bool used_by_curr_pic_lt_sps_flag;
+ } lt_ref_pic;
+
+ // Deblocking filter control.
+ bool deblocking_filter_control_present_flag;
+ bool deblocking_filter_override_enabled_flag;
+ bool pps_deblocking_filter_disabled_flag;
+ int pps_beta_offset_div2;
+ int pps_tc_offset_div2;
+
+ // Video Usability Information.
+ bool vui_parameters_present_flag;
+ bool aspect_ratio_info_present_flag;
+ unsigned int aspect_ratio_idc;
+ unsigned int sar_width;
+ unsigned int sar_height;
+ bool video_signal_type_present_flag;
+ unsigned int video_format;
+ bool video_full_range_flag;
+ bool colour_description_present_flag;
+ unsigned int colour_primaries;
+ unsigned int transfer_characteristics;
+ unsigned int matrix_coeffs;
+
+ // Oddments.
+ bool uniform_spacing_flag;
+ bool output_flag_present_flag;
+ bool cabac_init_present_flag;
+ unsigned int num_extra_slice_header_bits;
+ bool lists_modification_present_flag;
+ bool pps_slice_chroma_qp_offsets_present_flag;
+ bool pps_slice_chroma_offset_list_enabled_flag;
+
+} VAAPIHEVCEncodeMiscSequenceParams;
+
+// This structure contains all possibly-useful per-slice syntax elements
+// which are not already contained in the various VAAPI structures.
+typedef struct {
+ // Slice segments.
+ bool first_slice_segment_in_pic_flag;
+ unsigned int slice_segment_address;
+
+ // Short-term reference picture sets.
+ bool short_term_ref_pic_set_sps_flag;
+ unsigned int short_term_ref_pic_idx;
+
+ // Deblocking filter.
+ bool deblocking_filter_override_flag;
+
+ // Oddments.
+ bool slice_reserved_flag[8];
+ bool no_output_of_prior_pics_flag;
+ bool pic_output_flag;
+
+} VAAPIHEVCEncodeMiscPictureParams;
+
+typedef struct VAAPIHEVCEncodeFrame {
+ AVFrame avframe;
+ VASurfaceID surface_id;
+
+ int poc;
+ enum {
+ FRAME_TYPE_I = I_SLICE,
+ FRAME_TYPE_P = P_SLICE,
+ FRAME_TYPE_B = B_SLICE,
+ } type;
+
+ VAPictureHEVC pic;
+
+ VAEncPictureParameterBufferHEVC pic_params;
+ VABufferID pic_params_id;
+
+ VAEncSliceParameterBufferHEVC slice_params;
+ VABufferID slice_params_id;
+
+ VAAPIHEVCEncodeMiscPictureParams misc_params;
+
+ VABufferID coded_data_id;
+
+ struct VAAPIHEVCEncodeFrame *refa, *refb;
+} VAAPIHEVCEncodeFrame;
+
+typedef struct VAAPIHEVCEncodeContext {
+ const AVClass *class;
+ const AVCodecContext *avctx;
+
+ AVVAAPIInstance va_instance;
+ AVVAAPIPipelineConfig va_config;
+ AVVAAPIPipelineContext va_codec;
+
+ int input_is_vaapi;
+ AVVAAPISurfaceConfig input_config;
+ AVVAAPISurfaceConfig output_config;
+
+ VAProfile va_profile;
+ int level;
+ int rc_mode;
+ int fixed_qp;
+
+ int input_width;
+ int input_height;
+
+ int aligned_width;
+ int aligned_height;
+ int ctu_width;
+ int ctu_height;
+
+ VAEncSequenceParameterBufferHEVC seq_params;
+ VABufferID seq_params_id;
+
+ VAEncMiscParameterRateControl rc_params;
+ VAEncMiscParameterBuffer rc_params_buffer;
+ VABufferID rc_params_id;
+
+ VAEncPictureParameterBufferHEVC pic_params;
+ VABufferID pic_params_id;
+
+ VAAPIHEVCEncodeMiscSequenceParams misc_params;
+
+ int poc;
+
+ VAAPIHEVCEncodeFrame dpb[MAX_DPB_PICS];
+ int current_frame;
+ int previous_frame;
+
+ struct {
+ const char *profile;
+ const char *level;
+ int qp;
+ int idr_interval;
+ } options;
+
+} VAAPIHEVCEncodeContext;
+
+
+// Set to 1 to log a full trace of all bitstream output (debugging only).
+#if 0
+static void trace_hevc_write_u(PutBitContext *s, unsigned int width,
+ unsigned int value, const char *name)
+{
+ av_log(0, AV_LOG_INFO, "H.265 bitstream [%3d]: %4u u(%u) / %s\n",
+ put_bits_count(s), value, width, name);
+ put_bits(s, width, value);
+}
+static void trace_hevc_write_ue(PutBitContext *s,
+ unsigned int value, const char *name)
+{
+ av_log(0, AV_LOG_INFO, "H.265 bitstream [%3d]: %4u ue(v) / %s\n",
+ put_bits_count(s), value, name);
+ set_ue_golomb(s, value);
+}
+static void trace_hevc_write_se(PutBitContext *s,
+ int value, const char *name)
+{
+ av_log(0, AV_LOG_INFO, "H.265 bitstream [%3d]: %+4d se(v) / %s\n",
+ put_bits_count(s), value, name);
+ set_se_golomb(s, value);
+}
+
+#define hevc_write_u(pbc, width, value, name) \
+ trace_hevc_write_u(pbc, width, value, #name)
+#define hevc_write_ue(pbc, value, name) \
+ trace_hevc_write_ue(pbc, value, #name)
+#define hevc_write_se(pbc, value, name) \
+ trace_hevc_write_se(pbc, value, #name)
+#else
+#define hevc_write_u(pbc, width, value, name) put_bits(pbc, width, value)
+#define hevc_write_ue(pbc, value, name) set_ue_golomb(pbc, value)
+#define hevc_write_se(pbc, value, name) set_se_golomb(pbc, value)
+#endif
+
+#define u(width, ...) hevc_write_u(s, width, __VA_ARGS__)
+#define ue(...) hevc_write_ue(s, __VA_ARGS__)
+#define se(...) hevc_write_se(s, __VA_ARGS__)
+
+#define seq_var(name) seq->name, name
+#define seq_field(name) seq->seq_fields.bits.name, name
+#define pic_var(name) pic->name, name
+#define pic_field(name) pic->pic_fields.bits.name, name
+#define slice_var(name) slice->name, name
+#define slice_field(name) slice->slice_fields.bits.name, name
+#define misc_var(name) misc->name, name
+#define miscs_var(name) miscs->name, name
+
+static void vaapi_hevc_write_nal_unit_header(PutBitContext *s,
+ int nal_unit_type)
+{
+ u(1, 0, forbidden_zero_bit);
+ u(6, nal_unit_type, nal_unit_type);
+ u(6, 0, nuh_layer_id);
+ u(3, 1, nuh_temporal_id_plus1);
+}
+
+static void vaapi_hevc_write_rbsp_trailing_bits(PutBitContext *s)
+{
+ u(1, 1, rbsp_stop_one_bit);
+ while(put_bits_count(s) & 7)
+ u(1, 0, rbsp_alignment_zero_bit);
+}
+
+static void vaapi_hevc_write_profile_tier_level(PutBitContext *s,
+ VAAPIHEVCEncodeContext
*ctx)
+{
+ VAEncSequenceParameterBufferHEVC *seq = &ctx->seq_params;
+ VAAPIHEVCEncodeMiscSequenceParams *misc = &ctx->misc_params;
+ int j;
+
+ if(1) {
+ u(2, 0, general_profile_space);
+ u(1, seq->general_tier_flag, general_tier_flag);
+ u(5, seq->general_profile_idc, general_profile_idc);
+
+ for(j = 0; j < 32; j++) {
+ u(1, misc_var(general_profile_compatibility_flag[j]));
+ }
+
+ u(1, misc_var(general_progressive_source_flag));
+ u(1, misc_var(general_interlaced_source_flag));
+ u(1, misc_var(general_non_packed_constraint_flag));
+ u(1, misc_var(general_frame_only_constraint_flag));
+
+ if(0) {
+ // Not main profile.
+ // Lots of extra constraint flags.
+ } else {
+ // put_bits only handles up to 31 bits.
+ u(23, 0, general_reserved_zero_43bits);
+ u(20, 0, general_reserved_zero_43bits);
+ }
+
+ if(seq->general_profile_idc >= 1 && seq->general_profile_idc <=
5) {
+ u(1, misc_var(general_inbld_flag));
+ } else {
+ u(1, 0, general_reserved_zero_bit);
+ }
+ }
+
+ u(8, seq->general_level_idc, general_level_idc);
+
+ // No sublayers.
+}
+
+static void vaapi_hevc_write_vps(PutBitContext *s,
+ VAAPIHEVCEncodeContext *ctx)
+{
+ VAAPIHEVCEncodeMiscSequenceParams *misc = &ctx->misc_params;
+ int i, j;
+
+ vaapi_hevc_write_nal_unit_header(s, NAL_VPS);
+
+ u(4, misc->video_parameter_set_id, vps_video_parameter_set_id);
+
+ u(1, 1, vps_base_layer_internal_flag);
+ u(1, 1, vps_base_layer_available_flag);
+ u(6, misc_var(vps_max_layers_minus1));
+ u(3, misc_var(vps_max_sub_layers_minus1));
+ u(1, misc_var(vps_temporal_id_nesting_flag));
+
+ u(16, 0xffff, vps_reserved_0xffff_16bits);
+
+ vaapi_hevc_write_profile_tier_level(s, ctx);
+
+ u(1, misc_var(vps_sub_layer_ordering_info_present_flag));
+ for(i = (misc->vps_sub_layer_ordering_info_present_flag ?
+ 0 : misc->vps_max_sub_layers_minus1);
+ i <= misc->vps_max_sub_layers_minus1; i++) {
+ ue(misc_var(vps_max_dec_pic_buffering_minus1[i]));
+ ue(misc_var(vps_max_num_reorder_pics[i]));
+ ue(misc_var(vps_max_latency_increase_plus1[i]));
+ }
+
+ u(6, misc_var(vps_max_layer_id));
+ ue(misc_var(vps_num_layer_sets_minus1));
+ for(i = 1; i <= misc->vps_num_layer_sets_minus1; i++) {
+ for(j = 0; j < misc->vps_max_layer_id; j++)
+ u(1, misc_var(layer_id_included_flag[i][j]));
+ }
+
+ u(1, misc_var(vps_timing_info_present_flag));
+ if(misc->vps_timing_info_present_flag) {
+ u(1, 0, put_bits_hack_zero_bit);
+ u(31, misc_var(vps_num_units_in_tick));
+ u(1, 0, put_bits_hack_zero_bit);
+ u(31, misc_var(vps_time_scale));
+ u(1, misc_var(vps_poc_proportional_to_timing_flag));
+ if(misc->vps_poc_proportional_to_timing_flag) {
+ ue(misc_var(vps_num_ticks_poc_diff_minus1));
+ }
+ ue(0, vps_num_hrd_parameters);
+ }
+
+ u(1, 0, vps_extension_flag);
+
+ vaapi_hevc_write_rbsp_trailing_bits(s);
+}
+
+static void vaapi_hevc_write_st_ref_pic_set(PutBitContext *s,
+ VAAPIHEVCEncodeContext *ctx,
+ int st_rps_idx)
+{
+ VAAPIHEVCEncodeMiscSequenceParams *misc = &ctx->misc_params;
+#define strps_var(name) misc->st_ref_pic_set[st_rps_idx].name, name
+ int i;
+
+ if(st_rps_idx != 0)
+ u(1, 0, inter_ref_pic_set_prediction_flag);
+
+ if(0) {
+ // Inter ref pic set prediction.
+ } else {
+ ue(strps_var(num_negative_pics));
+ ue(strps_var(num_positive_pics));
+
+ for(i = 0; i <
+ misc->st_ref_pic_set[st_rps_idx].num_negative_pics; i++) {
+ ue(strps_var(delta_poc_s0_minus1[i]));
+ u(1, strps_var(used_by_curr_pic_s0_flag[i]));
+ }
+ for(i = 0; i <
+ misc->st_ref_pic_set[st_rps_idx].num_positive_pics; i++) {
+ ue(strps_var(delta_poc_s1_minus1[i]));
+ u(1, strps_var(used_by_curr_pic_s1_flag[i]));
+ }
+ }
+}
+
+static void vaapi_hevc_write_vui_parameters(PutBitContext *s,
+ VAAPIHEVCEncodeContext *ctx)
+{
+ VAAPIHEVCEncodeMiscSequenceParams *misc = &ctx->misc_params;
+
+ u(1, misc_var(aspect_ratio_info_present_flag));
+ if(misc->aspect_ratio_info_present_flag) {
+ u(8, misc_var(aspect_ratio_idc));
+ if(misc->aspect_ratio_idc == 255) {
+ u(16, misc_var(sar_width));
+ u(16, misc_var(sar_height));
+ }
+ }
+
+ u(1, 0, overscan_info_present_flag);
+
+ u(1, misc_var(video_signal_type_present_flag));
+ if(misc->video_signal_type_present_flag) {
+ u(3, misc_var(video_format));
+ u(1, misc_var(video_full_range_flag));
+ u(1, misc_var(colour_description_present_flag));
+ if(misc->colour_description_present_flag) {
+ u(8, misc_var(colour_primaries));
+ u(8, misc_var(transfer_characteristics));
+ u(8, misc_var(matrix_coeffs));
+ }
+ }
+
+ u(1, 0, chroma_loc_info_present_flag);
+ u(1, 0, neutral_chroma_indication_flag);
+ u(1, 0, field_seq_flag);
+ u(1, 0, frame_field_info_present_flag);
+ u(1, 0, default_display_window_flag);
+ u(1, 0, vui_timing_info_present_flag);
+ u(1, 0, bitstream_restriction_flag_flag);
+}
+
+static void vaapi_hevc_write_sps(PutBitContext *s,
+ VAAPIHEVCEncodeContext *ctx)
+{
+ VAEncSequenceParameterBufferHEVC *seq = &ctx->seq_params;
+ VAAPIHEVCEncodeMiscSequenceParams *misc = &ctx->misc_params;
+ int i;
+
+ vaapi_hevc_write_nal_unit_header(s, NAL_SPS);
+
+ u(4, misc->video_parameter_set_id, sps_video_parameter_set_id);
+
+ u(3, misc_var(sps_max_sub_layers_minus1));
+ u(1, misc_var(sps_temporal_id_nesting_flag));
+
+ vaapi_hevc_write_profile_tier_level(s, ctx);
+
+ ue(misc->seq_parameter_set_id, sps_seq_parameter_set_id);
+ ue(seq_field(chroma_format_idc));
+ if(seq->seq_fields.bits.chroma_format_idc == 3)
+ u(1, 0, separate_colour_plane_flag);
+
+ ue(seq_var(pic_width_in_luma_samples));
+ ue(seq_var(pic_height_in_luma_samples));
+
+ u(1, misc_var(conformance_window_flag));
+ if(misc->conformance_window_flag) {
+ ue(misc_var(conf_win_left_offset));
+ ue(misc_var(conf_win_right_offset));
+ ue(misc_var(conf_win_top_offset));
+ ue(misc_var(conf_win_bottom_offset));
+ }
+
+ ue(seq_field(bit_depth_luma_minus8));
+ ue(seq_field(bit_depth_chroma_minus8));
+
+ ue(misc_var(log2_max_pic_order_cnt_lsb_minus4));
+
+ u(1, misc_var(sps_sub_layer_ordering_info_present_flag));
+ for(i = (misc->sps_sub_layer_ordering_info_present_flag ?
+ 0 : misc->sps_max_sub_layers_minus1);
+ i <= misc->sps_max_sub_layers_minus1; i++) {
+ ue(misc_var(sps_max_dec_pic_buffering_minus1[i]));
+ ue(misc_var(sps_max_num_reorder_pics[i]));
+ ue(misc_var(sps_max_latency_increase_plus1[i]));
+ }
+
+ ue(seq_var(log2_min_luma_coding_block_size_minus3));
+ ue(seq_var(log2_diff_max_min_luma_coding_block_size));
+ ue(seq_var(log2_min_transform_block_size_minus2));
+ ue(seq_var(log2_diff_max_min_transform_block_size));
+ ue(seq_var(max_transform_hierarchy_depth_inter));
+ ue(seq_var(max_transform_hierarchy_depth_intra));
+
+ u(1, seq_field(scaling_list_enabled_flag));
+ if(seq->seq_fields.bits.scaling_list_enabled_flag) {
+ u(1, 0, sps_scaling_list_data_present_flag);
+ }
+
+ u(1, seq_field(amp_enabled_flag));
+ u(1, seq_field(sample_adaptive_offset_enabled_flag));
+
+ u(1, seq_field(pcm_enabled_flag));
+ if(seq->seq_fields.bits.pcm_enabled_flag) {
+ u(4, seq_var(pcm_sample_bit_depth_luma_minus1));
+ u(4, seq_var(pcm_sample_bit_depth_chroma_minus1));
+ ue(seq_var(log2_min_pcm_luma_coding_block_size_minus3));
+ ue(seq->log2_max_pcm_luma_coding_block_size_minus3 -
+ seq->log2_min_pcm_luma_coding_block_size_minus3,
+ log2_diff_max_min_pcm_luma_coding_block_size);
+ u(1, seq_field(pcm_loop_filter_disabled_flag));
+ }
+
+ ue(misc_var(num_short_term_ref_pic_sets));
+ for(i = 0; i < misc->num_short_term_ref_pic_sets; i++)
+ vaapi_hevc_write_st_ref_pic_set(s, ctx, i);
+
+ u(1, misc_var(long_term_ref_pics_present_flag));
+ if(misc->long_term_ref_pics_present_flag) {
+ ue(0, num_long_term_ref_pics_sps);
+ }
+
+ u(1, seq_field(sps_temporal_mvp_enabled_flag));
+ u(1, seq_field(strong_intra_smoothing_enabled_flag));
+
+ u(1, misc_var(vui_parameters_present_flag));
+ if(misc->vui_parameters_present_flag) {
+ vaapi_hevc_write_vui_parameters(s, ctx);
+ }
+
+ u(1, 0, sps_extension_present_flag);
+
+ vaapi_hevc_write_rbsp_trailing_bits(s);
+}
+
+static void vaapi_hevc_write_pps(PutBitContext *s,
+ VAAPIHEVCEncodeContext *ctx)
+{
+ VAEncPictureParameterBufferHEVC *pic = &ctx->pic_params;
+ VAAPIHEVCEncodeMiscSequenceParams *misc = &ctx->misc_params;
+ int i;
+
+ vaapi_hevc_write_nal_unit_header(s, NAL_PPS);
+
+ ue(pic->slice_pic_parameter_set_id, pps_pic_parameter_set_id);
+ ue(misc->seq_parameter_set_id, pps_seq_parameter_set_id);
+
+ u(1, pic_field(dependent_slice_segments_enabled_flag));
+ u(1, misc_var(output_flag_present_flag));
+ u(3, misc_var(num_extra_slice_header_bits));
+ u(1, pic_field(sign_data_hiding_enabled_flag));
+ u(1, misc_var(cabac_init_present_flag));
+
+ ue(pic_var(num_ref_idx_l0_default_active_minus1));
+ ue(pic_var(num_ref_idx_l1_default_active_minus1));
+
+ se(pic->pic_init_qp - 26, init_qp_minus26);
+
+ u(1, pic_field(constrained_intra_pred_flag));
+ u(1, pic_field(transform_skip_enabled_flag));
+
+ u(1, pic_field(cu_qp_delta_enabled_flag));
+ if(pic->pic_fields.bits.cu_qp_delta_enabled_flag)
+ ue(pic_var(diff_cu_qp_delta_depth));
+
+ se(pic_var(pps_cb_qp_offset));
+ se(pic_var(pps_cr_qp_offset));
+
+ u(1, misc_var(pps_slice_chroma_qp_offsets_present_flag));
+ u(1, pic_field(weighted_pred_flag));
+ u(1, pic_field(weighted_bipred_flag));
+ u(1, pic_field(transquant_bypass_enabled_flag));
+ u(1, pic_field(tiles_enabled_flag));
+ u(1, pic_field(entropy_coding_sync_enabled_flag));
+
+ if(pic->pic_fields.bits.tiles_enabled_flag) {
+ ue(pic_var(num_tile_columns_minus1));
+ ue(pic_var(num_tile_rows_minus1));
+ u(1, misc_var(uniform_spacing_flag));
+ if(!misc->uniform_spacing_flag) {
+ for(i = 0; i < pic->num_tile_columns_minus1; i++)
+ ue(pic_var(column_width_minus1[i]));
+ for(i = 0; i < pic->num_tile_rows_minus1; i++)
+ ue(pic_var(row_height_minus1[i]));
+ }
+ u(1, pic_field(loop_filter_across_tiles_enabled_flag));
+ }
+
+ u(1, pic_field(pps_loop_filter_across_slices_enabled_flag));
+ u(1, misc_var(deblocking_filter_control_present_flag));
+ if(misc->deblocking_filter_control_present_flag) {
+ u(1, misc_var(deblocking_filter_override_enabled_flag));
+ u(1, misc_var(pps_deblocking_filter_disabled_flag));
+ if(!misc->pps_deblocking_filter_disabled_flag) {
+ se(misc_var(pps_beta_offset_div2));
+ se(misc_var(pps_tc_offset_div2));
+ }
+ }
+
+ u(1, 0, pps_scaling_list_data_present_flag);
+ // No scaling list data.
+
+ u(1, misc_var(lists_modification_present_flag));
+ ue(pic_var(log2_parallel_merge_level_minus2));
+ u(1, 0, slice_segment_header_extension_present_flag);
+ u(1, 0, pps_extension_present_flag);
+
+ vaapi_hevc_write_rbsp_trailing_bits(s);
+}
+
+static void vaapi_hevc_write_slice_header(PutBitContext *s,
+ VAAPIHEVCEncodeContext *ctx,
+ VAAPIHEVCEncodeFrame *current)
+{
+ VAEncSequenceParameterBufferHEVC *seq = &ctx->seq_params;
+ VAEncPictureParameterBufferHEVC *pic = &current->pic_params;
+ VAAPIHEVCEncodeMiscSequenceParams *misc = &ctx->misc_params;
+ VAEncSliceParameterBufferHEVC *slice = &current->slice_params;
+ VAAPIHEVCEncodeMiscPictureParams *miscs = &current->misc_params;
+ int i;
+
+ vaapi_hevc_write_nal_unit_header(s, pic->nal_unit_type);
+
+ u(1, miscs_var(first_slice_segment_in_pic_flag));
+ if(pic->nal_unit_type >= NAL_BLA_W_LP &&
+ pic->nal_unit_type <= 23)
+ u(1, miscs_var(no_output_of_prior_pics_flag));
+
+ ue(slice_var(slice_pic_parameter_set_id));
+
+ if(!miscs->first_slice_segment_in_pic_flag) {
+ if(pic->pic_fields.bits.dependent_slice_segments_enabled_flag)
+ u(1, slice_field(dependent_slice_segment_flag));
+ u(av_log2((ctx->ctu_width * ctx->ctu_height) - 1) + 1,
+ miscs_var(slice_segment_address));
+ }
+ if(!slice->slice_fields.bits.dependent_slice_segment_flag) {
+ for(i = 0; i < misc->num_extra_slice_header_bits; i++)
+ u(1, miscs_var(slice_reserved_flag[i]));
+
+ ue(slice_var(slice_type));
+ if(misc->output_flag_present_flag)
+ u(1, 1, pic_output_flag);
+ if(seq->seq_fields.bits.separate_colour_plane_flag)
+ u(2, slice_field(colour_plane_id));
+ if(pic->nal_unit_type != NAL_IDR_W_RADL &&
+ pic->nal_unit_type != NAL_IDR_N_LP) {
+ u(4 + misc->log2_max_pic_order_cnt_lsb_minus4,
+ current->poc & ((1 <<
(misc->log2_max_pic_order_cnt_lsb_minus4 + 4)) - 1),
+ slice_pic_order_cnt_lsb);
+
+ u(1, miscs_var(short_term_ref_pic_set_sps_flag));
+ if(!miscs->short_term_ref_pic_set_sps_flag) {
+ av_assert0(0);
+ //
vaapi_hevc_write_st_ref_pic_set(ctx->num_short_term_ref_pic_sets);
+ } else if(misc->num_short_term_ref_pic_sets > 1) {
+ u(av_log2(misc->num_short_term_ref_pic_sets - 1) + 1,
+ miscs_var(short_term_ref_pic_idx));
+ }
+
+ if(misc->long_term_ref_pics_present_flag) {
+ av_assert0(0);
+ }
+
+ if(seq->seq_fields.bits.sps_temporal_mvp_enabled_flag) {
+ u(1, slice_field(slice_temporal_mvp_enabled_flag));
+ }
+
+ if(seq->seq_fields.bits.sample_adaptive_offset_enabled_flag) {
+ u(1, slice_field(slice_sao_luma_flag));
+ if(!seq->seq_fields.bits.separate_colour_plane_flag &&
+ seq->seq_fields.bits.chroma_format_idc != 0) {
+ u(1, slice_field(slice_sao_chroma_flag));
+ }
+ }
+
+ if(slice->slice_type == P_SLICE || slice->slice_type ==
B_SLICE) {
+ u(1, slice_field(num_ref_idx_active_override_flag));
+
if(slice->slice_fields.bits.num_ref_idx_active_override_flag) {
+ ue(slice_var(num_ref_idx_l0_active_minus1));
+ if(slice->slice_type == B_SLICE) {
+ ue(slice_var(num_ref_idx_l1_active_minus1));
+ }
+ }
+
+ if(misc->lists_modification_present_flag) {
+ av_assert0(0);
+ // ref_pic_lists_modification()
+ }
+ if(slice->slice_type == B_SLICE) {
+ u(1, slice_field(mvd_l1_zero_flag));
+ }
+ if(misc->cabac_init_present_flag) {
+ u(1, slice_field(cabac_init_flag));
+ }
+
if(slice->slice_fields.bits.slice_temporal_mvp_enabled_flag) {
+ if(slice->slice_type == B_SLICE)
+ u(1, slice_field(collocated_from_l0_flag));
+ ue(pic->collocated_ref_pic_index, collocated_ref_idx);
+ }
+ if((pic->pic_fields.bits.weighted_pred_flag &&
+ slice->slice_type == P_SLICE) ||
+ (pic->pic_fields.bits.weighted_bipred_flag &&
+ slice->slice_type == B_SLICE)) {
+ ue(5 - slice->max_num_merge_cand,
five_minus_max_num_merge_cand);
+ }
+ }
+
+ se(slice_var(slice_qp_delta));
+ if(misc->pps_slice_chroma_qp_offsets_present_flag) {
+ se(slice_var(slice_cb_qp_offset));
+ se(slice_var(slice_cr_qp_offset));
+ }
+ if(misc->pps_slice_chroma_offset_list_enabled_flag) {
+ u(1, 0, cu_chroma_qp_offset_enabled_flag);
+ }
+ if(misc->deblocking_filter_override_enabled_flag) {
+ u(1, miscs_var(deblocking_filter_override_flag));
+ }
+ if(miscs->deblocking_filter_override_flag) {
+ u(1, slice_field(slice_deblocking_filter_disabled_flag));
+
if(!slice->slice_fields.bits.slice_deblocking_filter_disabled_flag) {
+ se(slice_var(slice_beta_offset_div2));
+ se(slice_var(slice_tc_offset_div2));
+ }
+ }
+
if(pic->pic_fields.bits.pps_loop_filter_across_slices_enabled_flag &&
+ (slice->slice_fields.bits.slice_sao_luma_flag ||
+ slice->slice_fields.bits.slice_sao_chroma_flag ||
+
slice->slice_fields.bits.slice_deblocking_filter_disabled_flag)) {
+ u(1,
slice_field(slice_loop_filter_across_slices_enabled_flag));
+ }
+ }
+
+ if(pic->pic_fields.bits.tiles_enabled_flag ||
+ pic->pic_fields.bits.entropy_coding_sync_enabled_flag) {
+ // num_entry_point_offsets
+ }
+
+ if(0) {
+ // slice_segment_header_extension_length
+ }
+ }
+
+ u(1, 1, alignment_bit_equal_to_one);
+ while(put_bits_count(s) & 7)
+ u(1, 0, alignment_bit_equal_to_zero);
+}
+
+static size_t vaapi_hevc_nal_unit_to_byte_stream(uint8_t *dst, uint8_t
*src, size_t len)
+{
+ size_t dp, sp;
+ int zero_run = 0;
+
+ // Start code.
+ dst[0] = dst[1] = dst[2] = 0;
+ dst[3] = 1;
+ dp = 4;
+
+ for(sp = 0; sp < len; sp++) {
+ if(zero_run < 2) {
+ if(src[sp] == 0)
+ ++zero_run;
+ else
+ zero_run = 0;
+ } else {
+ if((src[sp] & ~3) == 0) {
+ // emulation_prevention_three_byte
+ dst[dp++] = 3;
+ }
+ zero_run = src[sp] == 0;
+ }
+ dst[dp++] = src[sp];
+ }
+
+ return dp;
+}
+
+static int vaapi_hevc_render_packed_header(VAAPIHEVCEncodeContext *ctx,
int type,
+ char *data, size_t bit_len)
+{
+ VAStatus vas;
+ VABufferID id_list[2];
+ VAEncPackedHeaderParameterBuffer buffer = {
+ .type = type,
+ .bit_length = bit_len,
+ .has_emulation_bytes = 1,
+ };
+
+ vas = vaCreateBuffer(ctx->va_instance.display,
ctx->va_codec.context_id,
+ VAEncPackedHeaderParameterBufferType,
+ sizeof(&buffer), 1, &buffer, &id_list[0]);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create parameter buffer
for packed "
+ "header (type %d): %d (%s).\n", type, vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+
+ vas = vaCreateBuffer(ctx->va_instance.display,
ctx->va_codec.context_id,
+ VAEncPackedHeaderDataBufferType,
+ (bit_len + 7) / 8, 1, data, &id_list[1]);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create data buffer for
packed "
+ "header (type %d): %d (%s).\n", type, vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+
+ av_log(ctx, AV_LOG_DEBUG, "Packed header buffer (%d) is %#x/%#x "
+ "(%zu bits).\n", type, id_list[0], id_list[1], bit_len);
+
+ vas = vaRenderPicture(ctx->va_instance.display,
ctx->va_codec.context_id,
+ id_list, 2);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to render packed "
+ "header (type %d): %d (%s).\n", type, vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+
+ return 0;
+}
+
+static int vaapi_hevc_render_packed_vps_sps(VAAPIHEVCEncodeContext *ctx)
+{
+ PutBitContext pbc, *s = &pbc;
+ uint8_t tmp[256];
+ uint8_t buf[512];
+ size_t byte_len, nal_len;
+
+ init_put_bits(s, tmp, sizeof(tmp));
+ vaapi_hevc_write_vps(s, ctx);
+ nal_len = put_bits_count(s);
+ flush_put_bits(s);
+ byte_len = vaapi_hevc_nal_unit_to_byte_stream(buf, tmp, nal_len / 8);
+
+ init_put_bits(s, tmp, sizeof(tmp));
+ vaapi_hevc_write_sps(s, ctx);
+ nal_len = put_bits_count(s);
+ flush_put_bits(s);
+ byte_len += vaapi_hevc_nal_unit_to_byte_stream(buf + byte_len, tmp,
nal_len / 8);
+
+ return vaapi_hevc_render_packed_header(ctx, VAEncPackedHeaderSequence,
+ buf, byte_len * 8);
+}
+
+static int vaapi_hevc_render_packed_pps(VAAPIHEVCEncodeContext *ctx)
+{
+ PutBitContext pbc, *s = &pbc;
+ uint8_t tmp[256];
+ uint8_t buf[512];
+ size_t byte_len, nal_len;
+
+ init_put_bits(s, tmp, sizeof(tmp));
+ vaapi_hevc_write_pps(s, ctx);
+ nal_len = put_bits_count(s);
+ flush_put_bits(s);
+ byte_len = vaapi_hevc_nal_unit_to_byte_stream(buf, tmp, nal_len / 8);
+
+ return vaapi_hevc_render_packed_header(ctx, VAEncPackedHeaderPicture,
+ buf, byte_len * 8);
+}
+
+static int vaapi_hevc_render_packed_slice(VAAPIHEVCEncodeContext *ctx,
+ VAAPIHEVCEncodeFrame *current)
+{
+ PutBitContext pbc, *s = &pbc;
+ uint8_t tmp[256];
+ uint8_t buf[512];
+ size_t byte_len, nal_len;
+
+ init_put_bits(s, tmp, sizeof(tmp));
+ vaapi_hevc_write_slice_header(s, ctx, current);
+ nal_len = put_bits_count(s);
+ flush_put_bits(s);
+ byte_len = vaapi_hevc_nal_unit_to_byte_stream(buf, tmp, nal_len / 8);
+
+ return vaapi_hevc_render_packed_header(ctx, VAEncPackedHeaderSlice,
+ buf, byte_len * 8);
+}
+
+static int vaapi_hevc_render_sequence(VAAPIHEVCEncodeContext *ctx)
+{
+ VAStatus vas;
+ VAEncSequenceParameterBufferHEVC *seq = &ctx->seq_params;
+
+ vas = vaCreateBuffer(ctx->va_instance.display,
ctx->va_codec.context_id,
+ VAEncSequenceParameterBufferType,
+ sizeof(*seq), 1, seq, &ctx->seq_params_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create buffer for sequence "
+ "parameters: %d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+ av_log(ctx, AV_LOG_DEBUG, "Sequence parameter buffer is %#x.\n",
+ ctx->seq_params_id);
+
+ vas = vaRenderPicture(ctx->va_instance.display,
ctx->va_codec.context_id,
+ &ctx->seq_params_id, 1);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to send sequence parameters: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+
+ return 0;
+}
+
+static int vaapi_hevc_render_picture(VAAPIHEVCEncodeContext *ctx,
+ VAAPIHEVCEncodeFrame *current)
+{
+ VAStatus vas;
+ VAEncPictureParameterBufferHEVC *pic = &current->pic_params;
+
+ vas = vaCreateBuffer(ctx->va_instance.display,
ctx->va_codec.context_id,
+ VAEncPictureParameterBufferType,
+ sizeof(*pic), 1, pic, &ctx->pic_params_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create buffer for picture "
+ "parameters: %d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+ av_log(ctx, AV_LOG_DEBUG, "Picture parameter buffer is %#x.\n",
+ ctx->pic_params_id);
+
+ vas = vaRenderPicture(ctx->va_instance.display,
ctx->va_codec.context_id,
+ &ctx->pic_params_id, 1);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to send picture parameters: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+
+ return 0;
+}
+
+static int vaapi_hevc_render_slice(VAAPIHEVCEncodeContext *ctx,
+ VAAPIHEVCEncodeFrame *current)
+{
+ VAStatus vas;
+ VAEncSliceParameterBufferHEVC *slice = &current->slice_params;
+
+ vas = vaCreateBuffer(ctx->va_instance.display,
ctx->va_codec.context_id,
+ VAEncSliceParameterBufferType,
+ sizeof(*slice), 1, slice,
&current->slice_params_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create buffer for slice "
+ "parameters: %d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+ av_log(ctx, AV_LOG_DEBUG, "Slice buffer is %#x.\n",
current->slice_params_id);
+
+ vas = vaRenderPicture(ctx->va_instance.display,
ctx->va_codec.context_id,
+ &current->slice_params_id, 1);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to send slice parameters: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+
+ return 0;
+}
+
+static av_cold int vaapi_hevc_encode_init_stream(VAAPIHEVCEncodeContext
*ctx)
+{
+ VAEncSequenceParameterBufferHEVC *seq = &ctx->seq_params;
+ VAEncPictureParameterBufferHEVC *pic = &ctx->pic_params;
+ VAAPIHEVCEncodeMiscSequenceParams *misc = &ctx->misc_params;
+ int i;
+
+ memset(seq, 0, sizeof(*seq));
+ memset(pic, 0, sizeof(*pic));
+
+ {
+ // general_profile_space == 0.
+ seq->general_profile_idc = 1; // Main profile.
+ seq->general_tier_flag = 0;
+
+ seq->general_level_idc = ctx->level * 3;
+
+ seq->intra_period = 0;
+ seq->intra_idr_period = 0;
+ seq->ip_period = 0;
+
+ seq->pic_width_in_luma_samples = ctx->aligned_width;
+ seq->pic_height_in_luma_samples = ctx->aligned_height;
+
+ seq->seq_fields.bits.chroma_format_idc = 1; // 4:2:0.
+ seq->seq_fields.bits.separate_colour_plane_flag = 0;
+ seq->seq_fields.bits.bit_depth_luma_minus8 = 0; // 8-bit luma.
+ seq->seq_fields.bits.bit_depth_chroma_minus8 = 0; // 8-bit chroma.
+ // Other misc flags all zero.
+
+ // These have to come from the capabilities of the encoder. We
have
+ // no way to query it, so just hardcode ones which worked for me...
+ // CTB size from 8x8 to 32x32.
+ seq->log2_min_luma_coding_block_size_minus3 = 0;
+ seq->log2_diff_max_min_luma_coding_block_size = 2;
+ // Transform size from 4x4 to 32x32.
+ seq->log2_min_transform_block_size_minus2 = 0;
+ seq->log2_diff_max_min_transform_block_size = 3;
+ // Full transform hierarchy allowed (2-5).
+ seq->max_transform_hierarchy_depth_inter = 3;
+ seq->max_transform_hierarchy_depth_intra = 3;
+
+ seq->vui_parameters_present_flag = 0;
+ }
+
+ {
+ for(i = 0; i < FF_ARRAY_ELEMS(pic->reference_frames); i++) {
+ pic->reference_frames[i].picture_id = VA_INVALID_ID;
+ pic->reference_frames[i].flags = VA_PICTURE_HEVC_INVALID;
+ }
+
+ pic->collocated_ref_pic_index = 0xff;
+
+ pic->last_picture = 0;
+
+ pic->pic_init_qp = ctx->fixed_qp;
+
+ pic->diff_cu_qp_delta_depth = 0;
+ pic->pps_cb_qp_offset = 0;
+ pic->pps_cr_qp_offset = 0;
+
+ // tiles_enabled_flag == 0, so ignore
num_tile_(rows|columns)_minus1.
+
+ pic->log2_parallel_merge_level_minus2 = 0;
+
+ // No limit on size.
+ pic->ctu_max_bitsize_allowed = 0;
+
+ pic->num_ref_idx_l0_default_active_minus1 = 0;
+ pic->num_ref_idx_l1_default_active_minus1 = 0;
+
+ pic->slice_pic_parameter_set_id = 0;
+
+ pic->pic_fields.bits.screen_content_flag = 0;
+ pic->pic_fields.bits.enable_gpu_weighted_prediction = 0;
+
+ //pic->pic_fields.bits.cu_qp_delta_enabled_flag = 1;
+ }
+
+ {
+ misc->video_parameter_set_id = 5;
+ misc->seq_parameter_set_id = 5;
+
+ misc->vps_max_layers_minus1 = 0;
+ misc->vps_max_sub_layers_minus1 = 0;
+ misc->vps_temporal_id_nesting_flag = 1;
+ misc->sps_max_sub_layers_minus1 = 0;
+ misc->sps_temporal_id_nesting_flag = 1;
+
+ for(i = 0; i < 32; i++) {
+ misc->general_profile_compatibility_flag[i] =
+ (i == seq->general_profile_idc);
+ }
+
+ misc->general_progressive_source_flag = 1;
+ misc->general_interlaced_source_flag = 0;
+ misc->general_non_packed_constraint_flag = 0;
+ misc->general_frame_only_constraint_flag = 1;
+ misc->general_inbld_flag = 0;
+
+ misc->log2_max_pic_order_cnt_lsb_minus4 = 4;
+ misc->vps_sub_layer_ordering_info_present_flag = 0;
+ misc->vps_max_dec_pic_buffering_minus1[0] = 0;
+ misc->vps_max_num_reorder_pics[0] = 0;
+ misc->vps_max_latency_increase_plus1[0] = 0;
+ misc->sps_sub_layer_ordering_info_present_flag = 0;
+ misc->sps_max_dec_pic_buffering_minus1[0] = 0;
+ misc->sps_max_num_reorder_pics[0] = 0;
+ misc->sps_max_latency_increase_plus1[0] = 0;
+
+ misc->vps_timing_info_present_flag = 1;
+ misc->vps_num_units_in_tick = ctx->avctx->time_base.num;
+ misc->vps_time_scale = ctx->avctx->time_base.den;
+ misc->vps_poc_proportional_to_timing_flag = 1;
+ misc->vps_num_ticks_poc_diff_minus1 = 0;
+
+ if(ctx->input_width != ctx->aligned_width ||
+ ctx->input_height != ctx->aligned_height) {
+ misc->conformance_window_flag = 1;
+ misc->conf_win_left_offset = 0;
+ misc->conf_win_right_offset =
+ (ctx->aligned_width - ctx->input_width) / 2;
+ misc->conf_win_top_offset = 0;
+ misc->conf_win_bottom_offset =
+ (ctx->aligned_height - ctx->input_height) / 2;
+ } else {
+ misc->conformance_window_flag = 0;
+ }
+
+ misc->num_short_term_ref_pic_sets = 1;
+ misc->st_ref_pic_set[0].num_negative_pics = 1;
+ misc->st_ref_pic_set[0].num_positive_pics = 0;
+ misc->st_ref_pic_set[0].delta_poc_s0_minus1[0] = 0;
+ misc->st_ref_pic_set[0].used_by_curr_pic_s0_flag[0] = 1;
+
+ misc->vui_parameters_present_flag = 1;
+ if(ctx->avctx->sample_aspect_ratio.num != 0) {
+ misc->aspect_ratio_info_present_flag = 1;
+ if(ctx->avctx->sample_aspect_ratio.num ==
+ ctx->avctx->sample_aspect_ratio.den) {
+ misc->aspect_ratio_idc = 1;
+ } else {
+ misc->aspect_ratio_idc = 255; // Extended SAR.
+ misc->sar_width = ctx->avctx->sample_aspect_ratio.num;
+ misc->sar_height = ctx->avctx->sample_aspect_ratio.den;
+ }
+ }
+ if(1) {
+ // Should this be conditional on some of these being set?
+ misc->video_signal_type_present_flag = 1;
+ misc->video_format = 5; // Unspecified.
+ misc->video_full_range_flag = 0;
+ misc->colour_description_present_flag = 1;
+ misc->colour_primaries = ctx->avctx->color_primaries;
+ misc->transfer_characteristics = ctx->avctx->color_trc;
+ misc->matrix_coeffs = ctx->avctx->colorspace;
+ }
+ }
+
+ return 0;
+}
+
+static int vaapi_hevc_encode_init_picture(VAAPIHEVCEncodeContext *ctx,
+ VAAPIHEVCEncodeFrame *current)
+{
+ VAEncPictureParameterBufferHEVC *pic = &current->pic_params;
+ VAEncSliceParameterBufferHEVC *slice = &current->slice_params;
+ VAAPIHEVCEncodeMiscPictureParams *misc = &current->misc_params;
+ int idr = current->type == FRAME_TYPE_I;
+
+ memcpy(pic, &ctx->pic_params, sizeof(*pic));
+ memset(slice, 0, sizeof(*slice));
+ memset(misc, 0, sizeof(*misc));
+
+ {
+ memcpy(&pic->decoded_curr_pic, &current->pic,
sizeof(VAPictureHEVC));
+
+ if(current->type != FRAME_TYPE_I) {
+ memcpy(&pic->reference_frames[0],
+ &current->refa->pic, sizeof(VAPictureHEVC));
+ }
+ if(current->type == FRAME_TYPE_B) {
+ memcpy(&pic->reference_frames[1],
+ &current->refb->pic, sizeof(VAPictureHEVC));
+ }
+
+ pic->coded_buf = current->coded_data_id;
+
+ pic->nal_unit_type = (idr ? NAL_IDR_W_RADL : NAL_TRAIL_R);
+
+ pic->pic_fields.bits.idr_pic_flag = (idr ? 1 : 0);
+ pic->pic_fields.bits.coding_type = (idr ? 1 : 2);
+
+ pic->pic_fields.bits.reference_pic_flag = 1;
+ }
+
+ {
+ slice->slice_segment_address = 0;
+ slice->num_ctu_in_slice = ctx->ctu_width * ctx->ctu_height;
+
+ slice->slice_type = current->type;
+ slice->slice_pic_parameter_set_id = 0;
+
+ slice->num_ref_idx_l0_active_minus1 = 0;
+ slice->num_ref_idx_l1_active_minus1 = 0;
+ memcpy(slice->ref_pic_list0, pic->reference_frames,
sizeof(pic->reference_frames));
+ memcpy(slice->ref_pic_list1, pic->reference_frames,
sizeof(pic->reference_frames));
+
+ slice->max_num_merge_cand = 5;
+ slice->slice_qp_delta = 0;
+
+ slice->slice_fields.bits.last_slice_of_pic_flag = 1;
+ }
+
+ {
+ misc->first_slice_segment_in_pic_flag = 1;
+
+ misc->short_term_ref_pic_set_sps_flag = 1;
+ misc->short_term_ref_pic_idx = 0;
+ }
+
+ return 0;
+}
+
+static int vaapi_hevc_encode_picture(AVCodecContext *avctx, AVPacket *pkt,
+ const AVFrame *pic, int *got_packet)
+{
+ VAAPIHEVCEncodeContext *ctx = avctx->priv_data;
+ AVVAAPISurface *input, *recon;
+ VAAPIHEVCEncodeFrame *current;
+ AVFrame *input_image, *recon_image;
+ VACodedBufferSegment *buf_list, *buf;
+ VAStatus vas;
+ int err;
+
+ av_log(ctx, AV_LOG_DEBUG, "New frame: format %s, size %ux%u.\n",
+ av_get_pix_fmt_name(pic->format), pic->width, pic->height);
+
+ if(pic->format == AV_PIX_FMT_VAAPI) {
+ input_image = 0;
+ input = (AVVAAPISurface*)pic->buf[0]->data;
+
+ } else {
+ input_image = av_frame_alloc();
+
+ err = av_vaapi_get_input_surface(&ctx->va_codec, input_image);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to allocate surface to "
+ "copy input frame: %d (%s).\n", err, av_err2str(err));
+ return err;
+ }
+
+ input = (AVVAAPISurface*)input_image->buf[0]->data;
+
+ err = av_vaapi_map_surface(input, 0);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to map input surface: "
+ "%d (%s).\n", err, av_err2str(err));
+ return err;
+ }
+
+ err = av_vaapi_copy_to_surface(pic, input);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to copy to input surface: "
+ "%d (%s).\n", err, av_err2str(err));
+ return err;
+ }
+
+ err = av_vaapi_unmap_surface(input, 1);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to unmap input surface: "
+ "%d (%s).\n", err, av_err2str(err));
+ return err;
+ }
+ }
+ av_log(ctx, AV_LOG_DEBUG, "Using surface %#x for input image.\n",
+ input->id);
+
+ recon_image = av_frame_alloc();
+
+ err = av_vaapi_get_output_surface(&ctx->va_codec, recon_image);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to allocate surface for "
+ "reconstructed frame: %d (%s).\n", err, av_err2str(err));
+ return err;
+ }
+ recon = (AVVAAPISurface*)recon_image->buf[0]->data;
+ av_log(ctx, AV_LOG_DEBUG, "Using surface %#x for reconstructed
image.\n",
+ recon->id);
+
+ if(ctx->previous_frame != ctx->current_frame) {
+ av_frame_unref(&ctx->dpb[ctx->previous_frame].avframe);
+ }
+
+ ctx->previous_frame = ctx->current_frame;
+ ctx->current_frame = (ctx->current_frame + 1) % MAX_DPB_PICS;
+ {
+ current = &ctx->dpb[ctx->current_frame];
+
+ if(ctx->poc < 0 ||
+ ctx->poc == ctx->options.idr_interval)
+ current->type = FRAME_TYPE_I;
+ else
+ current->type = FRAME_TYPE_P;
+
+ if(current->type == FRAME_TYPE_I)
+ ctx->poc = 0;
+ else
+ ++ctx->poc;
+ current->poc = ctx->poc;
+
+ if(current->type == FRAME_TYPE_I) {
+ current->refa = 0;
+ current->refb = 0;
+ } else if(current->type == FRAME_TYPE_P) {
+ current->refa = &ctx->dpb[ctx->previous_frame];
+ current->refb = 0;
+ } else {
+ av_assert0(0);
+ }
+
+ memset(&current->pic, 0, sizeof(VAPictureHEVC));
+ current->pic.picture_id = recon->id;
+ current->pic.pic_order_cnt = current->poc;
+
+ memcpy(&current->avframe, recon_image, sizeof(AVFrame));
+ }
+ av_log(ctx, AV_LOG_DEBUG, "Encoding as frame as %s (%d).\n",
+ current->type == FRAME_TYPE_I ? "I" :
+ current->type == FRAME_TYPE_P ? "P" : "B", current->poc);
+
+ vas = vaBeginPicture(ctx->va_instance.display,
ctx->va_codec.context_id,
+ input->id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to attach new picture: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+
+ vaapi_hevc_encode_init_picture(ctx, current);
+
+ if(current->type == FRAME_TYPE_I) {
+ err = vaapi_hevc_render_sequence(ctx);
+ if(err) return err;
+ }
+
+ err = vaapi_hevc_render_picture(ctx, current);
+ if(err) return err;
+
+ if(current->type == FRAME_TYPE_I) {
+ err = vaapi_hevc_render_packed_vps_sps(ctx);
+ if(err) return err;
+
+ err = vaapi_hevc_render_packed_pps(ctx);
+ if(err) return err;
+ }
+
+ err = vaapi_hevc_render_packed_slice(ctx, current);
+ if(err) return err;
+
+ err = vaapi_hevc_render_slice(ctx, current);
+ if(err) return err;
+
+ vas = vaEndPicture(ctx->va_instance.display, ctx->va_codec.context_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to start picture processing: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+
+ vas = vaSyncSurface(ctx->va_instance.display, input->id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to sync to picture completion: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+
+ buf_list = 0;
+ vas = vaMapBuffer(ctx->va_instance.display, current->coded_data_id,
+ (void**)&buf_list);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to map output buffers: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+
+ for(buf = buf_list; buf; buf = buf->next) {
+ av_log(ctx, AV_LOG_DEBUG, "Output buffer: %u bytes.\n", buf->size);
+
+ err = av_new_packet(pkt, buf->size);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to make output buffer "
+ "(%u bytes).\n", buf->size);
+ return err;
+ }
+
+ memcpy(pkt->data, buf->buf, buf->size);
+
+ if(current->type == FRAME_TYPE_I)
+ pkt->flags |= AV_PKT_FLAG_KEY;
+
+ pkt->pts = pic->pts;
+
+ *got_packet = 1;
+ }
+
+ vas = vaUnmapBuffer(ctx->va_instance.display, current->coded_data_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to unmap output buffers: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+
+ if(pic->format != AV_PIX_FMT_VAAPI)
+ av_frame_free(&input_image);
+
+ return 0;
+}
+
+static VAConfigAttrib config_attributes[] = {
+ { .type = VAConfigAttribRTFormat,
+ .value = VA_RT_FORMAT_YUV420 },
+ { .type = VAConfigAttribRateControl,
+ .value = VA_RC_CQP },
+ { .type = VAConfigAttribEncPackedHeaders,
+ .value = 0 },
+};
+
+static av_cold int vaapi_hevc_encode_init(AVCodecContext *avctx)
+{
+ VAAPIHEVCEncodeContext *ctx = avctx->priv_data;
+ VAStatus vas;
+ int i, err;
+
+ ctx->avctx = avctx;
+
+ ctx->va_profile = VAProfileHEVCMain;
+ ctx->level = -1;
+ if(sscanf(ctx->options.level, "%d", &ctx->level) <= 0 ||
+ ctx->level < 0 || ctx->level > 63) {
+ av_log(ctx, AV_LOG_ERROR, "Invaid level '%s'.\n",
ctx->options.level);
+ return AVERROR(EINVAL);
+ }
+
+ if(ctx->options.qp >= 0) {
+ ctx->rc_mode = VA_RC_CQP;
+ } else {
+ // Default to fixed-QP 26.
+ ctx->rc_mode = VA_RC_CQP;
+ ctx->options.qp = 26;
+ }
+ av_log(ctx, AV_LOG_INFO, "Using constant-QP mode at %d.\n",
+ ctx->options.qp);
+
+ err = av_vaapi_instance_init(&ctx->va_instance, 0);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "No VAAPI instance.\n");
+ return err;
+ }
+
+ ctx->input_width = avctx->width;
+ ctx->input_height = avctx->height;
+
+ ctx->aligned_width = (ctx->input_width + 15) / 16 * 16;
+ ctx->aligned_height = (ctx->input_height + 15) / 16 * 16;
+ ctx->ctu_width = (ctx->aligned_width + 31) / 32;
+ ctx->ctu_height = (ctx->aligned_height + 31) / 32;
+
+ ctx->fixed_qp = ctx->options.qp;
+
+ ctx->poc = -1;
+
+ {
+ AVVAAPIPipelineConfig *config = &ctx->va_config;
+
+ config->profile = ctx->va_profile;
+ config->entrypoint = VAEntrypointEncSlice;
+
+ config->attribute_count = FF_ARRAY_ELEMS(config_attributes);
+ config->attributes = config_attributes;
+ }
+
+ {
+ AVVAAPISurfaceConfig *config = &ctx->output_config;
+
+ config->rt_format = VA_RT_FORMAT_YUV420;
+ config->av_format = AV_PIX_FMT_VAAPI;
+
+ config->image_format.fourcc = VA_FOURCC_NV12;
+ config->image_format.bits_per_pixel = 12;
+
+ config->count = MAX_DPB_PICS;
+ config->width = ctx->aligned_width;
+ config->height = ctx->aligned_height;
+
+ config->attribute_count = 0;
+ }
+
+ if(avctx->pix_fmt == AV_PIX_FMT_VAAPI) {
+ // Just use the input surfaces directly.
+ ctx->input_is_vaapi = 1;
+
+ } else {
+ AVVAAPISurfaceConfig *config = &ctx->input_config;
+
+ config->rt_format = VA_RT_FORMAT_YUV420;
+ config->rt_format = VA_RT_FORMAT_YUV420;
+ config->av_format = AV_PIX_FMT_VAAPI;
+
+ config->image_format.fourcc = VA_FOURCC_NV12;
+ config->image_format.bits_per_pixel = 12;
+
+ config->count = INPUT_PICS;
+ config->width = ctx->aligned_width;
+ config->height = ctx->aligned_height;
+
+ config->attribute_count = 0;
+
+ ctx->input_is_vaapi = 0;
+ }
+
+ err = av_vaapi_pipeline_init(&ctx->va_codec, &ctx->va_instance,
+ &ctx->va_config,
+ ctx->input_is_vaapi ? 0 :
&ctx->input_config,
+ &ctx->output_config);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create codec: %d (%s).\n",
+ err, av_err2str(err));
+ return err;
+ }
+
+ for(i = 0; i < MAX_DPB_PICS; i++) {
+ vas = vaCreateBuffer(ctx->va_instance.display,
+ ctx->va_codec.context_id,
+ VAEncCodedBufferType,
+ 1048576, 1, 0, &ctx->dpb[i].coded_data_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create buffer for "
+ "coded data: %d (%s).\n", vas, vaErrorStr(vas));
+ break;
+ }
+ av_log(ctx, AV_LOG_TRACE, "Coded data buffer %d is %#x.\n",
+ i, ctx->dpb[i].coded_data_id);
+ }
+
+ av_log(ctx, AV_LOG_INFO, "Started VAAPI H.265 encoder.\n");
+
+ vaapi_hevc_encode_init_stream(ctx);
+
+ return 0;
+}
+
+static av_cold int vaapi_hevc_encode_close(AVCodecContext *avctx)
+{
+ VAAPIHEVCEncodeContext *ctx = avctx->priv_data;
+ int err;
+
+ err = av_vaapi_pipeline_uninit(&ctx->va_codec);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to destroy codec: %d (%s).\n",
+ err, av_err2str(err));
+ }
+
+ err = av_vaapi_instance_uninit(&ctx->va_instance);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to uninitialised VAAPI "
+ "instance: %d (%s).\n",
+ err, av_err2str(err));
+ }
+
+ return 0;
+}
+
+#define OFFSET(member) offsetof(VAAPIHEVCEncodeContext, options.member)
+#define FLAGS (AV_OPT_FLAG_VIDEO_PARAM | AV_OPT_FLAG_ENCODING_PARAM)
+static const AVOption vaapi_hevc_options[] = {
+ { "level", "Set H.265 level",
+ OFFSET(level), AV_OPT_TYPE_STRING,
+ { .str = "52" }, 0, 0, FLAGS },
+ { "qp", "Use constant quantisation parameter",
+ OFFSET(qp), AV_OPT_TYPE_INT,
+ { .i64 = -1 }, -1, MAX_QP, FLAGS },
+ { "idr_interval", "Number of frames between IDR frames (0 = all
intra)",
+ OFFSET(idr_interval), AV_OPT_TYPE_INT,
+ { .i64 = -1 }, -1, INT_MAX, FLAGS },
+ { 0 }
+};
+
+static const AVClass vaapi_hevc_class = {
+ .class_name = "VAAPI/H.265",
+ .item_name = av_default_item_name,
+ .option = vaapi_hevc_options,
+ .version = LIBAVUTIL_VERSION_INT,
+};
+
+AVCodec ff_hevc_vaapi_encoder = {
+ .name = "vaapi_hevc",
+ .long_name = NULL_IF_CONFIG_SMALL("H.265 (VAAPI)"),
+ .type = AVMEDIA_TYPE_VIDEO,
+ .id = AV_CODEC_ID_HEVC,
+ .priv_data_size = sizeof(VAAPIHEVCEncodeContext),
+ .init = &vaapi_hevc_encode_init,
+ .encode2 = &vaapi_hevc_encode_picture,
+ .close = &vaapi_hevc_encode_close,
+ .priv_class = &vaapi_hevc_class,
+ .pix_fmts = (const enum AVPixelFormat[]) {
+ AV_PIX_FMT_VAAPI,
+ AV_PIX_FMT_NV12,
+ AV_PIX_FMT_NONE,
+ },
+};
--
2.6.4
Mark Thompson
2016-01-17 17:38:12 UTC
Permalink
From 7d485b706f287ef2f0a58f8e08b092bc30d4c79a Mon Sep 17 00:00:00 2001
From: Mark Thompson <***@jkqxz.net>
Date: Sun, 17 Jan 2016 15:48:54 +0000
Subject: [PATCH 5/5] libavfilter: add VAAPI surface converter

---
configure | 1 +
libavfilter/Makefile | 1 +
libavfilter/allfilters.c | 1 +
libavfilter/vf_vaapi_conv.c | 453
++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 456 insertions(+)
create mode 100644 libavfilter/vf_vaapi_conv.c

diff --git a/configure b/configure
index 9da8e8b..71c0bc0 100755
--- a/configure
+++ b/configure
@@ -2913,6 +2913,7 @@ stereo3d_filter_deps="gpl"
subtitles_filter_deps="avformat avcodec libass"
super2xsai_filter_deps="gpl"
tinterlace_filter_deps="gpl"
+vaapi_conv_filter_deps="vaapi"
vidstabdetect_filter_deps="libvidstab"
vidstabtransform_filter_deps="libvidstab"
pixfmts_super2xsai_test_deps="super2xsai_filter"
diff --git a/libavfilter/Makefile b/libavfilter/Makefile
index e3e3561..9a4ca12 100644
--- a/libavfilter/Makefile
+++ b/libavfilter/Makefile
@@ -246,6 +246,7 @@ OBJS-$(CONFIG_TRANSPOSE_FILTER) +=
vf_transpose.o
OBJS-$(CONFIG_TRIM_FILTER) += trim.o
OBJS-$(CONFIG_UNSHARP_FILTER) += vf_unsharp.o
OBJS-$(CONFIG_USPP_FILTER) += vf_uspp.o
+OBJS-$(CONFIG_VAAPI) += vf_vaapi_conv.o
OBJS-$(CONFIG_VECTORSCOPE_FILTER) += vf_vectorscope.o
OBJS-$(CONFIG_VFLIP_FILTER) += vf_vflip.o
OBJS-$(CONFIG_VIDSTABDETECT_FILTER) += vidstabutils.o
vf_vidstabdetect.o
diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
index 1faf393..cfbfdca 100644
--- a/libavfilter/allfilters.c
+++ b/libavfilter/allfilters.c
@@ -266,6 +266,7 @@ void avfilter_register_all(void)
REGISTER_FILTER(TRIM, trim, vf);
REGISTER_FILTER(UNSHARP, unsharp, vf);
REGISTER_FILTER(USPP, uspp, vf);
+ REGISTER_FILTER(VAAPI_CONV, vaapi_conv, vf);
REGISTER_FILTER(VECTORSCOPE, vectorscope, vf);
REGISTER_FILTER(VFLIP, vflip, vf);
REGISTER_FILTER(VIDSTABDETECT, vidstabdetect, vf);
diff --git a/libavfilter/vf_vaapi_conv.c b/libavfilter/vf_vaapi_conv.c
new file mode 100644
index 0000000..f17445d
--- /dev/null
+++ b/libavfilter/vf_vaapi_conv.c
@@ -0,0 +1,453 @@
+/*
+ * VAAPI converter (scaling and colour conversion).
+ *
+ * Copyright (C) 2016 Mark Thompson <***@jkqxz.net>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
02110-1301 USA
+ */
+
+#include "avfilter.h"
+#include "formats.h"
+#include "internal.h"
+
+#include "libavutil/avassert.h"
+#include "libavutil/opt.h"
+#include "libavutil/pixdesc.h"
+#include "libavutil/vaapi.h"
+
+typedef struct VAAPIConvContext {
+ const AVClass *class;
+
+ AVVAAPIInstance va_instance;
+ AVVAAPIPipelineConfig va_config;
+ AVVAAPIPipelineContext va_context;
+ int pipeline_initialised;
+
+ int input_is_vaapi;
+ AVVAAPISurfaceConfig input_config;
+ AVVAAPISurfaceConfig output_config;
+
+ int output_width;
+ int output_height;
+
+ struct {
+ int output_size[2];
+ } options;
+
+} VAAPIConvContext;
+
+
+static int vaapi_conv_query_formats(AVFilterContext *avctx)
+{
+ VAAPIConvContext *ctx = avctx->priv;
+ VAStatus vas;
+ VAConfigAttrib rt_format = {
+ .type = VAConfigAttribRTFormat
+ };
+ enum AVPixelFormat pix_fmt_list[16] = {
+ AV_PIX_FMT_VAAPI,
+ };
+ int pix_fmt_count = 1, err;
+
+#if 0
+ // The Intel driver doesn't return anything useful here - it only
+ // declares support for YUV 4:2:0 formats, despite working perfectly
+ // with 32-bit RGB ones. Given another usable platform, this will
+ // need to be updated.
+ vas = vaGetConfigAttributes(ctx->va_instance.display,
+ VAProfileNone, VAEntrypointVideoProc,
+ &rt_format, 1);
+#else
+ vas = VA_STATUS_SUCCESS;
+ rt_format.value = VA_RT_FORMAT_YUV420 | VA_RT_FORMAT_RGB32;
+#endif
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to get config attributes: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ } else {
+ if(rt_format.value & VA_RT_FORMAT_YUV420) {
+ av_log(ctx, AV_LOG_DEBUG, "YUV420 formats supported.\n");
+ pix_fmt_list[pix_fmt_count++] = AV_PIX_FMT_YUV420P;
+ pix_fmt_list[pix_fmt_count++] = AV_PIX_FMT_NV12;
+ }
+ if(rt_format.value & VA_RT_FORMAT_YUV422) {
+ av_log(ctx, AV_LOG_DEBUG, "YUV422 formats supported.\n");
+ pix_fmt_list[pix_fmt_count++] = AV_PIX_FMT_YUV422P;
+ pix_fmt_list[pix_fmt_count++] = AV_PIX_FMT_YUYV422;
+ }
+ if(rt_format.value & VA_RT_FORMAT_YUV444) {
+ av_log(ctx, AV_LOG_DEBUG, "YUV444 formats supported.\n");
+ pix_fmt_list[pix_fmt_count++] = AV_PIX_FMT_YUV444P;
+ }
+ if(rt_format.value & VA_RT_FORMAT_YUV400) {
+ av_log(ctx, AV_LOG_DEBUG, "Grayscale formats supported.\n");
+ pix_fmt_list[pix_fmt_count++] = AV_PIX_FMT_GRAY8;
+ }
+ if(rt_format.value & VA_RT_FORMAT_RGB32) {
+ av_log(ctx, AV_LOG_DEBUG, "RGB32 formats supported.\n");
+ pix_fmt_list[pix_fmt_count++] = AV_PIX_FMT_RGBA;
+ pix_fmt_list[pix_fmt_count++] = AV_PIX_FMT_BGRA;
+ pix_fmt_list[pix_fmt_count++] = AV_PIX_FMT_RGB0;
+ pix_fmt_list[pix_fmt_count++] = AV_PIX_FMT_BGR0;
+ }
+ }
+
+ pix_fmt_list[pix_fmt_count] = AV_PIX_FMT_NONE;
+
+ if(avctx->inputs[0]) {
+ err = ff_formats_ref(ff_make_format_list(pix_fmt_list),
+ &avctx->inputs[0]->out_formats);
+ if(err < 0)
+ return err;
+ }
+
+ if(avctx->outputs[0]) {
+ // Truncate the list: no support for normal output yet.
+ pix_fmt_list[1] = AV_PIX_FMT_NONE;
+
+ err = ff_formats_ref(ff_make_format_list(pix_fmt_list),
+ &avctx->outputs[0]->in_formats);
+ if(err < 0)
+ return err;
+ }
+
+ return 0;
+}
+
+static int vaapi_conv_config_pipeline(VAAPIConvContext *ctx)
+{
+ AVVAAPIPipelineConfig *config = &ctx->va_config;
+ int err;
+
+ config->profile = VAProfileNone;
+ config->entrypoint = VAEntrypointVideoProc;
+
+ config->attribute_count = 0;
+
+ err = av_vaapi_pipeline_init(&ctx->va_context, &ctx->va_instance,
+ &ctx->va_config, &ctx->input_config,
+ &ctx->output_config);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create video processing "
+ "pipeline: " "%d (%s).\n", err, av_err2str(err));
+ return err;
+ }
+
+ return 0;
+}
+
+static int vaapi_conv_config_input(AVFilterLink *inlink)
+{
+ AVFilterContext *avctx = inlink->dst;
+ VAAPIConvContext *ctx = avctx->priv;
+ AVVAAPISurfaceConfig *config = &ctx->input_config;
+
+ if(inlink->format == AV_PIX_FMT_VAAPI) {
+ av_log(ctx, AV_LOG_INFO, "Input is VAAPI (using incoming
surfaces).\n");
+ ctx->input_is_vaapi = 1;
+ return 0;
+ }
+ ctx->input_is_vaapi = 0;
+
+ config->rt_format = VA_RT_FORMAT_YUV420;
+ config->av_format = AV_PIX_FMT_VAAPI;
+
+ switch(inlink->format) {
+ case AV_PIX_FMT_BGR0:
+ case AV_PIX_FMT_BGRA:
+ config->image_format.fourcc = VA_FOURCC_BGRX;
+ config->image_format.byte_order = VA_LSB_FIRST;
+ config->image_format.bits_per_pixel = 32;
+ config->image_format.depth = 8;
+ config->image_format.red_mask = 0x00ff0000;
+ config->image_format.green_mask = 0x0000ff00;
+ config->image_format.blue_mask = 0x000000ff;
+ config->image_format.alpha_mask = 0x00000000;
+ break;
+
+ case AV_PIX_FMT_RGB0:
+ case AV_PIX_FMT_RGBA:
+ config->image_format.fourcc = VA_FOURCC_RGBX;
+ config->image_format.byte_order = VA_LSB_FIRST;
+ config->image_format.bits_per_pixel = 32;
+ config->image_format.depth = 8;
+ config->image_format.red_mask = 0x000000ff;
+ config->image_format.green_mask = 0x0000ff00;
+ config->image_format.blue_mask = 0x00ff0000;
+ config->image_format.alpha_mask = 0x00000000;
+ break;
+
+ case AV_PIX_FMT_NV12:
+ config->image_format.fourcc = VA_FOURCC_NV12;
+ config->image_format.bits_per_pixel = 12;
+ break;
+ case AV_PIX_FMT_YUV420P:
+ config->image_format.fourcc = VA_FOURCC_YV12;
+ config->image_format.bits_per_pixel = 12;
+ break;
+
+ default:
+ av_log(ctx, AV_LOG_ERROR, "Tried to configure with invalid input "
+ "format %s.\n", av_get_pix_fmt_name(inlink->format));
+ return AVERROR(EINVAL);
+ }
+
+ config->count = 4;
+ config->width = inlink->w;
+ config->height = inlink->h;
+
+ config->attribute_count = 0;
+
+ if(ctx->output_width == 0)
+ ctx->output_width = inlink->w;
+ if(ctx->output_height == 0)
+ ctx->output_height = inlink->h;
+
+ return 0;
+}
+
+static int vaapi_conv_config_output(AVFilterLink *outlink)
+{
+ AVFilterContext *avctx = outlink->src;
+ VAAPIConvContext *ctx = avctx->priv;
+ AVVAAPISurfaceConfig *config = &ctx->output_config;
+
+ av_assert0(outlink->format == AV_PIX_FMT_VAAPI);
+ outlink->w = ctx->output_width;
+ outlink->h = ctx->output_height;
+
+ config->rt_format = VA_RT_FORMAT_YUV420;
+ config->av_format = AV_PIX_FMT_VAAPI;
+
+ config->image_format.fourcc = VA_FOURCC_NV12;
+ config->image_format.bits_per_pixel = 12;
+
+ config->count = 4;
+ config->width = outlink->w;
+ config->height = outlink->h;
+
+ config->attribute_count = 0;
+
+ return vaapi_conv_config_pipeline(ctx);
+}
+
+static int vaapi_conv_filter_frame(AVFilterLink *inlink, AVFrame *pic)
+{
+ AVFilterContext *avctx = inlink->dst;
+ AVFilterLink *outlink = avctx->outputs[0];
+ VAAPIConvContext *ctx = avctx->priv;
+ AVVAAPISurface *input, *output;
+ AVFrame *input_image, *output_image;
+ VAProcPipelineParameterBuffer params;
+ VABufferID params_id;
+ VAStatus vas;
+ int err;
+
+ av_log(ctx, AV_LOG_DEBUG, "Filter frame: %s, %ux%u.\n",
+ av_get_pix_fmt_name(pic->format), pic->width, pic->height);
+
+ if(pic->data[3]) {
+ input_image = pic;
+ input = (AVVAAPISurface*)pic->buf[0]->data;
+
+ } else {
+ input_image = av_frame_alloc();
+
+ err = av_vaapi_get_input_surface(&ctx->va_context, input_image);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to allocate surface to "
+ "copy input frame: %d (%s).\n", err, av_err2str(err));
+ return err;
+ }
+
+ input = (AVVAAPISurface*)input_image->buf[0]->data;
+
+ err = av_vaapi_map_surface(input, 0);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to map input surface: "
+ "%d (%s).\n", err, av_err2str(err));
+ return err;
+ }
+
+ err = av_vaapi_copy_to_surface(pic, input);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to copy to input surface: "
+ "%d (%s).\n", err, av_err2str(err));
+ return err;
+ }
+
+ err = av_vaapi_unmap_surface(input, 1);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to unmap input surface: "
+ "%d (%s).\n", err, av_err2str(err));
+ return err;
+ }
+ }
+ av_log(ctx, AV_LOG_DEBUG, "Using surface %#x for input image.\n",
+ input->id);
+
+ output_image = av_frame_alloc();
+ if(!output_image)
+ return AVERROR(ENOMEM);
+ av_frame_copy_props(output_image, pic);
+
+ err = av_vaapi_get_output_surface(&ctx->va_context, output_image);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to allocate surface for "
+ "output frame: %d (%s).\n", err, av_err2str(err));
+ return err;
+ }
+ output = (AVVAAPISurface*)output_image->buf[0]->data;
+ av_log(ctx, AV_LOG_DEBUG, "Using surface %#x for output image.\n",
+ output->id);
+
+ memset(&params, 0, sizeof(params));
+
+ params.surface = input->id;
+ params.surface_region = 0;
+ params.surface_color_standard = VAProcColorStandardNone;
+
+ params.output_region = 0;
+ params.output_background_color = 0xff000000;
+ params.output_color_standard = VAProcColorStandardNone;
+
+ params.pipeline_flags = 0;
+ params.filter_flags = VA_FILTER_SCALING_HQ;
+
+ vas = vaBeginPicture(ctx->va_instance.display,
ctx->va_context.context_id,
+ output->id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to attach new picture: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+
+ vas = vaCreateBuffer(ctx->va_instance.display,
ctx->va_context.context_id,
+ VAProcPipelineParameterBufferType,
+ sizeof(params), 1, &params, &params_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create parameter buffer: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+ av_log(ctx, AV_LOG_DEBUG, "Pipeline parameter buffer is %#x.\n",
+ params_id);
+
+ vas = vaRenderPicture(ctx->va_instance.display,
ctx->va_context.context_id,
+ &params_id, 1);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to render parameter buffer: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+
+ vas = vaEndPicture(ctx->va_instance.display,
ctx->va_context.context_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to start picture processing: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+
+ vas = vaSyncSurface(ctx->va_instance.display, output->id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to sync picture completion: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+
+ av_frame_free(&input_image);
+ if(pic->format != AV_PIX_FMT_VAAPI)
+ av_frame_free(&pic);
+
+ return ff_filter_frame(outlink, output_image);
+}
+
+static av_cold int vaapi_conv_init(AVFilterContext *avctx)
+{
+ VAAPIConvContext *ctx = avctx->priv;
+ int err;
+
+ err = av_vaapi_instance_init(&ctx->va_instance, 0);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "No VAAPI instance.\n");
+ return err;
+ }
+
+ ctx->output_width = ctx->options.output_size[0];
+ ctx->output_height = ctx->options.output_size[1];
+
+ return 0;
+}
+
+static av_cold void vaapi_conv_uninit(AVFilterContext *avctx)
+{
+ VAAPIConvContext *ctx = avctx->priv;
+
+ if(ctx->pipeline_initialised) {
+ av_vaapi_pipeline_uninit(&ctx->va_context);
+ }
+
+ av_vaapi_instance_uninit(&ctx->va_instance);
+}
+
+
+#define OFFSET(member) offsetof(VAAPIConvContext, options.member)
+#define FLAGS (AV_OPT_FLAG_VIDEO_PARAM | AV_OPT_FLAG_FILTERING_PARAM)
+static const AVOption vaapi_conv_options[] = {
+ { "size", "Set output size",
+ OFFSET(output_size), AV_OPT_TYPE_IMAGE_SIZE,
+ { 0 }, 0, 0, FLAGS },
+ { 0 },
+};
+
+static const AVClass vaapi_conv_class = {
+ .class_name = "VAAPI/conv",
+ .item_name = av_default_item_name,
+ .option = vaapi_conv_options,
+ .version = LIBAVUTIL_VERSION_INT,
+};
+
+static const AVFilterPad vaapi_conv_inputs[] = {
+ {
+ .name = "default",
+ .type = AVMEDIA_TYPE_VIDEO,
+ .filter_frame = &vaapi_conv_filter_frame,
+ .config_props = &vaapi_conv_config_input,
+ },
+ { 0 }
+};
+
+static const AVFilterPad vaapi_conv_outputs[] = {
+ {
+ .name = "default",
+ .type = AVMEDIA_TYPE_VIDEO,
+ .config_props = &vaapi_conv_config_output,
+ },
+ { 0 }
+};
+
+AVFilter ff_vf_vaapi_conv = {
+ .name = "vaapi_conv",
+ .description = NULL_IF_CONFIG_SMALL("Convert to/from VAAPI
surfaces."),
+ .priv_size = sizeof(VAAPIConvContext),
+ .init = &vaapi_conv_init,
+ .uninit = &vaapi_conv_uninit,
+ .query_formats = &vaapi_conv_query_formats,
+ .inputs = vaapi_conv_inputs,
+ .outputs = vaapi_conv_outputs,
+ .priv_class = &vaapi_conv_class,
+};
--
2.6.4
Mark Thompson
2016-01-17 22:43:11 UTC
Permalink
Hi,

Here is a new version of this patchset. See previous email <http://ffmpeg.org/pipermail/ffmpeg-devel/2016-January/187305.html> for the full summary.

This fixes the main thread-safety complaint, that libva initialisation and operations weren't mutually excluded. It guards the connection initialisation globally, and then individual libva calls on a per-connection basis.

It does not yet add this locking into the existing decoder (actually this would be very easy, it only needs to go in ff_vaapi_mpeg_end_frame() in libavcodec/vaapi.c) because there is no decision as to how to communicate the necessary instance information to the decoder (see other thread).

Thanks,

- Mark


(Line wrapping should also be fixed, hopefully.)
Mark Thompson
2016-01-17 22:45:37 UTC
Permalink
From 45a803b627d0180c1aac928756924bd39ddf529d Mon Sep 17 00:00:00 2001
From: Mark Thompson <***@jkqxz.net>
Date: Sun, 17 Jan 2016 22:13:20 +0000
Subject: [PATCH 1/5] libavutil: some VAAPI infrastructure

---
configure | 4 +
libavutil/Makefile | 1 +
libavutil/vaapi.c | 782 +++++++++++++++++++++++++++++++++++++++++++++++++++++
libavutil/vaapi.h | 119 ++++++++
4 files changed, 906 insertions(+)
create mode 100644 libavutil/vaapi.c
create mode 100644 libavutil/vaapi.h

diff --git a/configure b/configure
index 7cef6f5..1c77015 100755
--- a/configure
+++ b/configure
@@ -5739,6 +5739,10 @@ enabled vaapi && enabled xlib &&
check_lib2 "va/va.h va/va_x11.h" vaGetDisplay -lva -lva-x11 &&
enable vaapi_x11

+enabled vaapi &&
+ check_lib2 "va/va.h va/va_drm.h" vaGetDisplayDRM -lva -lva-drm &&
+ enable vaapi_drm
+
enabled vdpau &&
check_cpp_condition vdpau/vdpau.h "defined VDP_DECODER_PROFILE_MPEG4_PART2_ASP" ||
disable vdpau
diff --git a/libavutil/Makefile b/libavutil/Makefile
index bf8c713..8025f9f 100644
--- a/libavutil/Makefile
+++ b/libavutil/Makefile
@@ -146,6 +146,7 @@ OBJS-$(!HAVE_ATOMICS_NATIVE) += atomic.o \

OBJS-$(CONFIG_LZO) += lzo.o
OBJS-$(CONFIG_OPENCL) += opencl.o opencl_internal.o
+OBJS-$(CONFIG_VAAPI) += vaapi.o

OBJS += $(COMPAT_OBJS:%=../compat/%)

diff --git a/libavutil/vaapi.c b/libavutil/vaapi.c
new file mode 100644
index 0000000..20bae4c
--- /dev/null
+++ b/libavutil/vaapi.c
@@ -0,0 +1,782 @@
+/*
+ * VAAPI helper functions.
+ *
+ * Copyright (C) 2016 Mark Thompson <***@jkqxz.net>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <string.h>
+
+#include <unistd.h>
+#include <fcntl.h>
+
+#include "vaapi.h"
+
+#include <va/va_x11.h>
+#include <va/va_drm.h>
+
+#include "avassert.h"
+#include "imgutils.h"
+#include "pixfmt.h"
+#include "thread.h"
+
+
+static const AVClass vaapi_connection_class = {
+ .class_name = "VAAPI/connection",
+ .item_name = av_default_item_name,
+ .version = LIBAVUTIL_VERSION_INT,
+};
+
+static const AVClass vaapi_pipeline_class = {
+ .class_name = "VAAPI/pipeline",
+ .item_name = av_default_item_name,
+ .version = LIBAVUTIL_VERSION_INT,
+};
+
+typedef struct AVVAAPIConnection {
+ const AVClass *class;
+
+ AVMutex lock;
+ char *device_string;
+ int refcount;
+ struct AVVAAPIConnection *next;
+
+ VADisplay display;
+ int initialised;
+ int version_major, version_minor;
+
+ enum {
+ AV_VAAPI_CONNECTION_NONE = 0,
+ AV_VAAPI_CONNECTION_DRM,
+ AV_VAAPI_CONNECTION_X11,
+ /* ?
+ AV_VAAPI_CONNECTION_GLX,
+ AV_VAAPI_CONNECTION_WAYLAND,
+ */
+ } connection_type;
+ union {
+ void *x11_display;
+ int drm_fd;
+ };
+} AVVAAPIConnection;
+
+void av_vaapi_instance_lock(AVVAAPIInstance *instance)
+{
+ AVVAAPIConnection *ctx = instance->connection;
+
+ ff_mutex_lock(&ctx->lock);
+}
+
+void av_vaapi_instance_unlock(AVVAAPIInstance *instance)
+{
+ AVVAAPIConnection *ctx = instance->connection;
+
+ ff_mutex_unlock(&ctx->lock);
+}
+
+static int vaapi_connection_uninit(AVVAAPIConnection *ctx)
+{
+ if(ctx->initialised) {
+ vaTerminate(ctx->display);
+ ctx->display = 0;
+ ctx->initialised = 0;
+ ff_mutex_destroy(&ctx->lock);
+ }
+
+ switch(ctx->connection_type) {
+
+ case AV_VAAPI_CONNECTION_DRM:
+ if(ctx->drm_fd >= 0) {
+ close(ctx->drm_fd);
+ ctx->drm_fd = -1;
+ }
+ break;
+
+ case AV_VAAPI_CONNECTION_X11:
+ if(ctx->x11_display) {
+ XCloseDisplay(ctx->x11_display);
+ ctx->x11_display = 0;
+ }
+ break;
+
+ }
+
+ return 0;
+}
+
+static int vaapi_connection_init(AVVAAPIConnection *ctx, const char *device)
+{
+ VAStatus vas;
+ int err;
+
+ ctx->class = &vaapi_connection_class;
+ if(device)
+ ctx->device_string = av_strdup(device);
+
+ // If the device name is not provided at all, assume we are in X and can
+ // connect to the display in DISPLAY. If we do get a device name and it
+ // begins with a type indicator, use that. Otherwise, try to guess the
+ // answer from the content of the name.
+ if(!device) {
+ ctx->connection_type = AV_VAAPI_CONNECTION_X11;
+ } else if(!strncmp(device, "drm:", 4)) {
+ ctx->connection_type = AV_VAAPI_CONNECTION_DRM;
+ device += 4;
+ } else if(!strncmp(device, "x11:", 4)) {
+ ctx->connection_type = AV_VAAPI_CONNECTION_X11;
+ device += 4;
+ } else {
+ if(strchr(device, '/')) {
+ ctx->connection_type = AV_VAAPI_CONNECTION_DRM;
+ } else if(strchr(device, ':')) {
+ ctx->connection_type = AV_VAAPI_CONNECTION_X11;
+ } else {
+ // No idea, just give up.
+ return AVERROR(EINVAL);
+ }
+ }
+
+ switch(ctx->connection_type) {
+
+ case AV_VAAPI_CONNECTION_DRM:
+ ctx->drm_fd = open(device, O_RDWR);
+ if(ctx->drm_fd < 0) {
+ av_log(ctx, AV_LOG_ERROR, "Cannot open DRM device %s.\n",
+ device);
+ err = AVERROR(errno);
+ goto fail;
+ }
+ ctx->display = vaGetDisplayDRM(ctx->drm_fd);
+ if(!ctx->display) {
+ av_log(ctx, AV_LOG_ERROR, "Cannot open the VA display (from DRM "
+ "device %s).\n", device);
+ err = AVERROR(EINVAL);
+ goto fail;
+ }
+ break;
+
+ case AV_VAAPI_CONNECTION_X11:
+ ctx->x11_display = XOpenDisplay(device); // device might be NULL.
+ if(!ctx->x11_display) {
+ av_log(ctx, AV_LOG_ERROR, "Cannot open X11 display %s.\n",
+ XDisplayName(device));
+ err = AVERROR(ENOENT);
+ goto fail;
+ }
+ ctx->display = vaGetDisplay(ctx->x11_display);
+ if(!ctx->display) {
+ av_log(ctx, AV_LOG_ERROR, "Cannot open the VA display (from X11 "
+ "display %s).\n", XDisplayName(device));
+ err = AVERROR(EINVAL);
+ goto fail;
+ }
+ break;
+
+ default:
+ av_assert0(0);
+ }
+
+ ff_mutex_init(&ctx->lock, 0);
+
+ vas = vaInitialize(ctx->display,
+ &ctx->version_major, &ctx->version_minor);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to initialise VAAPI: %d (%s).\n",
+ vas, vaErrorStr(vas));
+ err = AVERROR(EINVAL);
+ goto fail;
+ }
+ ctx->initialised = 1;
+
+ av_log(ctx, AV_LOG_INFO, "Initialised VAAPI connection: version %d.%d\n",
+ ctx->version_major, ctx->version_minor);
+
+ return 0;
+
+ fail:
+ vaapi_connection_uninit(ctx);
+ return err;
+}
+
+static AVVAAPIConnection *vaapi_connection_list;
+static AVMutex vaapi_global_lock;
+static AVOnce vaapi_global_init_control = AV_ONCE_INIT;
+
+static void vaapi_global_init(void)
+{
+ vaapi_connection_list = 0;
+ ff_mutex_init(&vaapi_global_lock, 0);
+}
+
+int av_vaapi_instance_init(AVVAAPIInstance *instance, const char *device)
+{
+ AVVAAPIConnection *ctx;
+ int err;
+
+ ff_thread_once(&vaapi_global_init_control, &vaapi_global_init);
+
+ ff_mutex_lock(&vaapi_global_lock);
+
+ for(ctx = vaapi_connection_list; ctx; ctx = ctx->next) {
+ if((device == 0 && ctx->device_string == 0) ||
+ (device && ctx->device_string &&
+ !strcmp(device, ctx->device_string)))
+ break;
+ }
+
+ if(ctx) {
+ av_log(ctx, AV_LOG_INFO, "New VAAPI instance connected to existing "
+ "instance (%s).\n", device ? device : "default");
+ ++ctx->refcount;
+ instance->connection = ctx;
+ instance->display = ctx->display;
+ err = 0;
+ goto done;
+ }
+
+ ctx = av_mallocz(sizeof(AVVAAPIConnection));
+ if(!ctx) {
+ err = AVERROR(ENOMEM);
+ goto done;
+ }
+
+ err = vaapi_connection_init(ctx, device);
+ if(err)
+ goto done;
+
+ ctx->refcount = 1;
+
+ instance->display = ctx->display;
+ instance->connection = ctx;
+
+ ctx->next = vaapi_connection_list;
+ vaapi_connection_list = ctx;
+
+ av_log(ctx, AV_LOG_INFO, "New VAAPI instance (%s).\n",
+ device ? device : "default");
+
+ err = 0;
+ done:
+ ff_mutex_unlock(&vaapi_global_lock);
+ return err;
+}
+
+int av_vaapi_instance_uninit(AVVAAPIInstance *instance)
+{
+ AVVAAPIConnection *ctx = instance->connection;
+ int err;
+
+ ff_mutex_lock(&vaapi_global_lock);
+
+ if(!ctx) {
+ err = AVERROR(EINVAL);
+ goto done;
+ }
+
+ if(ctx->refcount <= 0) {
+ av_log(ctx, AV_LOG_ERROR, "Tried to uninit VAAPI connection with "
+ "refcount = %d < 0.\n", ctx->refcount);
+ err = AVERROR(EINVAL);
+ goto done;
+ }
+
+ --ctx->refcount;
+
+ if(ctx->refcount == 0) {
+ AVVAAPIConnection *iter, *prev;
+ prev = 0;
+ for(iter = vaapi_connection_list; iter;
+ prev = iter, iter = iter->next) {
+ if(iter == ctx) {
+ if(prev)
+ prev->next = ctx->next;
+ else
+ vaapi_connection_list = ctx->next;
+ break;
+ }
+ }
+ if(!iter) {
+ av_log(ctx, AV_LOG_WARNING, "Tried to uninit VAAPI connection "
+ "not in connection list?\n");
+ // Not fatal.
+ }
+
+ vaapi_connection_uninit(ctx);
+ av_free(ctx);
+ memset(instance, 0, sizeof(*instance));
+ }
+
+ err = 0;
+ done:
+ ff_mutex_unlock(&vaapi_global_lock);
+ return err;
+}
+
+
+static int vaapi_create_surfaces(AVVAAPIInstance *instance,
+ AVVAAPISurfaceConfig *config,
+ AVVAAPISurface *surfaces,
+ VASurfaceID *ids)
+{
+ VAStatus vas;
+ int i;
+
+ vas = vaCreateSurfaces(instance->display, config->rt_format,
+ config->width, config->height, ids, config->count,
+ config->attributes, config->attribute_count);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance->connection, AV_LOG_ERROR, "Failed to create "
+ "surfaces: %d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR(EINVAL);
+ }
+
+ for(i = 0; i < config->count; i++) {
+ surfaces[i].id = ids[i];
+ surfaces[i].refcount = 0;
+ surfaces[i].instance = instance;
+ surfaces[i].config = config;
+ av_log(instance->connection, AV_LOG_TRACE, "Created VA surface "
+ "%d: %#x.\n", i, surfaces[i].id);
+ }
+
+ return 0;
+}
+
+static void vaapi_destroy_surfaces(AVVAAPIInstance *instance,
+ AVVAAPISurfaceConfig *config,
+ AVVAAPISurface *surfaces,
+ VASurfaceID *ids)
+{
+ VAStatus vas;
+ int i;
+
+ for(i = 0; i < config->count; i++) {
+ av_assert0(surfaces[i].id == ids[i]);
+ if(surfaces[i].refcount > 0)
+ av_log(instance->connection, AV_LOG_WARNING, "Destroying "
+ "surface %#x which is still in use.\n", surfaces[i].id);
+ av_assert0(surfaces[i].instance == instance);
+ av_assert0(surfaces[i].config == config);
+ }
+
+ vas = vaDestroySurfaces(instance->display, ids, config->count);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to destroy surfaces: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ }
+}
+
+int av_vaapi_pipeline_init(AVVAAPIPipelineContext *ctx,
+ AVVAAPIInstance *instance,
+ AVVAAPIPipelineConfig *config,
+ AVVAAPISurfaceConfig *input,
+ AVVAAPISurfaceConfig *output)
+{
+ VAStatus vas;
+ int err;
+
+ // Currently this only supports a pipeline which actually creates
+ // output surfaces. An intra-only encoder (e.g. JPEG) won't, so
+ // some modification would be required to make that work.
+ if(!output)
+ return AVERROR(EINVAL);
+
+ memset(ctx, 0, sizeof(*ctx));
+ ctx->class = &vaapi_pipeline_class;
+
+ ctx->instance = instance;
+ ctx->config = config;
+
+ vas = vaCreateConfig(instance->display, config->profile,
+ config->entrypoint, config->attributes,
+ config->attribute_count, &ctx->config_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create pipeline configuration: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ err = AVERROR(EINVAL);
+ goto fail_config;
+ }
+
+ if(input) {
+ ctx->input_surfaces = av_calloc(input->count, sizeof(AVVAAPISurface));
+ if(!ctx->input_surfaces) {
+ err = AVERROR(ENOMEM);
+ goto fail_alloc_input_surfaces;
+ }
+
+ err = vaapi_create_surfaces(instance, input, ctx->input_surfaces,
+ ctx->input_surface_ids);
+ if(err)
+ goto fail_create_input_surfaces;
+ ctx->input = input;
+ } else {
+ av_log(ctx, AV_LOG_INFO, "No input surfaces.\n");
+ ctx->input = 0;
+ }
+
+ if(output) {
+ ctx->output_surfaces = av_calloc(output->count, sizeof(AVVAAPISurface));
+ if(!ctx->output_surfaces) {
+ err = AVERROR(ENOMEM);
+ goto fail_alloc_output_surfaces;
+ }
+
+ err = vaapi_create_surfaces(instance, output, ctx->output_surfaces,
+ ctx->output_surface_ids);
+ if(err)
+ goto fail_create_output_surfaces;
+ ctx->output = output;
+ } else {
+ av_log(ctx, AV_LOG_INFO, "No output surfaces.\n");
+ ctx->output = 0;
+ }
+
+ vas = vaCreateContext(instance->display, ctx->config_id,
+ output->width, output->height,
+ VA_PROGRESSIVE,
+ ctx->output_surface_ids, output->count,
+ &ctx->context_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create pipeline context: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ err = AVERROR(EINVAL);
+ goto fail_context;
+ }
+
+ av_log(ctx, AV_LOG_INFO, "VAAPI pipeline initialised: config %#x "
+ "context %#x.\n", ctx->config_id, ctx->context_id);
+ if(input)
+ av_log(ctx, AV_LOG_INFO, " Input: %u surfaces of %ux%u.\n",
+ input->count, input->width, input->height);
+ if(output)
+ av_log(ctx, AV_LOG_INFO, " Output: %u surfaces of %ux%u.\n",
+ output->count, output->width, output->height);
+
+ return 0;
+
+ fail_context:
+ vaapi_destroy_surfaces(instance, output, ctx->output_surfaces,
+ ctx->output_surface_ids);
+ fail_create_output_surfaces:
+ av_freep(&ctx->output_surfaces);
+ fail_alloc_output_surfaces:
+ vaapi_destroy_surfaces(instance, input, ctx->input_surfaces,
+ ctx->input_surface_ids);
+ fail_create_input_surfaces:
+ av_freep(&ctx->input_surfaces);
+ fail_alloc_input_surfaces:
+ vaDestroyConfig(instance->display, ctx->config_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to destroy pipeline "
+ "configuration: %d (%s).\n", vas, vaErrorStr(vas));
+ }
+ fail_config:
+ return err;
+}
+
+int av_vaapi_pipeline_uninit(AVVAAPIPipelineContext *ctx)
+{
+ VAStatus vas;
+
+ av_assert0(ctx->instance);
+ av_assert0(ctx->config);
+
+ vas = vaDestroyContext(ctx->instance->display, ctx->context_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to destroy pipeline context: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ }
+
+ if(ctx->output) {
+ vaapi_destroy_surfaces(ctx->instance, ctx->output,
+ ctx->output_surfaces,
+ ctx->output_surface_ids);
+ av_freep(&ctx->output_surfaces);
+ }
+
+ if(ctx->input) {
+ vaapi_destroy_surfaces(ctx->instance, ctx->input,
+ ctx->input_surfaces,
+ ctx->input_surface_ids);
+ av_freep(&ctx->input_surfaces);
+ }
+
+ vaDestroyConfig(ctx->instance->display, ctx->config_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to destroy pipeline configuration: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ }
+
+ return 0;
+}
+
+static void vaapi_codec_release_surface(void *opaque, uint8_t *data)
+{
+ AVVAAPISurface *surface = opaque;
+
+ av_assert0(surface->refcount > 0);
+ --surface->refcount;
+}
+
+static int vaapi_get_surface(AVVAAPIPipelineContext *ctx,
+ AVVAAPISurfaceConfig *config,
+ AVVAAPISurface *surfaces, AVFrame *frame)
+{
+ AVVAAPISurface *surface;
+ int i;
+
+ for(i = 0; i < config->count; i++) {
+ if(surfaces[i].refcount == 0)
+ break;
+ }
+ if(i >= config->count) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to allocate surface "
+ "(%d in use).\n", config->count);
+ return AVERROR(ENOMEM);
+ }
+ surface = &surfaces[i];
+
+ ++surface->refcount;
+ frame->data[3] = (uint8_t*)(uintptr_t)surface->id;
+ frame->buf[0] = av_buffer_create((uint8_t*)surface, 0,
+ &vaapi_codec_release_surface,
+ surface, AV_BUFFER_FLAG_READONLY);
+ if(!frame->buf[0]) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to allocate dummy buffer "
+ "for surface %#x.\n", surface->id);
+ return AVERROR(ENOMEM);
+ }
+
+ frame->format = AV_PIX_FMT_VAAPI;
+ frame->width = config->width;
+ frame->height = config->height;
+
+ return 0;
+}
+
+int av_vaapi_get_input_surface(AVVAAPIPipelineContext *ctx, AVFrame *frame)
+{
+ return vaapi_get_surface(ctx, ctx->input, ctx->input_surfaces, frame);
+}
+
+int av_vaapi_get_output_surface(AVVAAPIPipelineContext *ctx, AVFrame *frame)
+{
+ return vaapi_get_surface(ctx, ctx->output, ctx->output_surfaces, frame);
+}
+
+
+int av_vaapi_map_surface(AVVAAPISurface *surface, int get)
+{
+ AVVAAPIInstance *instance = surface->instance;
+ AVVAAPISurfaceConfig *config = surface->config;
+ VAStatus vas;
+ int err;
+ void *address;
+ // On current Intel drivers, derive gives you memory which is very slow
+ // to read (uncached?). It can be better for write-only cases, but for
+ // now play it safe and never use derive.
+ int derive = 0;
+
+ vas = vaSyncSurface(instance->display, surface->id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to sync surface "
+ "%#x: %d (%s).\n", surface->id, vas, vaErrorStr(vas));
+ err = AVERROR(EINVAL);
+ goto fail;
+ }
+
+ if(derive) {
+ vas = vaDeriveImage(instance->display,
+ surface->id, &surface->image);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to derive image from surface "
+ "%#x: %d (%s).\n", surface->id, vas, vaErrorStr(vas));
+ derive = 0;
+ }
+ }
+ if(!derive) {
+ vas = vaCreateImage(instance->display,
+ &config->image_format,
+ config->width, config->height,
+ &surface->image);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to create image for surface "
+ "%#x: %d (%s).\n", surface->id, vas, vaErrorStr(vas));
+ err = AVERROR(EINVAL);
+ goto fail;
+ }
+
+ if(get) {
+ vas = vaGetImage(instance->display,
+ surface->id, 0, 0,
+ config->width, config->height,
+ surface->image.image_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to get image for surface "
+ "%#x: %d (%s).\n", surface->id, vas, vaErrorStr(vas));
+ err = AVERROR(EINVAL);
+ goto fail_image;
+ }
+ }
+ }
+
+ av_assert0(surface->image.format.fourcc == config->image_format.fourcc);
+
+ vas = vaMapBuffer(instance->display,
+ surface->image.buf, &address);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to map image from surface "
+ "%#x: %d (%s).\n", surface->id, vas, vaErrorStr(vas));
+ err = AVERROR(EINVAL);
+ goto fail_image;
+ }
+
+ surface->mapped_address = address;
+
+ return 0;
+
+ fail_image:
+ vas = vaDestroyImage(instance->display, surface->image.image_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to destroy image for surface "
+ "%#x: %d (%s).\n", surface->id, vas, vaErrorStr(vas));
+ }
+ fail:
+ return err;
+}
+
+int av_vaapi_unmap_surface(AVVAAPISurface *surface, int put)
+{
+ AVVAAPIInstance *instance = surface->instance;
+ AVVAAPISurfaceConfig *config = surface->config;
+ VAStatus vas;
+ int derive = 0;
+
+ surface->mapped_address = 0;
+
+ vas = vaUnmapBuffer(instance->display,
+ surface->image.buf);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to unmap image from surface "
+ "%#x: %d (%s).\n", surface->id, vas, vaErrorStr(vas));
+ }
+
+ if(!derive && put) {
+ vas = vaPutImage(instance->display, surface->id,
+ surface->image.image_id,
+ 0, 0, config->width, config->height,
+ 0, 0, config->width, config->height);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to put image for surface "
+ "%#x: %d (%s).\n", surface->id, vas, vaErrorStr(vas));
+ }
+ }
+
+ vas = vaDestroyImage(instance->display,
+ surface->image.image_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to destroy image for surface "
+ "%#x: %d (%s).\n", surface->id, vas, vaErrorStr(vas));
+ }
+
+ return 0;
+}
+
+int av_vaapi_copy_to_surface(const AVFrame *f, AVVAAPISurface *surface)
+{
+ VAImage *image = &surface->image;
+ char *data = surface->mapped_address;
+ av_assert0(data);
+
+ switch(f->format) {
+
+ case AV_PIX_FMT_YUV420P:
+ av_assert0(image->format.fourcc == VA_FOURCC_YV12);
+ av_image_copy_plane(data + image->offsets[0], image->pitches[0],
+ f->data[0], f->linesize[0],
+ f->width, f->height);
+ av_image_copy_plane(data + image->offsets[1], image->pitches[1],
+ f->data[2], f->linesize[2],
+ f->width / 2, f->height / 2);
+ av_image_copy_plane(data + image->offsets[2], image->pitches[2],
+ f->data[1], f->linesize[1],
+ f->width / 2, f->height / 2);
+ break;
+
+ case AV_PIX_FMT_NV12:
+ av_assert0(image->format.fourcc == VA_FOURCC_NV12);
+ av_image_copy_plane(data + image->offsets[0], image->pitches[0],
+ f->data[0], f->linesize[0],
+ f->width, f->height);
+ av_image_copy_plane(data + image->offsets[1], image->pitches[1],
+ f->data[1], f->linesize[1],
+ f->width, f->height / 2);
+ break;
+
+ case AV_PIX_FMT_BGR0:
+ av_assert0(image->format.fourcc == VA_FOURCC_BGRX);
+ av_image_copy_plane(data + image->offsets[0], image->pitches[0],
+ f->data[0], f->linesize[0],
+ f->width * 4, f->height);
+ break;
+
+ default:
+ return AVERROR(EINVAL);
+ }
+
+ return 0;
+}
+
+int av_vaapi_copy_from_surface(AVFrame *f, AVVAAPISurface *surface)
+{
+ VAImage *image = &surface->image;
+ char *data = surface->mapped_address;
+ av_assert0(data);
+
+ switch(f->format) {
+
+ case AV_PIX_FMT_YUV420P:
+ av_assert0(image->format.fourcc == VA_FOURCC_YV12);
+ av_image_copy_plane(f->data[0], f->linesize[0],
+ data + image->offsets[0], image->pitches[0],
+ f->width, f->height);
+ // Um, apparently these are not the same way round...
+ av_image_copy_plane(f->data[2], f->linesize[2],
+ data + image->offsets[1], image->pitches[1],
+ f->width / 2, f->height / 2);
+ av_image_copy_plane(f->data[1], f->linesize[1],
+ data + image->offsets[2], image->pitches[2],
+ f->width / 2, f->height / 2);
+ break;
+
+ case AV_PIX_FMT_NV12:
+ av_assert0(image->format.fourcc == VA_FOURCC_NV12);
+ av_image_copy_plane(f->data[0], f->linesize[0],
+ data + image->offsets[0], image->pitches[0],
+ f->width, f->height);
+ av_image_copy_plane(f->data[1], f->linesize[1],
+ data + image->offsets[1], image->pitches[1],
+ f->width, f->height / 2);
+ break;
+
+ default:
+ return AVERROR(EINVAL);
+ }
+
+ return 0;
+}
diff --git a/libavutil/vaapi.h b/libavutil/vaapi.h
new file mode 100644
index 0000000..5238597
--- /dev/null
+++ b/libavutil/vaapi.h
@@ -0,0 +1,119 @@
+/*
+ * VAAPI helper functions.
+ *
+ * Copyright (C) 2016 Mark Thompson <***@jkqxz.net>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef LIBAVUTIL_VAAPI_H_
+#define LIBAVUTIL_VAAPI_H_
+
+#include <va/va.h>
+
+#include "pixfmt.h"
+#include "frame.h"
+
+
+typedef struct AVVAAPIInstance {
+ VADisplay display;
+
+ void *connection;
+} AVVAAPIInstance;
+
+
+int av_vaapi_instance_init(AVVAAPIInstance *ctx, const char *device);
+int av_vaapi_instance_uninit(AVVAAPIInstance *ctx);
+
+void av_vaapi_instance_lock(AVVAAPIInstance *ctx);
+void av_vaapi_instance_unlock(AVVAAPIInstance *ctx);
+
+
+#define AV_VAAPI_MAX_SURFACES 64
+
+
+typedef struct AVVAAPISurfaceConfig {
+ enum AVPixelFormat av_format;
+ unsigned int rt_format;
+ VAImageFormat image_format;
+
+ unsigned int count;
+ unsigned int width;
+ unsigned int height;
+
+ unsigned int attribute_count;
+ VASurfaceAttrib *attributes;
+} AVVAAPISurfaceConfig;
+
+typedef struct AVVAAPISurface {
+ VASurfaceID id;
+ int refcount;
+
+ VAImage image;
+ void *mapped_address;
+
+ AVVAAPIInstance *instance;
+ AVVAAPISurfaceConfig *config;
+} AVVAAPISurface;
+
+
+typedef struct AVVAAPIPipelineConfig {
+ VAProfile profile;
+ VAEntrypoint entrypoint;
+
+ unsigned int attribute_count;
+ VAConfigAttrib *attributes;
+} AVVAAPIPipelineConfig;
+
+typedef struct AVVAAPIPipelineContext {
+ const AVClass *class;
+
+ AVVAAPIInstance *instance;
+ AVVAAPIPipelineConfig *config;
+ AVVAAPISurfaceConfig *input;
+ AVVAAPISurfaceConfig *output;
+
+ VAConfigID config_id;
+ VAContextID context_id;
+
+ AVVAAPISurface *input_surfaces;
+ VASurfaceID input_surface_ids[AV_VAAPI_MAX_SURFACES];
+
+ AVVAAPISurface *output_surfaces;
+ VASurfaceID output_surface_ids[AV_VAAPI_MAX_SURFACES];
+} AVVAAPIPipelineContext;
+
+
+int av_vaapi_pipeline_init(AVVAAPIPipelineContext *ctx,
+ AVVAAPIInstance *instance,
+ AVVAAPIPipelineConfig *config,
+ AVVAAPISurfaceConfig *input,
+ AVVAAPISurfaceConfig *output);
+int av_vaapi_pipeline_uninit(AVVAAPIPipelineContext *ctx);
+
+int av_vaapi_get_input_surface(AVVAAPIPipelineContext *ctx, AVFrame *frame);
+int av_vaapi_get_output_surface(AVVAAPIPipelineContext *ctx, AVFrame *frame);
+
+int av_vaapi_map_surface(AVVAAPISurface *surface, int get);
+int av_vaapi_unmap_surface(AVVAAPISurface *surface, int put);
+
+
+int av_vaapi_copy_to_surface(const AVFrame *f, AVVAAPISurface *surface);
+int av_vaapi_copy_from_surface(AVFrame *f, AVVAAPISurface *surface);
+
+
+#endif /* LIBAVUTIL_VAAPI_H_ */
--
2.6.4
Michael Niedermayer
2016-01-17 23:55:47 UTC
Permalink
Post by Mark Thompson
From 45a803b627d0180c1aac928756924bd39ddf529d Mon Sep 17 00:00:00 2001
Date: Sun, 17 Jan 2016 22:13:20 +0000
Subject: [PATCH 1/5] libavutil: some VAAPI infrastructure
---
configure | 4 +
libavutil/Makefile | 1 +
libavutil/vaapi.c | 782 +++++++++++++++++++++++++++++++++++++++++++++++++++++
libavutil/vaapi.h | 119 ++++++++
4 files changed, 906 insertions(+)
create mode 100644 libavutil/vaapi.c
create mode 100644 libavutil/vaapi.h
tried to apply locally:

Applying: libavutil: some VAAPI infrastructure
Using index info to reconstruct a base tree...
error: patch failed: configure:5739
error: configure: patch does not apply
error: patch failed: libavutil/Makefile:146
error: libavutil/Makefile: patch does not apply
Did you hand edit your patch?
It does not apply to blobs recorded in its index.
Cannot fall back to three-way merge.
Patch failed at 0001 libavutil: some VAAPI infrastructure
When you have resolved this problem run "git am --resolved".
If you would prefer to skip this patch, instead run "git am --skip".
To restore the original branch and stop patching run "git am --abort".

[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Asymptotically faster algorithms should always be preferred if you have
asymptotical amounts of data
Mark Thompson
2016-01-18 01:07:39 UTC
Permalink
Post by Michael Niedermayer
Post by Mark Thompson
From 45a803b627d0180c1aac928756924bd39ddf529d Mon Sep 17 00:00:00 2001
Date: Sun, 17 Jan 2016 22:13:20 +0000
Subject: [PATCH 1/5] libavutil: some VAAPI infrastructure
---
configure | 4 +
libavutil/Makefile | 1 +
libavutil/vaapi.c | 782 +++++++++++++++++++++++++++++++++++++++++++++++++++++
libavutil/vaapi.h | 119 ++++++++
4 files changed, 906 insertions(+)
create mode 100644 libavutil/vaapi.c
create mode 100644 libavutil/vaapi.h
Applying: libavutil: some VAAPI infrastructure
Using index info to reconstruct a base tree...
error: patch failed: configure:5739
error: configure: patch does not apply
error: patch failed: libavutil/Makefile:146
error: libavutil/Makefile: patch does not apply
Did you hand edit your patch?
It does not apply to blobs recorded in its index.
Cannot fall back to three-way merge.
Patch failed at 0001 libavutil: some VAAPI infrastructure
When you have resolved this problem run "git am --resolved".
If you would prefer to skip this patch, instead run "git am --skip".
To restore the original branch and stop patching run "git am --abort".
Sorry, MUA fail again. Push it through 's/^ //' before git am or similar. (Or copy out of the client directly to remove the format=flowed.)

Hopefully good next time...

Thanks,

- Mark
Hendrik Leppkes
2016-01-18 00:28:54 UTC
Permalink
Post by Mark Thompson
From 45a803b627d0180c1aac928756924bd39ddf529d Mon Sep 17 00:00:00 2001
Date: Sun, 17 Jan 2016 22:13:20 +0000
Subject: [PATCH 1/5] libavutil: some VAAPI infrastructure
---
configure | 4 +
libavutil/Makefile | 1 +
libavutil/vaapi.c | 782
+++++++++++++++++++++++++++++++++++++++++++++++++++++
libavutil/vaapi.h | 119 ++++++++
4 files changed, 906 insertions(+)
create mode 100644 libavutil/vaapi.c
create mode 100644 libavutil/vaapi.h
diff --git a/configure b/configure
index 7cef6f5..1c77015 100755
--- a/configure
+++ b/configure
@@ -5739,6 +5739,10 @@ enabled vaapi && enabled xlib &&
check_lib2 "va/va.h va/va_x11.h" vaGetDisplay -lva -lva-x11 &&
enable vaapi_x11
+enabled vaapi &&
+ check_lib2 "va/va.h va/va_drm.h" vaGetDisplayDRM -lva -lva-drm &&
+ enable vaapi_drm
+
enabled vdpau &&
check_cpp_condition vdpau/vdpau.h "defined
VDP_DECODER_PROFILE_MPEG4_PART2_ASP" ||
disable vdpau
diff --git a/libavutil/Makefile b/libavutil/Makefile
index bf8c713..8025f9f 100644
--- a/libavutil/Makefile
+++ b/libavutil/Makefile
@@ -146,6 +146,7 @@ OBJS-$(!HAVE_ATOMICS_NATIVE) += atomic.o
\
OBJS-$(CONFIG_LZO) += lzo.o
OBJS-$(CONFIG_OPENCL) += opencl.o opencl_internal.o
+OBJS-$(CONFIG_VAAPI) += vaapi.o
OBJS += $(COMPAT_OBJS:%=../compat/%)
diff --git a/libavutil/vaapi.c b/libavutil/vaapi.c
new file mode 100644
index 0000000..20bae4c
--- /dev/null
+++ b/libavutil/vaapi.c
@@ -0,0 +1,782 @@
+/*
+ * VAAPI helper functions.
+ *
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <string.h>
+
+#include <unistd.h>
+#include <fcntl.h>
+
+#include "vaapi.h"
+
+#include <va/va_x11.h>
+#include <va/va_drm.h>
+
+#include "avassert.h"
+#include "imgutils.h"
+#include "pixfmt.h"
+#include "thread.h"
+
+
+static const AVClass vaapi_connection_class = {
+ .class_name = "VAAPI/connection",
+ .item_name = av_default_item_name,
+ .version = LIBAVUTIL_VERSION_INT,
+};
+
+static const AVClass vaapi_pipeline_class = {
+ .class_name = "VAAPI/pipeline",
+ .item_name = av_default_item_name,
+ .version = LIBAVUTIL_VERSION_INT,
+};
+
+typedef struct AVVAAPIConnection {
+ const AVClass *class;
+
+ AVMutex lock;
+ char *device_string;
+ int refcount;
+ struct AVVAAPIConnection *next;
+
+ VADisplay display;
+ int initialised;
+ int version_major, version_minor;
+
+ enum {
+ AV_VAAPI_CONNECTION_NONE = 0,
+ AV_VAAPI_CONNECTION_DRM,
+ AV_VAAPI_CONNECTION_X11,
+ /* ?
+ AV_VAAPI_CONNECTION_GLX,
+ AV_VAAPI_CONNECTION_WAYLAND,
+ */
+ } connection_type;
+ union {
+ void *x11_display;
+ int drm_fd;
+ };
+} AVVAAPIConnection;
+
+void av_vaapi_instance_lock(AVVAAPIInstance *instance)
+{
+ AVVAAPIConnection *ctx = instance->connection;
+
+ ff_mutex_lock(&ctx->lock);
+}
+
+void av_vaapi_instance_unlock(AVVAAPIInstance *instance)
+{
+ AVVAAPIConnection *ctx = instance->connection;
+
+ ff_mutex_unlock(&ctx->lock);
+}
+
+static int vaapi_connection_uninit(AVVAAPIConnection *ctx)
+{
+ if(ctx->initialised) {
+ vaTerminate(ctx->display);
+ ctx->display = 0;
+ ctx->initialised = 0;
+ ff_mutex_destroy(&ctx->lock);
+ }
+
+ switch(ctx->connection_type) {
+
+ if(ctx->drm_fd >= 0) {
+ close(ctx->drm_fd);
+ ctx->drm_fd = -1;
+ }
+ break;
+
+ if(ctx->x11_display) {
+ XCloseDisplay(ctx->x11_display);
+ ctx->x11_display = 0;
+ }
+ break;
+
+ }
+
+ return 0;
+}
+
+static int vaapi_connection_init(AVVAAPIConnection *ctx, const char *device)
+{
+ VAStatus vas;
+ int err;
+
+ ctx->class = &vaapi_connection_class;
+ if(device)
+ ctx->device_string = av_strdup(device);
+
+ // If the device name is not provided at all, assume we are in X and can
+ // connect to the display in DISPLAY. If we do get a device name and it
+ // begins with a type indicator, use that. Otherwise, try to guess the
+ // answer from the content of the name.
+ if(!device) {
+ ctx->connection_type = AV_VAAPI_CONNECTION_X11;
+ } else if(!strncmp(device, "drm:", 4)) {
+ ctx->connection_type = AV_VAAPI_CONNECTION_DRM;
+ device += 4;
+ } else if(!strncmp(device, "x11:", 4)) {
+ ctx->connection_type = AV_VAAPI_CONNECTION_X11;
+ device += 4;
+ } else {
+ if(strchr(device, '/')) {
+ ctx->connection_type = AV_VAAPI_CONNECTION_DRM;
+ } else if(strchr(device, ':')) {
+ ctx->connection_type = AV_VAAPI_CONNECTION_X11;
+ } else {
+ // No idea, just give up.
+ return AVERROR(EINVAL);
+ }
+ }
+
+ switch(ctx->connection_type) {
+
+ ctx->drm_fd = open(device, O_RDWR);
+ if(ctx->drm_fd < 0) {
+ av_log(ctx, AV_LOG_ERROR, "Cannot open DRM device %s.\n",
+ device);
+ err = AVERROR(errno);
+ goto fail;
+ }
+ ctx->display = vaGetDisplayDRM(ctx->drm_fd);
+ if(!ctx->display) {
+ av_log(ctx, AV_LOG_ERROR, "Cannot open the VA display (from DRM "
+ "device %s).\n", device);
+ err = AVERROR(EINVAL);
+ goto fail;
+ }
+ break;
+
+ ctx->x11_display = XOpenDisplay(device); // device might be NULL.
+ if(!ctx->x11_display) {
+ av_log(ctx, AV_LOG_ERROR, "Cannot open X11 display %s.\n",
+ XDisplayName(device));
+ err = AVERROR(ENOENT);
+ goto fail;
+ }
+ ctx->display = vaGetDisplay(ctx->x11_display);
+ if(!ctx->display) {
+ av_log(ctx, AV_LOG_ERROR, "Cannot open the VA display (from X11 "
+ "display %s).\n", XDisplayName(device));
+ err = AVERROR(EINVAL);
+ goto fail;
+ }
+ break;
+
+ av_assert0(0);
+ }
+
+ ff_mutex_init(&ctx->lock, 0);
+
+ vas = vaInitialize(ctx->display,
+ &ctx->version_major, &ctx->version_minor);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to initialise VAAPI: %d (%s).\n",
+ vas, vaErrorStr(vas));
+ err = AVERROR(EINVAL);
+ goto fail;
+ }
+ ctx->initialised = 1;
+
+ av_log(ctx, AV_LOG_INFO, "Initialised VAAPI connection: version %d.%d\n",
+ ctx->version_major, ctx->version_minor);
+
+ return 0;
+
+ vaapi_connection_uninit(ctx);
+ return err;
+}
+
+static AVVAAPIConnection *vaapi_connection_list;
+static AVMutex vaapi_global_lock;
+static AVOnce vaapi_global_init_control = AV_ONCE_INIT;
There is still global state here, which is a no-no.
Post by Mark Thompson
+
+static void vaapi_global_init(void)
+{
+ vaapi_connection_list = 0;
+ ff_mutex_init(&vaapi_global_lock, 0);
+}
+
+int av_vaapi_instance_init(AVVAAPIInstance *instance, const char *device)
+{
+ AVVAAPIConnection *ctx;
+ int err;
+
+ ff_thread_once(&vaapi_global_init_control, &vaapi_global_init);
+
+ ff_mutex_lock(&vaapi_global_lock);
+
+ for(ctx = vaapi_connection_list; ctx; ctx = ctx->next) {
+ if((device == 0 && ctx->device_string == 0) ||
+ (device && ctx->device_string &&
+ !strcmp(device, ctx->device_string)))
+ break;
+ }
+
+ if(ctx) {
+ av_log(ctx, AV_LOG_INFO, "New VAAPI instance connected to existing "
+ "instance (%s).\n", device ? device : "default");
+ ++ctx->refcount;
+ instance->connection = ctx;
+ instance->display = ctx->display;
+ err = 0;
+ goto done;
+ }
+
+ ctx = av_mallocz(sizeof(AVVAAPIConnection));
+ if(!ctx) {
+ err = AVERROR(ENOMEM);
+ goto done;
+ }
+
+ err = vaapi_connection_init(ctx, device);
+ if(err)
+ goto done;
+
+ ctx->refcount = 1;
+
+ instance->display = ctx->display;
+ instance->connection = ctx;
+
+ ctx->next = vaapi_connection_list;
+ vaapi_connection_list = ctx;
+
+ av_log(ctx, AV_LOG_INFO, "New VAAPI instance (%s).\n",
+ device ? device : "default");
+
+ err = 0;
+ ff_mutex_unlock(&vaapi_global_lock);
+ return err;
+}
+
+int av_vaapi_instance_uninit(AVVAAPIInstance *instance)
+{
+ AVVAAPIConnection *ctx = instance->connection;
+ int err;
+
+ ff_mutex_lock(&vaapi_global_lock);
+
+ if(!ctx) {
+ err = AVERROR(EINVAL);
+ goto done;
+ }
+
+ if(ctx->refcount <= 0) {
+ av_log(ctx, AV_LOG_ERROR, "Tried to uninit VAAPI connection with "
+ "refcount = %d < 0.\n", ctx->refcount);
+ err = AVERROR(EINVAL);
+ goto done;
+ }
+
+ --ctx->refcount;
+
+ if(ctx->refcount == 0) {
+ AVVAAPIConnection *iter, *prev;
+ prev = 0;
+ for(iter = vaapi_connection_list; iter;
+ prev = iter, iter = iter->next) {
+ if(iter == ctx) {
+ if(prev)
+ prev->next = ctx->next;
+ else
+ vaapi_connection_list = ctx->next;
+ break;
+ }
+ }
+ if(!iter) {
+ av_log(ctx, AV_LOG_WARNING, "Tried to uninit VAAPI connection "
+ "not in connection list?\n");
+ // Not fatal.
+ }
+
+ vaapi_connection_uninit(ctx);
+ av_free(ctx);
+ memset(instance, 0, sizeof(*instance));
+ }
+
+ err = 0;
+ ff_mutex_unlock(&vaapi_global_lock);
+ return err;
+}
+
+
+static int vaapi_create_surfaces(AVVAAPIInstance *instance,
+ AVVAAPISurfaceConfig *config,
+ AVVAAPISurface *surfaces,
+ VASurfaceID *ids)
+{
+ VAStatus vas;
+ int i;
+
+ vas = vaCreateSurfaces(instance->display, config->rt_format,
+ config->width, config->height, ids, config->count,
+ config->attributes, config->attribute_count);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance->connection, AV_LOG_ERROR, "Failed to create "
+ "surfaces: %d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR(EINVAL);
+ }
+
+ for(i = 0; i < config->count; i++) {
+ surfaces[i].id = ids[i];
+ surfaces[i].refcount = 0;
+ surfaces[i].instance = instance;
+ surfaces[i].config = config;
+ av_log(instance->connection, AV_LOG_TRACE, "Created VA surface "
+ "%d: %#x.\n", i, surfaces[i].id);
+ }
+
+ return 0;
+}
+
+static void vaapi_destroy_surfaces(AVVAAPIInstance *instance,
+ AVVAAPISurfaceConfig *config,
+ AVVAAPISurface *surfaces,
+ VASurfaceID *ids)
+{
+ VAStatus vas;
+ int i;
+
+ for(i = 0; i < config->count; i++) {
+ av_assert0(surfaces[i].id == ids[i]);
+ if(surfaces[i].refcount > 0)
+ av_log(instance->connection, AV_LOG_WARNING, "Destroying "
+ "surface %#x which is still in use.\n", surfaces[i].id);
+ av_assert0(surfaces[i].instance == instance);
+ av_assert0(surfaces[i].config == config);
+ }
+
+ vas = vaDestroySurfaces(instance->display, ids, config->count);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to destroy surfaces: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ }
+}
+
+int av_vaapi_pipeline_init(AVVAAPIPipelineContext *ctx,
+ AVVAAPIInstance *instance,
+ AVVAAPIPipelineConfig *config,
+ AVVAAPISurfaceConfig *input,
+ AVVAAPISurfaceConfig *output)
+{
+ VAStatus vas;
+ int err;
+
+ // Currently this only supports a pipeline which actually creates
+ // output surfaces. An intra-only encoder (e.g. JPEG) won't, so
+ // some modification would be required to make that work.
+ if(!output)
+ return AVERROR(EINVAL);
+
+ memset(ctx, 0, sizeof(*ctx));
+ ctx->class = &vaapi_pipeline_class;
+
+ ctx->instance = instance;
+ ctx->config = config;
+
+ vas = vaCreateConfig(instance->display, config->profile,
+ config->entrypoint, config->attributes,
+ config->attribute_count, &ctx->config_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create pipeline configuration: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ err = AVERROR(EINVAL);
+ goto fail_config;
+ }
+
+ if(input) {
+ ctx->input_surfaces = av_calloc(input->count,
sizeof(AVVAAPISurface));
+ if(!ctx->input_surfaces) {
+ err = AVERROR(ENOMEM);
+ goto fail_alloc_input_surfaces;
+ }
+
+ err = vaapi_create_surfaces(instance, input, ctx->input_surfaces,
+ ctx->input_surface_ids);
+ if(err)
+ goto fail_create_input_surfaces;
+ ctx->input = input;
+ } else {
+ av_log(ctx, AV_LOG_INFO, "No input surfaces.\n");
+ ctx->input = 0;
+ }
+
+ if(output) {
+ ctx->output_surfaces = av_calloc(output->count,
sizeof(AVVAAPISurface));
+ if(!ctx->output_surfaces) {
+ err = AVERROR(ENOMEM);
+ goto fail_alloc_output_surfaces;
+ }
+
+ err = vaapi_create_surfaces(instance, output, ctx->output_surfaces,
+ ctx->output_surface_ids);
+ if(err)
+ goto fail_create_output_surfaces;
+ ctx->output = output;
+ } else {
+ av_log(ctx, AV_LOG_INFO, "No output surfaces.\n");
+ ctx->output = 0;
+ }
+
+ vas = vaCreateContext(instance->display, ctx->config_id,
+ output->width, output->height,
+ VA_PROGRESSIVE,
+ ctx->output_surface_ids, output->count,
+ &ctx->context_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create pipeline context: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ err = AVERROR(EINVAL);
+ goto fail_context;
+ }
+
+ av_log(ctx, AV_LOG_INFO, "VAAPI pipeline initialised: config %#x "
+ "context %#x.\n", ctx->config_id, ctx->context_id);
+ if(input)
+ av_log(ctx, AV_LOG_INFO, " Input: %u surfaces of %ux%u.\n",
+ input->count, input->width, input->height);
+ if(output)
+ av_log(ctx, AV_LOG_INFO, " Output: %u surfaces of %ux%u.\n",
+ output->count, output->width, output->height);
+
+ return 0;
+
+ vaapi_destroy_surfaces(instance, output, ctx->output_surfaces,
+ ctx->output_surface_ids);
+ av_freep(&ctx->output_surfaces);
+ vaapi_destroy_surfaces(instance, input, ctx->input_surfaces,
+ ctx->input_surface_ids);
+ av_freep(&ctx->input_surfaces);
+ vaDestroyConfig(instance->display, ctx->config_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to destroy pipeline "
+ "configuration: %d (%s).\n", vas, vaErrorStr(vas));
+ }
+ return err;
+}
+
+int av_vaapi_pipeline_uninit(AVVAAPIPipelineContext *ctx)
+{
+ VAStatus vas;
+
+ av_assert0(ctx->instance);
+ av_assert0(ctx->config);
+
+ vas = vaDestroyContext(ctx->instance->display, ctx->context_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to destroy pipeline context: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ }
+
+ if(ctx->output) {
+ vaapi_destroy_surfaces(ctx->instance, ctx->output,
+ ctx->output_surfaces,
+ ctx->output_surface_ids);
+ av_freep(&ctx->output_surfaces);
+ }
+
+ if(ctx->input) {
+ vaapi_destroy_surfaces(ctx->instance, ctx->input,
+ ctx->input_surfaces,
+ ctx->input_surface_ids);
+ av_freep(&ctx->input_surfaces);
+ }
+
+ vaDestroyConfig(ctx->instance->display, ctx->config_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to destroy pipeline
configuration: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ }
+
+ return 0;
+}
+
+static void vaapi_codec_release_surface(void *opaque, uint8_t *data)
+{
+ AVVAAPISurface *surface = opaque;
+
+ av_assert0(surface->refcount > 0);
+ --surface->refcount;
+}
+
+static int vaapi_get_surface(AVVAAPIPipelineContext *ctx,
+ AVVAAPISurfaceConfig *config,
+ AVVAAPISurface *surfaces, AVFrame *frame)
+{
+ AVVAAPISurface *surface;
+ int i;
+
+ for(i = 0; i < config->count; i++) {
+ if(surfaces[i].refcount == 0)
+ break;
+ }
+ if(i >= config->count) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to allocate surface "
+ "(%d in use).\n", config->count);
+ return AVERROR(ENOMEM);
+ }
+ surface = &surfaces[i];
+
+ ++surface->refcount;
+ frame->data[3] = (uint8_t*)(uintptr_t)surface->id;
+ frame->buf[0] = av_buffer_create((uint8_t*)surface, 0,
+ &vaapi_codec_release_surface,
+ surface, AV_BUFFER_FLAG_READONLY);
+ if(!frame->buf[0]) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to allocate dummy buffer "
+ "for surface %#x.\n", surface->id);
+ return AVERROR(ENOMEM);
+ }
+
+ frame->format = AV_PIX_FMT_VAAPI;
+ frame->width = config->width;
+ frame->height = config->height;
+
+ return 0;
+}
+
+int av_vaapi_get_input_surface(AVVAAPIPipelineContext *ctx, AVFrame *frame)
+{
+ return vaapi_get_surface(ctx, ctx->input, ctx->input_surfaces, frame);
+}
+
+int av_vaapi_get_output_surface(AVVAAPIPipelineContext *ctx, AVFrame *frame)
+{
+ return vaapi_get_surface(ctx, ctx->output, ctx->output_surfaces, frame);
+}
+
+
+int av_vaapi_map_surface(AVVAAPISurface *surface, int get)
+{
+ AVVAAPIInstance *instance = surface->instance;
+ AVVAAPISurfaceConfig *config = surface->config;
+ VAStatus vas;
+ int err;
+ void *address;
+ // On current Intel drivers, derive gives you memory which is very slow
+ // to read (uncached?). It can be better for write-only cases, but for
+ // now play it safe and never use derive.
+ int derive = 0;
+
+ vas = vaSyncSurface(instance->display, surface->id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to sync surface "
+ "%#x: %d (%s).\n", surface->id, vas, vaErrorStr(vas));
+ err = AVERROR(EINVAL);
+ goto fail;
+ }
+
+ if(derive) {
+ vas = vaDeriveImage(instance->display,
+ surface->id, &surface->image);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to derive image from surface "
+ "%#x: %d (%s).\n", surface->id, vas, vaErrorStr(vas));
+ derive = 0;
+ }
+ }
+ if(!derive) {
+ vas = vaCreateImage(instance->display,
+ &config->image_format,
+ config->width, config->height,
+ &surface->image);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to create image for surface "
+ "%#x: %d (%s).\n", surface->id, vas, vaErrorStr(vas));
+ err = AVERROR(EINVAL);
+ goto fail;
+ }
+
+ if(get) {
+ vas = vaGetImage(instance->display,
+ surface->id, 0, 0,
+ config->width, config->height,
+ surface->image.image_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to get image for surface "
+ "%#x: %d (%s).\n", surface->id, vas,
vaErrorStr(vas));
+ err = AVERROR(EINVAL);
+ goto fail_image;
+ }
+ }
+ }
+
+ av_assert0(surface->image.format.fourcc ==
config->image_format.fourcc);
+
+ vas = vaMapBuffer(instance->display,
+ surface->image.buf, &address);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to map image from surface "
+ "%#x: %d (%s).\n", surface->id, vas, vaErrorStr(vas));
+ err = AVERROR(EINVAL);
+ goto fail_image;
+ }
+
+ surface->mapped_address = address;
+
+ return 0;
+
+ vas = vaDestroyImage(instance->display, surface->image.image_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to destroy image for surface "
+ "%#x: %d (%s).\n", surface->id, vas, vaErrorStr(vas));
+ }
+ return err;
+}
+
+int av_vaapi_unmap_surface(AVVAAPISurface *surface, int put)
+{
+ AVVAAPIInstance *instance = surface->instance;
+ AVVAAPISurfaceConfig *config = surface->config;
+ VAStatus vas;
+ int derive = 0;
+
+ surface->mapped_address = 0;
+
+ vas = vaUnmapBuffer(instance->display,
+ surface->image.buf);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to unmap image from surface "
+ "%#x: %d (%s).\n", surface->id, vas, vaErrorStr(vas));
+ }
+
+ if(!derive && put) {
+ vas = vaPutImage(instance->display, surface->id,
+ surface->image.image_id,
+ 0, 0, config->width, config->height,
+ 0, 0, config->width, config->height);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to put image for surface "
+ "%#x: %d (%s).\n", surface->id, vas, vaErrorStr(vas));
+ }
+ }
+
+ vas = vaDestroyImage(instance->display,
+ surface->image.image_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(instance, AV_LOG_ERROR, "Failed to destroy image for surface "
+ "%#x: %d (%s).\n", surface->id, vas, vaErrorStr(vas));
+ }
+
+ return 0;
+}
+
+int av_vaapi_copy_to_surface(const AVFrame *f, AVVAAPISurface *surface)
+{
+ VAImage *image = &surface->image;
+ char *data = surface->mapped_address;
+ av_assert0(data);
+
+ switch(f->format) {
+
+ av_assert0(image->format.fourcc == VA_FOURCC_YV12);
+ av_image_copy_plane(data + image->offsets[0], image->pitches[0],
+ f->data[0], f->linesize[0],
+ f->width, f->height);
+ av_image_copy_plane(data + image->offsets[1], image->pitches[1],
+ f->data[2], f->linesize[2],
+ f->width / 2, f->height / 2);
+ av_image_copy_plane(data + image->offsets[2], image->pitches[2],
+ f->data[1], f->linesize[1],
+ f->width / 2, f->height / 2);
+ break;
+
+ av_assert0(image->format.fourcc == VA_FOURCC_NV12);
+ av_image_copy_plane(data + image->offsets[0], image->pitches[0],
+ f->data[0], f->linesize[0],
+ f->width, f->height);
+ av_image_copy_plane(data + image->offsets[1], image->pitches[1],
+ f->data[1], f->linesize[1],
+ f->width, f->height / 2);
+ break;
+
+ av_assert0(image->format.fourcc == VA_FOURCC_BGRX);
+ av_image_copy_plane(data + image->offsets[0], image->pitches[0],
+ f->data[0], f->linesize[0],
+ f->width * 4, f->height);
+ break;
+
+ return AVERROR(EINVAL);
+ }
+
+ return 0;
+}
+
+int av_vaapi_copy_from_surface(AVFrame *f, AVVAAPISurface *surface)
+{
+ VAImage *image = &surface->image;
+ char *data = surface->mapped_address;
+ av_assert0(data);
+
+ switch(f->format) {
+
+ av_assert0(image->format.fourcc == VA_FOURCC_YV12);
+ av_image_copy_plane(f->data[0], f->linesize[0],
+ data + image->offsets[0], image->pitches[0],
+ f->width, f->height);
+ // Um, apparently these are not the same way round...
+ av_image_copy_plane(f->data[2], f->linesize[2],
+ data + image->offsets[1], image->pitches[1],
+ f->width / 2, f->height / 2);
+ av_image_copy_plane(f->data[1], f->linesize[1],
+ data + image->offsets[2], image->pitches[2],
+ f->width / 2, f->height / 2);
+ break;
+
+ av_assert0(image->format.fourcc == VA_FOURCC_NV12);
+ av_image_copy_plane(f->data[0], f->linesize[0],
+ data + image->offsets[0], image->pitches[0],
+ f->width, f->height);
+ av_image_copy_plane(f->data[1], f->linesize[1],
+ data + image->offsets[1], image->pitches[1],
+ f->width, f->height / 2);
+ break;
+
+ return AVERROR(EINVAL);
+ }
+
+ return 0;
+}
diff --git a/libavutil/vaapi.h b/libavutil/vaapi.h
new file mode 100644
index 0000000..5238597
--- /dev/null
+++ b/libavutil/vaapi.h
@@ -0,0 +1,119 @@
+/*
+ * VAAPI helper functions.
+ *
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef LIBAVUTIL_VAAPI_H_
+#define LIBAVUTIL_VAAPI_H_
+
+#include <va/va.h>
+
+#include "pixfmt.h"
+#include "frame.h"
+
+
+typedef struct AVVAAPIInstance {
+ VADisplay display;
+
+ void *connection;
+} AVVAAPIInstance;
+
+
+int av_vaapi_instance_init(AVVAAPIInstance *ctx, const char *device);
+int av_vaapi_instance_uninit(AVVAAPIInstance *ctx);
+
+void av_vaapi_instance_lock(AVVAAPIInstance *ctx);
+void av_vaapi_instance_unlock(AVVAAPIInstance *ctx);
+
+
+#define AV_VAAPI_MAX_SURFACES 64
+
+
+typedef struct AVVAAPISurfaceConfig {
+ enum AVPixelFormat av_format;
+ unsigned int rt_format;
+ VAImageFormat image_format;
+
+ unsigned int count;
+ unsigned int width;
+ unsigned int height;
+
+ unsigned int attribute_count;
+ VASurfaceAttrib *attributes;
+} AVVAAPISurfaceConfig;
+
+typedef struct AVVAAPISurface {
+ VASurfaceID id;
+ int refcount;
+
+ VAImage image;
+ void *mapped_address;
+
+ AVVAAPIInstance *instance;
+ AVVAAPISurfaceConfig *config;
+} AVVAAPISurface;
+
+
+typedef struct AVVAAPIPipelineConfig {
+ VAProfile profile;
+ VAEntrypoint entrypoint;
+
+ unsigned int attribute_count;
+ VAConfigAttrib *attributes;
+} AVVAAPIPipelineConfig;
+
+typedef struct AVVAAPIPipelineContext {
+ const AVClass *class;
+
+ AVVAAPIInstance *instance;
+ AVVAAPIPipelineConfig *config;
+ AVVAAPISurfaceConfig *input;
+ AVVAAPISurfaceConfig *output;
+
+ VAConfigID config_id;
+ VAContextID context_id;
+
+ AVVAAPISurface *input_surfaces;
+ VASurfaceID input_surface_ids[AV_VAAPI_MAX_SURFACES];
+
+ AVVAAPISurface *output_surfaces;
+ VASurfaceID output_surface_ids[AV_VAAPI_MAX_SURFACES];
+} AVVAAPIPipelineContext;
+
+
+int av_vaapi_pipeline_init(AVVAAPIPipelineContext *ctx,
+ AVVAAPIInstance *instance,
+ AVVAAPIPipelineConfig *config,
+ AVVAAPISurfaceConfig *input,
+ AVVAAPISurfaceConfig *output);
+int av_vaapi_pipeline_uninit(AVVAAPIPipelineContext *ctx);
+
+int av_vaapi_get_input_surface(AVVAAPIPipelineContext *ctx, AVFrame *frame);
+int av_vaapi_get_output_surface(AVVAAPIPipelineContext *ctx, AVFrame *frame);
+
+int av_vaapi_map_surface(AVVAAPISurface *surface, int get);
+int av_vaapi_unmap_surface(AVVAAPISurface *surface, int put);
+
+
+int av_vaapi_copy_to_surface(const AVFrame *f, AVVAAPISurface *surface);
+int av_vaapi_copy_from_surface(AVFrame *f, AVVAAPISurface *surface);
+
+
+#endif /* LIBAVUTIL_VAAPI_H_ */
--
2.6.4
_______________________________________________
ffmpeg-devel mailing list
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Mark Thompson
2016-01-17 22:46:59 UTC
Permalink
From 845be894e8e0e2a966b6e51229cad1aadca36704 Mon Sep 17 00:00:00 2001
From: Mark Thompson <***@jkqxz.net>
Date: Sun, 17 Jan 2016 22:13:45 +0000
Subject: [PATCH 2/5] ffmpeg: hwaccel helper for VAAPI decode

---
Makefile | 1 +
ffmpeg.h | 2 +
ffmpeg_opt.c | 3 +
ffmpeg_vaapi.c | 538 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 544 insertions(+)
create mode 100644 ffmpeg_vaapi.c

diff --git a/Makefile b/Makefile
index 7836a20..be1d2ca 100644
--- a/Makefile
+++ b/Makefile
@@ -36,6 +36,7 @@ OBJS-ffmpeg-$(CONFIG_VDA) += ffmpeg_videotoolbox.o
endif
OBJS-ffmpeg-$(CONFIG_VIDEOTOOLBOX) += ffmpeg_videotoolbox.o
OBJS-ffmpeg-$(CONFIG_LIBMFX) += ffmpeg_qsv.o
+OBJS-ffmpeg-$(CONFIG_VAAPI) += ffmpeg_vaapi.o
OBJS-ffserver += ffserver_config.o

TESTTOOLS = audiogen videogen rotozoom tiny_psnr tiny_ssim base64
diff --git a/ffmpeg.h b/ffmpeg.h
index 20322b0..d7313c3 100644
--- a/ffmpeg.h
+++ b/ffmpeg.h
@@ -65,6 +65,7 @@ enum HWAccelID {
HWACCEL_VDA,
HWACCEL_VIDEOTOOLBOX,
HWACCEL_QSV,
+ HWACCEL_VAAPI,
};

typedef struct HWAccel {
@@ -577,5 +578,6 @@ int vda_init(AVCodecContext *s);
int videotoolbox_init(AVCodecContext *s);
int qsv_init(AVCodecContext *s);
int qsv_transcode_init(OutputStream *ost);
+int vaapi_decode_init(AVCodecContext *s);

#endif /* FFMPEG_H */
diff --git a/ffmpeg_opt.c b/ffmpeg_opt.c
index 9b341cf..394f2cb 100644
--- a/ffmpeg_opt.c
+++ b/ffmpeg_opt.c
@@ -82,6 +82,9 @@ const HWAccel hwaccels[] = {
#if CONFIG_LIBMFX
{ "qsv", qsv_init, HWACCEL_QSV, AV_PIX_FMT_QSV },
#endif
+#if CONFIG_VAAPI
+ { "vaapi", vaapi_decode_init, HWACCEL_VAAPI, AV_PIX_FMT_VAAPI },
+#endif
{ 0 },
};

diff --git a/ffmpeg_vaapi.c b/ffmpeg_vaapi.c
new file mode 100644
index 0000000..279f531
--- /dev/null
+++ b/ffmpeg_vaapi.c
@@ -0,0 +1,538 @@
+/*
+ * VAAPI helper for hardware-accelerated decoding.
+ *
+ * Copyright (C) 2016 Mark Thompson <***@jkqxz.net>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <va/va.h>
+
+#include "ffmpeg.h"
+
+#include "libavutil/avassert.h"
+#include "libavutil/avconfig.h"
+#include "libavutil/buffer.h"
+#include "libavutil/frame.h"
+#include "libavutil/imgutils.h"
+#include "libavutil/pixfmt.h"
+#include "libavutil/vaapi.h"
+
+#include "libavcodec/vaapi.h"
+
+
+static const AVClass vaapi_class = {
+ .class_name = "VAAPI",
+ .item_name = av_default_item_name,
+ .version = LIBAVUTIL_VERSION_INT,
+};
+
+
+#define DEFAULT_SURFACES 20
+
+typedef struct VAAPIDecoderContext {
+ const AVClass *class;
+
+ AVVAAPIInstance va_instance;
+ AVVAAPIPipelineConfig config;
+ AVVAAPIPipelineContext codec;
+ AVVAAPISurfaceConfig output;
+
+ int codec_initialised;
+
+ AVFrame output_frame;
+
+ struct vaapi_context hwaccel_context;
+} VAAPIDecoderContext;
+
+
+static int vaapi_get_buffer(AVCodecContext *s, AVFrame *frame, int flags)
+{
+ InputStream *ist = s->opaque;
+ VAAPIDecoderContext *ctx = ist->hwaccel_ctx;
+ int err;
+
+ av_assert0(frame->format == AV_PIX_FMT_VAAPI);
+
+ av_vaapi_instance_lock(&ctx->va_instance);
+
+ err = av_vaapi_get_output_surface(&ctx->codec, frame);
+
+ av_vaapi_instance_unlock(&ctx->va_instance);
+
+ return err;
+}
+
+static int vaapi_retrieve_data(AVCodecContext *avctx, AVFrame *input_frame)
+{
+ InputStream *ist = avctx->opaque;
+ VAAPIDecoderContext *ctx = ist->hwaccel_ctx;
+ AVVAAPISurfaceConfig *output = &ctx->output;
+ AVVAAPISurface *surface;
+ AVFrame *output_frame;
+ int err, copying;
+
+ surface = (AVVAAPISurface*)input_frame->buf[0]->data;
+ av_log(ctx, AV_LOG_DEBUG, "Retrieve data from surface %#x (format %#x).\n",
+ surface->id, output->av_format);
+
+ av_vaapi_instance_lock(&ctx->va_instance);
+
+ if(output->av_format == AV_PIX_FMT_VAAPI) {
+ copying = 0;
+ av_log(ctx, AV_LOG_VERBOSE, "Surface %#x retrieved without copy.\n",
+ surface->id);
+
+ } else {
+ err = av_vaapi_map_surface(surface, 1);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to map surface %#x.",
+ surface->id);
+ goto fail;
+ }
+
+ copying = 1;
+ av_log(ctx, AV_LOG_VERBOSE, "Surface %#x mapped: image %#x data %p.\n",
+ surface->id, surface->image.image_id, surface->mapped_address);
+ }
+
+ // The actual frame need not fill the surface.
+ av_assert0(input_frame->width <= output->width);
+ av_assert0(input_frame->height <= output->height);
+
+ output_frame = &ctx->output_frame;
+ output_frame->width = input_frame->width;
+ output_frame->height = input_frame->height;
+ output_frame->format = output->av_format;
+
+ if(copying) {
+ err = av_frame_get_buffer(output_frame, 32);
+ if(err < 0) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to get output buffer: %d (%s).\n",
+ err, av_err2str(err));
+ goto fail_unmap;
+ }
+
+ err = av_vaapi_copy_from_surface(output_frame, surface);
+ if(err < 0) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to copy frame data: %d (%s).\n",
+ err, av_err2str(err));
+ goto fail_unmap;
+ }
+
+ } else {
+ // Just copy the hidden ID field.
+ output_frame->data[3] = input_frame->data[3];
+ }
+
+ err = av_frame_copy_props(output_frame, input_frame);
+ if(err < 0) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to copy frame props: %d (%s).\n",
+ err, av_err2str(err));
+ goto fail_unmap;
+ }
+
+ av_frame_unref(input_frame);
+ av_frame_move_ref(input_frame, output_frame);
+
+ fail_unmap:
+ if(copying)
+ av_vaapi_unmap_surface(surface, 0);
+ fail:
+ av_vaapi_instance_unlock(&ctx->va_instance);
+ return err;
+}
+
+static const struct {
+ VAProfile from;
+ VAProfile to;
+} vaapi_profile_compatibility[] = {
+#define FT(f, t) { VAProfile ## f, VAProfile ## t }
+ FT(MPEG2Simple, MPEG2Main ),
+ FT(H263Baseline, MPEG4AdvancedSimple),
+ FT(MPEG4Simple, MPEG4AdvancedSimple),
+ FT(MPEG4AdvancedSimple, MPEG4Main ),
+ FT(H264ConstrainedBaseline, H264Baseline),
+ FT(H264Baseline, H264Main ), // (Not quite true.)
+ FT(H264Main, H264High ),
+ FT(VC1Simple, VC1Main ),
+ FT(VC1Main, VC1Advanced ),
+#undef FT
+};
+
+static int vaapi_find_next_compatible(VAProfile profile)
+{
+ int i;
+ for(i = 0; i < FF_ARRAY_ELEMS(vaapi_profile_compatibility); i++) {
+ if(vaapi_profile_compatibility[i].from == profile)
+ return vaapi_profile_compatibility[i].to;
+ }
+ return VAProfileNone;
+}
+
+static const struct {
+ enum AVCodecID codec_id;
+ int codec_profile;
+ VAProfile va_profile;
+} vaapi_profile_map[] = {
+#define MAP(c, p, v) { AV_CODEC_ID_ ## c, FF_PROFILE_ ## p, VAProfile ## v }
+ MAP(MPEG2VIDEO, MPEG2_SIMPLE, MPEG2Simple ),
+ MAP(MPEG2VIDEO, MPEG2_MAIN, MPEG2Main ),
+ MAP(H263, UNKNOWN, H263Baseline),
+ MAP(MPEG4, MPEG4_SIMPLE, MPEG4Simple ),
+ MAP(MPEG4, MPEG4_ADVANCED_SIMPLE,
+ MPEG4AdvancedSimple),
+ MAP(MPEG4, MPEG4_MAIN, MPEG4Main ),
+ MAP(H264, H264_CONSTRAINED_BASELINE,
+ H264ConstrainedBaseline),
+ MAP(H264, H264_BASELINE, H264Baseline),
+ MAP(H264, H264_MAIN, H264Main ),
+ MAP(H264, H264_HIGH, H264High ),
+ MAP(HEVC, HEVC_MAIN, HEVCMain ),
+ MAP(WMV3, VC1_SIMPLE, VC1Simple ),
+ MAP(WMV3, VC1_MAIN, VC1Main ),
+ MAP(WMV3, VC1_COMPLEX, VC1Advanced ),
+ MAP(WMV3, VC1_ADVANCED, VC1Advanced ),
+ MAP(VC1, VC1_SIMPLE, VC1Simple ),
+ MAP(VC1, VC1_MAIN, VC1Main ),
+ MAP(VC1, VC1_COMPLEX, VC1Advanced ),
+ MAP(VC1, VC1_ADVANCED, VC1Advanced ),
+ MAP(MJPEG, UNKNOWN, JPEGBaseline),
+ MAP(VP8, UNKNOWN, VP8Version0_3),
+ MAP(VP9, VP9_0, VP9Profile0 ),
+#undef MAP
+};
+
+static VAProfile vaapi_find_profile(const AVCodecContext *avctx)
+{
+ VAProfile result = VAProfileNone;
+ int i;
+ for(i = 0; i < FF_ARRAY_ELEMS(vaapi_profile_map); i++) {
+ if(avctx->codec_id != vaapi_profile_map[i].codec_id)
+ continue;
+ result = vaapi_profile_map[i].va_profile;
+ if(avctx->profile == vaapi_profile_map[i].codec_profile)
+ break;
+ // If there isn't an exact match, we will choose the last (highest)
+ // profile in the mapping table.
+ }
+ return result;
+}
+
+static const struct {
+ enum AVPixelFormat pix_fmt;
+ unsigned int fourcc;
+} vaapi_image_formats[] = {
+ { AV_PIX_FMT_NV12, VA_FOURCC_NV12 },
+ { AV_PIX_FMT_YUV420P, VA_FOURCC_YV12 },
+};
+
+static int vaapi_get_pix_fmt(unsigned int fourcc)
+{
+ int i;
+ for(i = 0; i < FF_ARRAY_ELEMS(vaapi_image_formats); i++)
+ if(vaapi_image_formats[i].fourcc == fourcc)
+ return vaapi_image_formats[i].pix_fmt;
+ return 0;
+}
+
+static int vaapi_build_decoder_config(VAAPIDecoderContext *ctx,
+ AVVAAPIPipelineConfig *config,
+ AVVAAPISurfaceConfig *output,
+ AVCodecContext *avctx)
+{
+ VAStatus vas;
+ int i;
+
+ memset(config, 0, sizeof(*config));
+
+ // Pick codec profile to use.
+ {
+ VAProfile best_profile, profile;
+ int profile_count;
+ VAProfile *profile_list;
+
+ best_profile = vaapi_find_profile(avctx);
+ if(best_profile == VAProfileNone) {
+ av_log(ctx, AV_LOG_ERROR, "VAAPI does not support codec %s.\n",
+ avcodec_get_name(avctx->codec_id));
+ return AVERROR(EINVAL);
+ }
+
+ profile_count = vaMaxNumProfiles(ctx->va_instance.display);
+ profile_list = av_calloc(profile_count, sizeof(VAProfile));
+
+ vas = vaQueryConfigProfiles(ctx->va_instance.display,
+ profile_list, &profile_count);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to query profiles: %d (%s).\n",
+ vas, vaErrorStr(vas));
+ av_free(profile_list);
+ return AVERROR(EINVAL);
+ }
+
+ profile = best_profile;
+ while(profile != VAProfileNone) {
+ for(i = 0; i < profile_count; i++) {
+ if(profile_list[i] == profile)
+ break;
+ }
+ if(i < profile_count)
+ break;
+
+ av_log(ctx, AV_LOG_DEBUG, "Hardware does not support codec "
+ "profile: %s / %d -> VAProfile %d.\n",
+ avcodec_get_name(avctx->codec_id), avctx->profile,
+ profile);
+ profile = vaapi_find_next_compatible(profile);
+ }
+
+ av_free(profile_list);
+
+ if(profile == VAProfileNone) {
+ av_log(ctx, AV_LOG_ERROR, "Hardware does not support codec: "
+ "%s / %d.\n", avcodec_get_name(avctx->codec_id),
+ avctx->profile);
+ return AVERROR(EINVAL);
+ } else if(profile == best_profile) {
+ av_log(ctx, AV_LOG_INFO, "Hardware supports exact codec: "
+ "%s / %d -> VAProfile %d.\n",
+ avcodec_get_name(avctx->codec_id), avctx->profile,
+ profile);
+ } else {
+ av_log(ctx, AV_LOG_INFO, "Hardware supports compatible codec: "
+ "%s / %d -> VAProfile %d.\n",
+ avcodec_get_name(avctx->codec_id), avctx->profile,
+ profile);
+ }
+
+ config->profile = profile;
+ config->entrypoint = VAEntrypointVLD;
+ }
+
+ // Decide on the internal chroma format.
+ {
+ VAConfigAttrib attr;
+
+ // Currently the software only supports YUV420, so just make sure
+ // that the hardware we have does too.
+
+ memset(&attr, 0, sizeof(attr));
+ attr.type = VAConfigAttribRTFormat;
+ vas = vaGetConfigAttributes(ctx->va_instance.display, config->profile,
+ VAEntrypointVLD, &attr, 1);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to fetch config attributes: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR(EINVAL);
+ }
+ if(!(attr.value & VA_RT_FORMAT_YUV420)) {
+ av_log(ctx, AV_LOG_ERROR, "Hardware does not support required "
+ "chroma format (%#x).\n", attr.value);
+ return AVERROR(EINVAL);
+ }
+
+ output->rt_format = VA_RT_FORMAT_YUV420;
+ }
+
+ // Decide on the image format.
+ if(avctx->pix_fmt == AV_PIX_FMT_VAAPI) {
+ // We are going to be passing through a VAAPI surface directly:
+ // they will stay as whatever opaque internal format for that time,
+ // and we never need to make VAImages from them.
+
+ av_log(ctx, AV_LOG_INFO, "Using VAAPI opaque output format.\n");
+
+ output->av_format = AV_PIX_FMT_VAAPI;
+ memset(&output->image_format, 0, sizeof(output->image_format));
+
+ } else {
+ int image_format_count;
+ VAImageFormat *image_format_list;
+ int pix_fmt;
+
+ // We might want to force a change to the output format here
+ // if we are intending to use VADeriveImage?
+
+ image_format_count = vaMaxNumImageFormats(ctx->va_instance.display);
+ image_format_list = av_calloc(image_format_count,
+ sizeof(VAImageFormat));
+
+ vas = vaQueryImageFormats(ctx->va_instance.display, image_format_list,
+ &image_format_count);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to query image formats: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR(EINVAL);
+ }
+
+ for(i = 0; i < image_format_count; i++) {
+ pix_fmt = vaapi_get_pix_fmt(image_format_list[i].fourcc);
+ if(pix_fmt == AV_PIX_FMT_NONE)
+ continue;
+ if(pix_fmt == avctx->pix_fmt)
+ break;
+ }
+ if(i < image_format_count) {
+ av_log(ctx, AV_LOG_INFO, "Using desired output format %s "
+ "(%#x).\n", av_get_pix_fmt_name(pix_fmt),
+ image_format_list[i].fourcc);
+ } else {
+ for(i = 0; i < image_format_count; i++) {
+ pix_fmt = vaapi_get_pix_fmt(image_format_list[i].fourcc);
+ if(pix_fmt != AV_PIX_FMT_NONE)
+ break;
+ }
+ if(i >= image_format_count) {
+ av_log(ctx, AV_LOG_ERROR, "No supported output format found.\n");
+ av_free(image_format_list);
+ return AVERROR(EINVAL);
+ }
+ av_log(ctx, AV_LOG_INFO, "Using alternate output format %s "
+ "(%#x).\n", av_get_pix_fmt_name(pix_fmt),
+ image_format_list[i].fourcc);
+ }
+
+ output->av_format = pix_fmt;
+ memcpy(&output->image_format, &image_format_list[i],
+ sizeof(VAImageFormat));
+
+ av_free(image_format_list);
+ }
+
+ // Decide how many reference frames we need.
+ {
+ // We should be able to do this in a more sensible way by looking
+ // at how many reference frames the input stream requires.
+ output->count = DEFAULT_SURFACES;
+ }
+
+ // Test whether the width and height are within allowable limits.
+ {
+ // Unfortunately, we need an active codec pipeline to do this properly
+ // using vaQuerySurfaceAttributes(). For now, just assume the values
+ // we got passed are ok.
+ output->width = avctx->coded_width;
+ output->height = avctx->coded_height;
+ }
+
+ return 0;
+}
+
+static int vaapi_alloc_decoder_context(VAAPIDecoderContext **ctx_ptr, const char *device)
+{
+ VAAPIDecoderContext *ctx;
+ int err;
+
+ ctx = av_mallocz(sizeof(*ctx));
+ if(!ctx)
+ return AVERROR(ENOMEM);
+
+ ctx->class = &vaapi_class;
+
+ err = av_vaapi_instance_init(&ctx->va_instance, device);
+ if(err)
+ return err;
+
+ *ctx_ptr = ctx;
+ return 0;
+}
+
+static void vaapi_decode_uninit(AVCodecContext *avctx)
+{
+ InputStream *ist = avctx->opaque;
+ VAAPIDecoderContext *ctx = ist->hwaccel_ctx;
+
+ if(ctx->codec_initialised) {
+ av_vaapi_pipeline_uninit(&ctx->codec);
+ ctx->codec_initialised = 0;
+ }
+
+ av_free(ctx);
+
+ ist->hwaccel_ctx = 0;
+ ist->hwaccel_uninit = 0;
+ ist->hwaccel_get_buffer = 0;
+ ist->hwaccel_retrieve_data = 0;
+
+ av_vaapi_instance_uninit(&ctx->va_instance);
+}
+
+int vaapi_decode_init(AVCodecContext *avctx)
+{
+ InputStream *ist = avctx->opaque;
+ VAAPIDecoderContext *ctx;
+ int err;
+
+ if(ist->hwaccel_id != HWACCEL_VAAPI)
+ return AVERROR(EINVAL);
+
+ avctx->hwaccel_context = 0;
+
+ if(ist->hwaccel_ctx) {
+ ctx = ist->hwaccel_ctx;
+ err = av_vaapi_pipeline_uninit(&ctx->codec);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Unable to reinit; failed to uninit "
+ "old codec context: %d (%s).\n", err, av_err2str(err));
+ return err;
+ }
+
+ } else {
+ err = vaapi_alloc_decoder_context(&ctx, ist->hwaccel_device);
+ if(err)
+ return err;
+ }
+
+ av_vaapi_instance_lock(&ctx->va_instance);
+
+ err = vaapi_build_decoder_config(ctx, &ctx->config, &ctx->output, avctx);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "No supported configuration for this codec.");
+ goto fail;
+ }
+
+ err = av_vaapi_pipeline_init(&ctx->codec, &ctx->va_instance,
+ &ctx->config, 0, &ctx->output);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to initialise codec context: "
+ "%d (%s).\n", err, av_err2str(err));
+ goto fail;
+ }
+ ctx->codec_initialised = 1;
+
+ av_vaapi_instance_unlock(&ctx->va_instance);
+
+ av_log(ctx, AV_LOG_DEBUG, "VAAPI decoder (re)init complete.\n");
+
+ ist->hwaccel_ctx = ctx;
+ ist->hwaccel_uninit = vaapi_decode_uninit;
+ ist->hwaccel_get_buffer = vaapi_get_buffer;
+ ist->hwaccel_retrieve_data = vaapi_retrieve_data;
+
+ ctx->hwaccel_context.display = ctx->va_instance.display;
+ ctx->hwaccel_context.config_id = ctx->codec.config_id;
+ ctx->hwaccel_context.context_id = ctx->codec.context_id;
+ avctx->hwaccel_context = &ctx->hwaccel_context;
+
+ return 0;
+
+ fail:
+ av_vaapi_instance_unlock(&ctx->va_instance);
+ vaapi_decode_uninit(avctx);
+ return err;
+}
--
2.6.4
Mark Thompson
2016-01-17 22:47:43 UTC
Permalink
From 108355504c2eaa11101e1599ef7e01f18a2a8ac5 Mon Sep 17 00:00:00 2001
From: Mark Thompson <***@jkqxz.net>
Date: Sun, 17 Jan 2016 22:15:06 +0000
Subject: [PATCH 3/5] libavcodec: add VAAPI H.264 encoder

---
configure | 1 +
libavcodec/Makefile | 1 +
libavcodec/allcodecs.c | 1 +
libavcodec/vaapi_enc_h264.c | 967 ++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 970 insertions(+)
create mode 100644 libavcodec/vaapi_enc_h264.c

diff --git a/configure b/configure
index 1c77015..a31d65e 100755
--- a/configure
+++ b/configure
@@ -2499,6 +2499,7 @@ h264_mmal_encoder_deps="mmal"
h264_qsv_hwaccel_deps="libmfx"
h264_vaapi_hwaccel_deps="vaapi"
h264_vaapi_hwaccel_select="h264_decoder"
+h264_vaapi_encoder_deps="vaapi"
h264_vda_decoder_deps="vda"
h264_vda_decoder_select="h264_decoder"
h264_vda_hwaccel_deps="vda"
diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index b9ffdb9..06b3c48 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -303,6 +303,7 @@ OBJS-$(CONFIG_H264_MMAL_DECODER) += mmaldec.o
OBJS-$(CONFIG_H264_VDA_DECODER) += vda_h264_dec.o
OBJS-$(CONFIG_H264_QSV_DECODER) += qsvdec_h2645.o
OBJS-$(CONFIG_H264_QSV_ENCODER) += qsvenc_h264.o
+OBJS-$(CONFIG_H264_VAAPI_ENCODER) += vaapi_enc_h264.o
OBJS-$(CONFIG_HAP_DECODER) += hapdec.o hap.o
OBJS-$(CONFIG_HAP_ENCODER) += hapenc.o hap.o
OBJS-$(CONFIG_HEVC_DECODER) += hevc.o hevc_mvs.o hevc_ps.o hevc_sei.o \
diff --git a/libavcodec/allcodecs.c b/libavcodec/allcodecs.c
index 2128546..0d07087 100644
--- a/libavcodec/allcodecs.c
+++ b/libavcodec/allcodecs.c
@@ -199,6 +199,7 @@ void avcodec_register_all(void)
#if FF_API_VDPAU
REGISTER_DECODER(H264_VDPAU, h264_vdpau);
#endif
+ REGISTER_ENCODER(H264_VAAPI, h264_vaapi);
REGISTER_ENCDEC (HAP, hap);
REGISTER_DECODER(HEVC, hevc);
REGISTER_DECODER(HEVC_QSV, hevc_qsv);
diff --git a/libavcodec/vaapi_enc_h264.c b/libavcodec/vaapi_enc_h264.c
new file mode 100644
index 0000000..0ed5dde
--- /dev/null
+++ b/libavcodec/vaapi_enc_h264.c
@@ -0,0 +1,967 @@
+/*
+ * VAAPI H.264 encoder.
+ *
+ * Copyright (C) 2016 Mark Thompson <***@jkqxz.net>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "avcodec.h"
+#include "golomb.h"
+#include "put_bits.h"
+
+#include "h264.h"
+
+#include "libavutil/opt.h"
+#include "libavutil/pixdesc.h"
+#include "libavutil/vaapi.h"
+
+#define DPB_FRAMES 16
+#define INPUT_FRAMES 2
+
+typedef struct VAAPIH264EncodeFrame {
+ AVFrame avframe;
+ VASurfaceID surface_id;
+
+ int frame_num;
+ enum {
+ FRAME_TYPE_I,
+ FRAME_TYPE_P,
+ FRAME_TYPE_B,
+ } type;
+
+ VAPictureH264 pic;
+ VAEncSliceParameterBufferH264 params;
+ VABufferID params_id;
+
+ VABufferID coded_data_id;
+
+ struct VAAPIH264EncodeFrame *refp, *refb;
+} VAAPIH264EncodeFrame;
+
+typedef struct VAAPIH264EncodeContext {
+ const AVClass *class;
+
+ AVVAAPIInstance va_instance;
+ AVVAAPIPipelineConfig va_config;
+ AVVAAPIPipelineContext va_codec;
+
+ AVVAAPISurfaceConfig input_config;
+ AVVAAPISurfaceConfig output_config;
+
+ VAProfile va_profile;
+ int level;
+ int rc_mode;
+ int width;
+ int height;
+
+ VAEncSequenceParameterBufferH264 seq_params;
+ VABufferID seq_params_id;
+
+ VAEncMiscParameterRateControl rc_params;
+ VAEncMiscParameterBuffer rc_params_buffer;
+ VABufferID rc_params_id;
+
+ VAEncPictureParameterBufferH264 pic_params;
+ VABufferID pic_params_id;
+
+ int frame_num;
+
+ VAAPIH264EncodeFrame dpb[DPB_FRAMES];
+ int current_frame;
+ int previous_frame;
+
+ struct {
+ const char *profile;
+ const char *level;
+ int qp;
+ int idr_interval;
+ } options;
+
+} VAAPIH264EncodeContext;
+
+
+static int vaapi_h264_render_packed_header(VAAPIH264EncodeContext *ctx, int type,
+ char *data, size_t bit_len)
+{
+ VAStatus vas;
+ VABufferID id_list[2];
+ VAEncPackedHeaderParameterBuffer buffer = {
+ .type = type,
+ .bit_length = bit_len,
+ .has_emulation_bytes = 0,
+ };
+
+ vas = vaCreateBuffer(ctx->va_instance.display, ctx->va_codec.context_id,
+ VAEncPackedHeaderParameterBufferType,
+ sizeof(&buffer), 1, &buffer, &id_list[0]);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create parameter buffer for packed "
+ "header (type %d): %d (%s).\n", type, vas, vaErrorStr(vas));
+ return -1;
+ }
+
+ vas = vaCreateBuffer(ctx->va_instance.display, ctx->va_codec.context_id,
+ VAEncPackedHeaderDataBufferType,
+ (bit_len + 7) / 8, 1, data, &id_list[1]);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create data buffer for packed "
+ "header (type %d): %d (%s).\n", type, vas, vaErrorStr(vas));
+ return -1;
+ }
+
+ vas = vaRenderPicture(ctx->va_instance.display, ctx->va_codec.context_id,
+ id_list, 2);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to render packed "
+ "header (type %d): %d (%s).\n", type, vas, vaErrorStr(vas));
+ return -1;
+ }
+
+ return 0;
+}
+
+static void vaapi_h264_write_nal_header(PutBitContext *b, int ref, int type)
+{
+ // zero_byte
+ put_bits(b, 8, 0);
+ // start_code_prefix_one_3bytes
+ put_bits(b, 24, 1);
+ // forbidden_zero_bit
+ put_bits(b, 1, 0);
+ // nal_ref_idc
+ put_bits(b, 2, ref);
+ // nal_unit_type
+ put_bits(b, 5, type);
+}
+
+static void vaapi_h264_write_trailing_rbsp(PutBitContext *b)
+{
+ // rbsp_stop_one_bit
+ put_bits(b, 1, 1);
+ while(put_bits_count(b) & 7) {
+ // rbsp_alignment_zero_bit
+ put_bits(b, 1, 0);
+ }
+}
+
+static int vaapi_h264_render_packed_sps(VAAPIH264EncodeContext *ctx)
+{
+ PutBitContext b;
+ char tmp[256];
+ size_t len;
+
+ init_put_bits(&b, tmp, sizeof(tmp));
+
+ vaapi_h264_write_nal_header(&b, 3, NAL_SPS);
+
+ // profile_idc
+ put_bits(&b, 8, 66);
+ // constraint_set0_flag
+ put_bits(&b, 1, 0);
+ // constraint_set1_flag
+ put_bits(&b, 1, ctx->va_profile == VAProfileH264ConstrainedBaseline);
+ // constraint_set2_flag
+ put_bits(&b, 1, 0);
+ // constraint_set3_flag
+ put_bits(&b, 1, 0);
+ // constraint_set4_flag
+ put_bits(&b, 1, 0);
+ // constraint_set5_flag
+ put_bits(&b, 1, 0);
+ // reserved_zero_2bits
+ put_bits(&b, 2, 0);
+ // level_idc
+ put_bits(&b, 8, 52);
+ // seq_parameter_set_id
+ set_ue_golomb(&b, 0);
+
+ if(0) {
+ // chroma_format_idc
+ set_ue_golomb(&b, 1);
+ // bit_depth_luma_minus8
+ set_ue_golomb(&b, 0);
+ // bit_depth_chroma_minus8
+ set_ue_golomb(&b, 0);
+ // qpprime_y_zero_transform_bypass_flag
+ put_bits(&b, 1, 0);
+ // seq_scaling_matrix_present_flag
+ put_bits(&b, 1, 0);
+ }
+
+ // log2_max_frame_num_minus4
+ set_ue_golomb(&b, 4);
+ // pic_order_cnt_type
+ set_ue_golomb(&b, 2);
+
+ // max_num_ref_frames
+ set_ue_golomb(&b, 1);
+ // gaps_in_frame_num_value_allowed_flag
+ put_bits(&b, 1, 0);
+ // pic_width_in_mbs_minus1
+ set_ue_golomb(&b, (ctx->width + 15) / 16 - 1);
+ // pic_height_in_map_units_minus1
+ set_ue_golomb(&b, (ctx->height + 15) / 16 - 1);
+ // frame_mbs_oly_flag
+ put_bits(&b, 1, 1);
+
+ // direct_8x8_inference_flag
+ put_bits(&b, 1, 1);
+ // frame_cropping_flag
+ put_bits(&b, 1, 0);
+
+ // vui_parameters_present_flag
+ put_bits(&b, 1, 0);
+
+ vaapi_h264_write_trailing_rbsp(&b);
+
+ len = put_bits_count(&b);
+ flush_put_bits(&b);
+
+ return vaapi_h264_render_packed_header(ctx, VAEncPackedHeaderSequence,
+ tmp, len);
+}
+
+static int vaapi_h264_render_packed_pps(VAAPIH264EncodeContext *ctx)
+{
+ PutBitContext b;
+ char tmp[256];
+ size_t len;
+
+ init_put_bits(&b, tmp, sizeof(tmp));
+
+ vaapi_h264_write_nal_header(&b, 3, NAL_PPS);
+
+ // seq_parameter_set_id
+ set_ue_golomb(&b, 0);
+ // pic_parameter_set_id
+ set_ue_golomb(&b, 0);
+ // entropy_coding_mode_flag
+ put_bits(&b, 1, 1);
+ // bottom_field_pic_order_in_frame_present_flag
+ put_bits(&b, 1, 0);
+ // num_slice_groups_minus1
+ set_ue_golomb(&b, 0);
+
+ // num_ref_idx_l0_default_active_minus1
+ set_ue_golomb(&b, 0);
+ // num_ref_idx_l1_default_active_minus1
+ set_ue_golomb(&b, 0);
+ // weighted_pred_flag
+ put_bits(&b, 1, 0);
+ // weighted_bipred_idc
+ put_bits(&b, 2, 0);
+ // pic_init_qp_minus26
+ set_se_golomb(&b, ctx->options.qp - 26);
+ // pic_init_qs_minus26
+ set_se_golomb(&b, 0);
+ // chroma_qp_index_offset
+ set_se_golomb(&b, 0);
+ // deblocking_filter_control_present_flag
+ put_bits(&b, 1, 1);
+ // constrained_intra_pred_flag
+ put_bits(&b, 1, 0);
+ // redundant_pic_cnt_present_flag
+ put_bits(&b, 1, 0);
+
+ // transform_8x8_mode_flag
+ put_bits(&b, 1, 0);
+ // pic_scaling_matrix_present_flag
+ put_bits(&b, 1, 0);
+ // second_chroma_qp_index_offset
+ set_se_golomb(&b, 0);
+
+ vaapi_h264_write_trailing_rbsp(&b);
+
+ len = put_bits_count(&b);
+ flush_put_bits(&b);
+
+ return vaapi_h264_render_packed_header(ctx, VAEncPackedHeaderPicture,
+ tmp, len);
+}
+
+static int vaapi_h264_render_packed_slice(VAAPIH264EncodeContext *ctx,
+ VAAPIH264EncodeFrame *current)
+{
+ PutBitContext b;
+ char tmp[256];
+ size_t len;
+
+ init_put_bits(&b, tmp, sizeof(tmp));
+
+ if(current->type == FRAME_TYPE_I)
+ vaapi_h264_write_nal_header(&b, 3, NAL_IDR_SLICE);
+ else
+ vaapi_h264_write_nal_header(&b, 3, NAL_SLICE);
+
+ // first_mb_in_slice
+ set_ue_golomb(&b, 0);
+ // slice_type
+ set_ue_golomb(&b, (current->type == FRAME_TYPE_I ? 2 :
+ current->type == FRAME_TYPE_P ? 0 : 1));
+ // pic_parameter_set_id
+ set_ue_golomb(&b, 0);
+
+ // frame_num
+ put_bits(&b, 8, current->frame_num);
+
+ if(current->type == FRAME_TYPE_I) {
+ // idr_pic_id
+ set_ue_golomb(&b, 0);
+ }
+
+ // pic_order_cnt stuff
+
+ if(current->type == FRAME_TYPE_B) {
+ // direct_spatial_mv_pred_flag
+ put_bits(&b, 1, 1);
+ }
+
+ if(current->type == FRAME_TYPE_P || current->type == FRAME_TYPE_B) {
+ // num_ref_idx_active_override_flag
+ put_bits(&b, 1, 0);
+ if(0) {
+ // num_ref_idx_l0_active_minus1
+ if(current->type == FRAME_TYPE_B) {
+ // num_ref_idx_l1_active_minus1
+ }
+ }
+
+ // ref_pic_list_modification_flag_l0
+ put_bits(&b, 1, 0);
+
+ if(current->type == FRAME_TYPE_B) {
+ // ref_pic_list_modification_flag_l1
+ put_bits(&b, 1, 0);
+ }
+ }
+
+ if(1) {
+ // dec_ref_pic_marking
+ if(current->type == FRAME_TYPE_I) {
+ // no_output_of_prior_pics_flag
+ put_bits(&b, 1, 0);
+ // long_term_reference_flag
+ put_bits(&b, 1, 0);
+ } else {
+ // adaptive_pic_ref_marking_mode_flag
+ put_bits(&b, 1, 0);
+ }
+ }
+
+ if(current->type != FRAME_TYPE_I) {
+ // cabac_init_idc
+ set_ue_golomb(&b, 0);
+ }
+
+ // slice_qp_delta
+ set_se_golomb(&b, 0);
+
+ if(1) {
+ // disable_deblocking_filter_idc
+ set_ue_golomb(&b, 0);
+ // slice_alpha_c0_offset_div2
+ set_se_golomb(&b, 0);
+ // slice_beta_offset_div2
+ set_se_golomb(&b, 0);
+ }
+
+ len = put_bits_count(&b);
+ flush_put_bits(&b);
+
+ return vaapi_h264_render_packed_header(ctx, VAEncPackedHeaderSlice,
+ tmp, len);
+}
+
+static int vaapi_h264_render_sequence(VAAPIH264EncodeContext *ctx)
+{
+ VAStatus vas;
+ VAEncSequenceParameterBufferH264 *seq = &ctx->seq_params;
+
+ {
+ memset(seq, 0, sizeof(*seq));
+
+ seq->level_idc = 52;
+ seq->picture_width_in_mbs = (ctx->width + 15) / 16;
+ seq->picture_height_in_mbs = (ctx->height + 15) / 16;
+
+ seq->intra_period = 0;
+ seq->intra_idr_period = 0;
+ seq->ip_period = 1;
+
+ seq->max_num_ref_frames = 2;
+ seq->time_scale = 900;
+ seq->num_units_in_tick = 15;
+ seq->seq_fields.bits.log2_max_pic_order_cnt_lsb_minus4 = 4;
+ seq->seq_fields.bits.log2_max_frame_num_minus4 = 4;
+ seq->seq_fields.bits.frame_mbs_only_flag = 1;
+ seq->seq_fields.bits.chroma_format_idc = 1;
+ seq->seq_fields.bits.direct_8x8_inference_flag = 1;
+ seq->seq_fields.bits.pic_order_cnt_type = 2;
+
+ seq->frame_cropping_flag = 1;
+ seq->frame_crop_left_offset = 0;
+ seq->frame_crop_right_offset = 0;
+ seq->frame_crop_top_offset = 0;
+ seq->frame_crop_bottom_offset = 8;
+ }
+
+ vas = vaCreateBuffer(ctx->va_instance.display, ctx->va_codec.context_id,
+ VAEncSequenceParameterBufferType,
+ sizeof(*seq), 1, seq, &ctx->seq_params_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create buffer for sequence "
+ "parameters: %d (%s).\n", vas, vaErrorStr(vas));
+ return -1;
+ }
+ av_log(ctx, AV_LOG_DEBUG, "Sequence parameter buffer is %#x.\n",
+ ctx->seq_params_id);
+
+ vas = vaRenderPicture(ctx->va_instance.display, ctx->va_codec.context_id,
+ &ctx->seq_params_id, 1);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to send sequence parameters: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return -1;
+ }
+
+ return 0;
+}
+
+static int vaapi_h264_render_picture(VAAPIH264EncodeContext *ctx,
+ VAAPIH264EncodeFrame *current)
+{
+ VAStatus vas;
+ VAEncPictureParameterBufferH264 *pic = &ctx->pic_params;
+ int i;
+
+ memset(pic, 0, sizeof(*pic));
+ memcpy(&pic->CurrPic, &current->pic, sizeof(VAPictureH264));
+ for(i = 0; i < FF_ARRAY_ELEMS(pic->ReferenceFrames); i++) {
+ pic->ReferenceFrames[i].picture_id = VA_INVALID_ID;
+ pic->ReferenceFrames[i].flags = VA_PICTURE_H264_INVALID;
+ }
+ if(current->type == FRAME_TYPE_P || current->type == FRAME_TYPE_B)
+ memcpy(&pic->ReferenceFrames[0], &current->refp->pic,
+ sizeof(VAPictureH264));
+ if(current->type == FRAME_TYPE_B)
+ memcpy(&pic->ReferenceFrames[1], &current->refb->pic,
+ sizeof(VAPictureH264));
+
+ pic->pic_fields.bits.idr_pic_flag = (current->type == FRAME_TYPE_I);
+ pic->pic_fields.bits.reference_pic_flag = 1;
+ pic->pic_fields.bits.entropy_coding_mode_flag = 1;
+ pic->pic_fields.bits.deblocking_filter_control_present_flag = 1;
+
+ pic->frame_num = current->frame_num;
+ pic->last_picture = 0;
+ pic->pic_init_qp = ctx->options.qp;
+
+ pic->coded_buf = current->coded_data_id;
+
+ vas = vaCreateBuffer(ctx->va_instance.display, ctx->va_codec.context_id,
+ VAEncPictureParameterBufferType,
+ sizeof(*pic), 1, pic, &ctx->pic_params_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create buffer for picture "
+ "parameters: %d (%s).\n", vas, vaErrorStr(vas));
+ return -1;
+ }
+ av_log(ctx, AV_LOG_DEBUG, "Picture parameter buffer is %#x.\n",
+ ctx->pic_params_id);
+
+ vas = vaRenderPicture(ctx->va_instance.display, ctx->va_codec.context_id,
+ &ctx->pic_params_id, 1);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to send picture parameters: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return -1;
+ }
+
+ return 0;
+}
+
+static int vaapi_h264_render_slice(VAAPIH264EncodeContext *ctx,
+ VAAPIH264EncodeFrame *current)
+{
+ VAStatus vas;
+ VAEncSliceParameterBufferH264 *slice = &current->params;
+ int i;
+
+ {
+ memset(slice, 0, sizeof(*slice));
+
+ slice->slice_type = (current->type == FRAME_TYPE_I ? 2 :
+ current->type == FRAME_TYPE_P ? 0 : 1);
+ slice->idr_pic_id = 0;
+
+ slice->macroblock_address = 0;
+ slice->num_macroblocks = (ctx->seq_params.picture_width_in_mbs *
+ ctx->seq_params.picture_height_in_mbs);
+ slice->macroblock_info = VA_INVALID_ID;
+
+ for(i = 0; i < FF_ARRAY_ELEMS(slice->RefPicList0); i++) {
+ slice->RefPicList0[i].picture_id = VA_INVALID_SURFACE;
+ slice->RefPicList0[i].flags = VA_PICTURE_H264_INVALID;
+ }
+ for(i = 0; i < FF_ARRAY_ELEMS(slice->RefPicList1); i++) {
+ slice->RefPicList1[i].picture_id = VA_INVALID_SURFACE;
+ slice->RefPicList1[i].flags = VA_PICTURE_H264_INVALID;
+ }
+
+ if(current->refp) {
+ av_log(ctx, AV_LOG_DEBUG, "Using %#x as first reference frame.\n",
+ current->refp->pic.picture_id);
+ slice->RefPicList0[0].picture_id = current->refp->pic.picture_id;
+ slice->RefPicList0[0].flags = VA_PICTURE_H264_SHORT_TERM_REFERENCE;
+ }
+ if(current->refb) {
+ av_log(ctx, AV_LOG_DEBUG, "Using %#x as second reference frame.\n",
+ current->refb->pic.picture_id);
+ slice->RefPicList0[1].picture_id = current->refb->pic.picture_id;
+ slice->RefPicList0[1].flags = VA_PICTURE_H264_SHORT_TERM_REFERENCE;
+ }
+
+ slice->slice_qp_delta = 0;
+ slice->slice_alpha_c0_offset_div2 = 0;
+ slice->slice_beta_offset_div2 = 0;
+ slice->direct_spatial_mv_pred_flag = 1;
+ }
+
+ vas = vaCreateBuffer(ctx->va_instance.display, ctx->va_codec.context_id,
+ VAEncSliceParameterBufferType,
+ sizeof(*slice), 1, slice, &current->params_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create buffer for slice "
+ "parameters: %d (%s).\n", vas, vaErrorStr(vas));
+ return -1;
+ }
+ av_log(ctx, AV_LOG_DEBUG, "Slice buffer is %#x.\n", current->params_id);
+
+ vas = vaRenderPicture(ctx->va_instance.display, ctx->va_codec.context_id,
+ &current->params_id, 1);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to send slice parameters: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return -1;
+ }
+
+ return 0;
+}
+
+static int vaapi_h264_encode_picture(AVCodecContext *avctx, AVPacket *pkt,
+ const AVFrame *pic, int *got_packet)
+{
+ VAAPIH264EncodeContext *ctx = avctx->priv_data;
+ AVVAAPISurface *input, *recon;
+ VAAPIH264EncodeFrame *current;
+ AVFrame *input_image, *recon_image;
+ VACodedBufferSegment *buf_list, *buf;
+ VAStatus vas;
+ int err;
+
+ av_log(ctx, AV_LOG_DEBUG, "New frame: format %s, size %ux%u.\n",
+ av_get_pix_fmt_name(pic->format), pic->width, pic->height);
+
+ av_vaapi_instance_lock(&ctx->va_instance);
+
+ if(pic->format == AV_PIX_FMT_VAAPI) {
+ input_image = 0;
+ input = (AVVAAPISurface*)pic->buf[0]->data;
+
+ } else {
+ input_image = av_frame_alloc();
+
+ err = av_vaapi_get_input_surface(&ctx->va_codec, input_image);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to allocate surface to "
+ "copy input frame: %d (%s).\n", err, av_err2str(err));
+ goto fail;
+ }
+
+ input = (AVVAAPISurface*)input_image->buf[0]->data;
+
+ err = av_vaapi_map_surface(input, 0);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to map input surface: "
+ "%d (%s).\n", err, av_err2str(err));
+ goto fail;
+ }
+
+ err = av_vaapi_copy_to_surface(pic, input);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to copy to input surface: "
+ "%d (%s).\n", err, av_err2str(err));
+ goto fail;
+ }
+
+ err = av_vaapi_unmap_surface(input, 1);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to unmap input surface: "
+ "%d (%s).\n", err, av_err2str(err));
+ goto fail;
+ }
+ }
+ av_log(ctx, AV_LOG_DEBUG, "Using surface %#x for input image.\n",
+ input->id);
+
+ recon_image = av_frame_alloc();
+
+ err = av_vaapi_get_output_surface(&ctx->va_codec, recon_image);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to allocate surface for "
+ "reconstructed frame: %d (%s).\n", err, av_err2str(err));
+ goto fail;
+ }
+ recon = (AVVAAPISurface*)recon_image->buf[0]->data;
+ av_log(ctx, AV_LOG_DEBUG, "Using surface %#x for reconstructed image.\n",
+ recon->id);
+
+ if(ctx->previous_frame != ctx->current_frame) {
+ av_frame_unref(&ctx->dpb[ctx->previous_frame].avframe);
+ }
+
+ ctx->previous_frame = ctx->current_frame;
+ ctx->current_frame = (ctx->current_frame + 1) % DPB_FRAMES;
+ {
+ current = &ctx->dpb[ctx->current_frame];
+
+ if(ctx->frame_num < 0 ||
+ ctx->frame_num == ctx->options.idr_interval)
+ current->type = FRAME_TYPE_I;
+ else
+ current->type = FRAME_TYPE_P;
+
+ if(current->type == FRAME_TYPE_I)
+ ctx->frame_num = 0;
+ else
+ ++ctx->frame_num;
+ current->frame_num = ctx->frame_num;
+
+ if(current->type == FRAME_TYPE_I) {
+ current->refp = 0;
+ current->refb = 0;
+ } else if(current->type == FRAME_TYPE_P) {
+ current->refp = &ctx->dpb[ctx->previous_frame];
+ current->refb = 0;
+ } else {
+ av_assert0(0);
+ }
+
+ memset(&current->pic, 0, sizeof(VAPictureH264));
+ current->pic.picture_id = recon->id;
+ current->pic.frame_idx = ctx->frame_num;
+
+ memcpy(&current->avframe, recon_image, sizeof(AVFrame));
+ }
+ av_log(ctx, AV_LOG_DEBUG, "Encoding as frame as %s (%d).\n",
+ current->type == FRAME_TYPE_I ? "I" :
+ current->type == FRAME_TYPE_P ? "P" : "B", ctx->frame_num);
+
+ vas = vaBeginPicture(ctx->va_instance.display, ctx->va_codec.context_id,
+ input->id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to attach new picture: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ err = AVERROR_EXTERNAL;
+ goto fail;
+ }
+
+ if(current->type == FRAME_TYPE_I) {
+ err = vaapi_h264_render_sequence(ctx);
+ if(err) goto fail;
+ }
+
+ err = vaapi_h264_render_picture(ctx, current);
+ if(err) goto fail;
+
+ if(current->type == FRAME_TYPE_I) {
+ err = vaapi_h264_render_packed_sps(ctx);
+ if(err) goto fail;
+
+ err = vaapi_h264_render_packed_pps(ctx);
+ if(err) goto fail;
+ }
+
+ err = vaapi_h264_render_packed_slice(ctx, current);
+ if(err) goto fail;
+
+ err = vaapi_h264_render_slice(ctx, current);
+ if(err) goto fail;
+
+ vas = vaEndPicture(ctx->va_instance.display, ctx->va_codec.context_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to start picture processing: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ err = AVERROR_EXTERNAL;
+ goto fail;
+ }
+
+ vas = vaSyncSurface(ctx->va_instance.display, input->id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to sync to picture completion: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ err = AVERROR_EXTERNAL;
+ goto fail;
+ }
+
+ buf_list = 0;
+ vas = vaMapBuffer(ctx->va_instance.display, current->coded_data_id,
+ (void**)&buf_list);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to map output buffers: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ err = AVERROR_EXTERNAL;
+ goto fail;
+ }
+
+ for(buf = buf_list; buf; buf = buf->next) {
+ av_log(ctx, AV_LOG_DEBUG, "Output buffer: %u bytes.\n", buf->size);
+ err = av_new_packet(pkt, buf->size);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to make output buffer "
+ "(%u bytes).\n", buf->size);
+ goto fail;
+ }
+
+ memcpy(pkt->data, buf->buf, buf->size);
+
+ if(current->type == FRAME_TYPE_I)
+ pkt->flags |= AV_PKT_FLAG_KEY;
+
+ *got_packet = 1;
+ }
+
+ vas = vaUnmapBuffer(ctx->va_instance.display, current->coded_data_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to unmap output buffers: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ err = AVERROR_EXTERNAL;
+ goto fail;
+ }
+
+ if(pic->format != AV_PIX_FMT_VAAPI)
+ av_frame_free(&input_image);
+
+ err = 0;
+ fail:
+ av_vaapi_instance_unlock(&ctx->va_instance);
+ return err;
+}
+
+static VAConfigAttrib config_attributes[] = {
+ { .type = VAConfigAttribRTFormat,
+ .value = VA_RT_FORMAT_YUV420 },
+ { .type = VAConfigAttribRateControl,
+ .value = VA_RC_CQP },
+ { .type = VAConfigAttribEncPackedHeaders,
+ .value = 0 },
+};
+
+static av_cold int vaapi_h264_encode_init(AVCodecContext *avctx)
+{
+ VAAPIH264EncodeContext *ctx = avctx->priv_data;
+ VAStatus vas;
+ int i, err;
+
+ if(strcmp(ctx->options.profile, "constrained_baseline"))
+ ctx->va_profile = VAProfileH264ConstrainedBaseline;
+ else if(strcmp(ctx->options.profile, "baseline"))
+ ctx->va_profile = VAProfileH264Baseline;
+ else if(strcmp(ctx->options.profile, "main"))
+ ctx->va_profile = VAProfileH264Main;
+ else if(strcmp(ctx->options.profile, "high"))
+ ctx->va_profile = VAProfileH264High;
+ else {
+ av_log(ctx, AV_LOG_ERROR, "Invalid profile '%s'.\n",
+ ctx->options.profile);
+ return AVERROR(EINVAL);
+ }
+
+ ctx->level = -1;
+ if(sscanf(ctx->options.level, "%d", &ctx->level) <= 0 ||
+ ctx->level < 0 || ctx->level > 52) {
+ av_log(ctx, AV_LOG_ERROR, "Invaid level '%s'.\n", ctx->options.level);
+ return AVERROR(EINVAL);
+ }
+
+ if(ctx->options.qp >= 0) {
+ ctx->rc_mode = VA_RC_CQP;
+ } else {
+ // Default to CQP 26.
+ ctx->rc_mode = VA_RC_CQP;
+ ctx->options.qp = 26;
+ }
+ av_log(ctx, AV_LOG_INFO, "Using constant-QP mode at %d.\n",
+ ctx->options.qp);
+
+ err = av_vaapi_instance_init(&ctx->va_instance, 0);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "No VAAPI instance.\n");
+ return err;
+ }
+
+ ctx->width = avctx->width;
+ ctx->height = avctx->height;
+
+ ctx->frame_num = -1;
+
+ {
+ AVVAAPIPipelineConfig *config = &ctx->va_config;
+
+ config->profile = ctx->va_profile;
+ config->entrypoint = VAEntrypointEncSlice;
+
+ config->attribute_count = FF_ARRAY_ELEMS(config_attributes);
+ config->attributes = config_attributes;
+ }
+
+ {
+ AVVAAPISurfaceConfig *config = &ctx->output_config;
+
+ config->rt_format = VA_RT_FORMAT_YUV420;
+ config->av_format = AV_PIX_FMT_VAAPI;
+
+ config->image_format.fourcc = VA_FOURCC_NV12;
+ config->image_format.bits_per_pixel = 12;
+
+ config->count = DPB_FRAMES;
+ config->width = ctx->width;
+ config->height = ctx->height;
+
+ config->attribute_count = 0;
+ }
+
+ {
+ AVVAAPISurfaceConfig *config = &ctx->input_config;
+
+ config->rt_format = VA_RT_FORMAT_YUV420;
+ config->rt_format = VA_RT_FORMAT_YUV420;
+ config->av_format = AV_PIX_FMT_VAAPI;
+
+ config->image_format.fourcc = VA_FOURCC_NV12;
+ config->image_format.bits_per_pixel = 12;
+
+ config->count = INPUT_FRAMES;
+ config->width = ctx->width;
+ config->height = ctx->height;
+
+ config->attribute_count = 0;
+ }
+
+ av_vaapi_instance_lock(&ctx->va_instance);
+
+ err = av_vaapi_pipeline_init(&ctx->va_codec, &ctx->va_instance,
+ &ctx->va_config,
+ &ctx->input_config, &ctx->output_config);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create codec: %d (%s).\n",
+ err, av_err2str(err));
+ goto fail;
+ }
+
+ for(i = 0; i < DPB_FRAMES; i++) {
+ vas = vaCreateBuffer(ctx->va_instance.display,
+ ctx->va_codec.context_id,
+ VAEncCodedBufferType,
+ 1048576, 1, 0, &ctx->dpb[i].coded_data_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create buffer for "
+ "coded data: %d (%s).\n", vas, vaErrorStr(vas));
+ err = AVERROR_EXTERNAL;
+ goto fail;
+ }
+ av_log(ctx, AV_LOG_TRACE, "Coded data buffer %d is %#x.\n",
+ i, ctx->dpb[i].coded_data_id);
+ }
+
+ av_vaapi_instance_unlock(&ctx->va_instance);
+
+ av_log(ctx, AV_LOG_INFO, "Started VAAPI H.264 encoder.\n");
+ return 0;
+
+ fail:
+ av_vaapi_instance_unlock(&ctx->va_instance);
+ return err;
+}
+
+static av_cold int vaapi_h264_encode_close(AVCodecContext *avctx)
+{
+ VAAPIH264EncodeContext *ctx = avctx->priv_data;
+ int err;
+
+ av_vaapi_instance_lock(&ctx->va_instance);
+
+ err = av_vaapi_pipeline_uninit(&ctx->va_codec);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to destroy codec: %d (%s).\n",
+ err, av_err2str(err));
+ }
+
+ av_vaapi_instance_unlock(&ctx->va_instance);
+
+ err = av_vaapi_instance_uninit(&ctx->va_instance);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to uninitialised VAAPI "
+ "instance: %d (%s).\n",
+ err, av_err2str(err));
+ }
+
+ return 0;
+}
+
+#define OFFSET(member) offsetof(VAAPIH264EncodeContext, options.member)
+#define FLAGS (AV_OPT_FLAG_VIDEO_PARAM | AV_OPT_FLAG_ENCODING_PARAM)
+static const AVOption vaapi_h264_options[] = {
+ { "profile", "Set H.264 profile",
+ OFFSET(profile), AV_OPT_TYPE_STRING,
+ { .str = "baseline" }, 0, 0, FLAGS },
+ { "level", "Set H.264 level",
+ OFFSET(level), AV_OPT_TYPE_STRING,
+ { .str = "52" }, 0, 0, FLAGS },
+ { "qp", "Use constant quantisation parameter",
+ OFFSET(qp), AV_OPT_TYPE_INT,
+ { .i64 = -1 }, -1, 52, FLAGS },
+ { "idr_interval", "Number of frames between IDR frames (0 = all intra)",
+ OFFSET(idr_interval), AV_OPT_TYPE_INT,
+ { .i64 = -1 }, -1, INT_MAX, FLAGS },
+ { 0 }
+};
+
+static const AVClass vaapi_h264_class = {
+ .class_name = "VAAPI/H.264",
+ .item_name = av_default_item_name,
+ .option = vaapi_h264_options,
+ .version = LIBAVUTIL_VERSION_INT,
+};
+
+AVCodec ff_h264_vaapi_encoder = {
+ .name = "vaapi_h264",
+ .long_name = NULL_IF_CONFIG_SMALL("H.264 (VAAPI)"),
+ .type = AVMEDIA_TYPE_VIDEO,
+ .id = AV_CODEC_ID_H264,
+ .priv_data_size = sizeof(VAAPIH264EncodeContext),
+ .init = &vaapi_h264_encode_init,
+ .encode2 = &vaapi_h264_encode_picture,
+ .close = &vaapi_h264_encode_close,
+ .priv_class = &vaapi_h264_class,
+ .pix_fmts = (const enum AVPixelFormat[]) {
+ AV_PIX_FMT_VAAPI,
+ AV_PIX_FMT_NV12,
+ AV_PIX_FMT_NONE,
+ },
+};
--
2.6.4
Mark Thompson
2016-01-17 22:48:30 UTC
Permalink
From 7f4e62e0682786681fef1c6f52429fea6443a843 Mon Sep 17 00:00:00 2001
From: Mark Thompson <***@jkqxz.net>
Date: Sun, 17 Jan 2016 22:15:49 +0000
Subject: [PATCH 4/5] libavcodec: add VAAPI H.265 encoder

---
configure | 1 +
libavcodec/Makefile | 1 +
libavcodec/allcodecs.c | 1 +
libavcodec/vaapi_enc_hevc.c | 1626 +++++++++++++++++++++++++++++++++++++++++++
4 files changed, 1629 insertions(+)
create mode 100644 libavcodec/vaapi_enc_hevc.c

diff --git a/configure b/configure
index a31d65e..9da8e8b 100755
--- a/configure
+++ b/configure
@@ -2519,6 +2519,7 @@ hevc_dxva2_hwaccel_select="hevc_decoder"
hevc_qsv_hwaccel_deps="libmfx"
hevc_vaapi_hwaccel_deps="vaapi VAPictureParameterBufferHEVC"
hevc_vaapi_hwaccel_select="hevc_decoder"
+hevc_vaapi_encoder_deps="vaapi"
hevc_vdpau_hwaccel_deps="vdpau VdpPictureInfoHEVC"
hevc_vdpau_hwaccel_select="hevc_decoder"
mpeg_vdpau_decoder_deps="vdpau"
diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index 06b3c48..a5e1cab 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -311,6 +311,7 @@ OBJS-$(CONFIG_HEVC_DECODER) += hevc.o hevc_mvs.o hevc_ps.o hevc_sei.o
hevcdsp.o hevc_filter.o hevc_parse.o hevc_data.o
OBJS-$(CONFIG_HEVC_QSV_DECODER) += qsvdec_h2645.o
OBJS-$(CONFIG_HEVC_QSV_ENCODER) += qsvenc_hevc.o hevc_ps_enc.o hevc_parse.o
+OBJS-$(CONFIG_HEVC_VAAPI_ENCODER) += vaapi_enc_hevc.o
OBJS-$(CONFIG_HNM4_VIDEO_DECODER) += hnm4video.o
OBJS-$(CONFIG_HQ_HQA_DECODER) += hq_hqa.o hq_hqadata.o hq_hqadsp.o \
canopus.o
diff --git a/libavcodec/allcodecs.c b/libavcodec/allcodecs.c
index 0d07087..a25da5b 100644
--- a/libavcodec/allcodecs.c
+++ b/libavcodec/allcodecs.c
@@ -203,6 +203,7 @@ void avcodec_register_all(void)
REGISTER_ENCDEC (HAP, hap);
REGISTER_DECODER(HEVC, hevc);
REGISTER_DECODER(HEVC_QSV, hevc_qsv);
+ REGISTER_ENCODER(HEVC_VAAPI, hevc_vaapi);
REGISTER_DECODER(HNM4_VIDEO, hnm4_video);
REGISTER_DECODER(HQ_HQA, hq_hqa);
REGISTER_DECODER(HQX, hqx);
diff --git a/libavcodec/vaapi_enc_hevc.c b/libavcodec/vaapi_enc_hevc.c
new file mode 100644
index 0000000..21a2867
--- /dev/null
+++ b/libavcodec/vaapi_enc_hevc.c
@@ -0,0 +1,1626 @@
+/*
+ * VAAPI H.265 encoder.
+ *
+ * Copyright (C) 2016 Mark Thompson <***@jkqxz.net>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "avcodec.h"
+#include "golomb.h"
+#include "put_bits.h"
+
+#include "hevc.h"
+
+#include "libavutil/opt.h"
+#include "libavutil/pixdesc.h"
+#include "libavutil/vaapi.h"
+
+#define MAX_DPB_PICS 16
+#define INPUT_PICS 2
+
+#define bool unsigned char
+#define MAX_ST_REF_PIC_SETS 32
+#define MAX_LAYERS 1
+
+
+// This structure contains all possibly-useful per-sequence syntax elements
+// which are not already contained in the various VAAPI structures.
+typedef struct VAAPIHEVCEncodeMiscSequenceParams {
+
+ // Parameter set IDs.
+ unsigned int video_parameter_set_id;
+ unsigned int seq_parameter_set_id;
+
+ // Layering.
+ unsigned int vps_max_layers_minus1;
+ unsigned int vps_max_sub_layers_minus1;
+ bool vps_temporal_id_nesting_flag;
+ unsigned int vps_max_layer_id;
+ unsigned int vps_num_layer_sets_minus1;
+ unsigned int sps_max_sub_layers_minus1;
+ bool sps_temporal_id_nesting_flag;
+ bool layer_id_included_flag[MAX_LAYERS][64];
+
+ // Profile/tier/level parameters.
+ bool general_profile_compatibility_flag[32];
+ bool general_progressive_source_flag;
+ bool general_interlaced_source_flag;
+ bool general_non_packed_constraint_flag;
+ bool general_frame_only_constraint_flag;
+ bool general_inbld_flag;
+
+ // Decode/display ordering parameters.
+ unsigned int log2_max_pic_order_cnt_lsb_minus4;
+ bool vps_sub_layer_ordering_info_present_flag;
+ unsigned int vps_max_dec_pic_buffering_minus1[MAX_LAYERS];
+ unsigned int vps_max_num_reorder_pics[MAX_LAYERS];
+ unsigned int vps_max_latency_increase_plus1[MAX_LAYERS];
+ bool sps_sub_layer_ordering_info_present_flag;
+ unsigned int sps_max_dec_pic_buffering_minus1[MAX_LAYERS];
+ unsigned int sps_max_num_reorder_pics[MAX_LAYERS];
+ unsigned int sps_max_latency_increase_plus1[MAX_LAYERS];
+
+ // Timing information.
+ bool vps_timing_info_present_flag;
+ unsigned int vps_num_units_in_tick;
+ unsigned int vps_time_scale;
+ bool vps_poc_proportional_to_timing_flag;
+ unsigned int vps_num_ticks_poc_diff_minus1;
+
+ // Cropping information.
+ bool conformance_window_flag;
+ unsigned int conf_win_left_offset;
+ unsigned int conf_win_right_offset;
+ unsigned int conf_win_top_offset;
+ unsigned int conf_win_bottom_offset;
+
+ // Short-term reference picture sets.
+ unsigned int num_short_term_ref_pic_sets;
+ struct {
+ unsigned int num_negative_pics;
+ unsigned int num_positive_pics;
+
+ unsigned int delta_poc_s0_minus1[MAX_DPB_PICS];
+ bool used_by_curr_pic_s0_flag[MAX_DPB_PICS];
+
+ unsigned int delta_poc_s1_minus1[MAX_DPB_PICS];
+ bool used_by_curr_pic_s1_flag[MAX_DPB_PICS];
+ } st_ref_pic_set[MAX_ST_REF_PIC_SETS];
+
+ // Long-term reference pictures.
+ bool long_term_ref_pics_present_flag;
+ unsigned int num_long_term_ref_pics_sps;
+ struct {
+ unsigned int lt_ref_pic_poc_lsb_sps;
+ bool used_by_curr_pic_lt_sps_flag;
+ } lt_ref_pic;
+
+ // Deblocking filter control.
+ bool deblocking_filter_control_present_flag;
+ bool deblocking_filter_override_enabled_flag;
+ bool pps_deblocking_filter_disabled_flag;
+ int pps_beta_offset_div2;
+ int pps_tc_offset_div2;
+
+ // Video Usability Information.
+ bool vui_parameters_present_flag;
+ bool aspect_ratio_info_present_flag;
+ unsigned int aspect_ratio_idc;
+ unsigned int sar_width;
+ unsigned int sar_height;
+ bool video_signal_type_present_flag;
+ unsigned int video_format;
+ bool video_full_range_flag;
+ bool colour_description_present_flag;
+ unsigned int colour_primaries;
+ unsigned int transfer_characteristics;
+ unsigned int matrix_coeffs;
+
+ // Oddments.
+ bool uniform_spacing_flag;
+ bool output_flag_present_flag;
+ bool cabac_init_present_flag;
+ unsigned int num_extra_slice_header_bits;
+ bool lists_modification_present_flag;
+ bool pps_slice_chroma_qp_offsets_present_flag;
+ bool pps_slice_chroma_offset_list_enabled_flag;
+
+} VAAPIHEVCEncodeMiscSequenceParams;
+
+// This structure contains all possibly-useful per-slice syntax elements
+// which are not already contained in the various VAAPI structures.
+typedef struct {
+ // Slice segments.
+ bool first_slice_segment_in_pic_flag;
+ unsigned int slice_segment_address;
+
+ // Short-term reference picture sets.
+ bool short_term_ref_pic_set_sps_flag;
+ unsigned int short_term_ref_pic_idx;
+
+ // Deblocking filter.
+ bool deblocking_filter_override_flag;
+
+ // Oddments.
+ bool slice_reserved_flag[8];
+ bool no_output_of_prior_pics_flag;
+ bool pic_output_flag;
+
+} VAAPIHEVCEncodeMiscPictureParams;
+
+typedef struct VAAPIHEVCEncodeFrame {
+ AVFrame avframe;
+ VASurfaceID surface_id;
+
+ int poc;
+ enum {
+ FRAME_TYPE_I = I_SLICE,
+ FRAME_TYPE_P = P_SLICE,
+ FRAME_TYPE_B = B_SLICE,
+ } type;
+
+ VAPictureHEVC pic;
+
+ VAEncPictureParameterBufferHEVC pic_params;
+ VABufferID pic_params_id;
+
+ VAEncSliceParameterBufferHEVC slice_params;
+ VABufferID slice_params_id;
+
+ VAAPIHEVCEncodeMiscPictureParams misc_params;
+
+ VABufferID coded_data_id;
+
+ struct VAAPIHEVCEncodeFrame *refa, *refb;
+} VAAPIHEVCEncodeFrame;
+
+typedef struct VAAPIHEVCEncodeContext {
+ const AVClass *class;
+ const AVCodecContext *avctx;
+
+ AVVAAPIInstance va_instance;
+ AVVAAPIPipelineConfig va_config;
+ AVVAAPIPipelineContext va_codec;
+
+ int input_is_vaapi;
+ AVVAAPISurfaceConfig input_config;
+ AVVAAPISurfaceConfig output_config;
+
+ VAProfile va_profile;
+ int level;
+ int rc_mode;
+ int fixed_qp;
+
+ int input_width;
+ int input_height;
+
+ int aligned_width;
+ int aligned_height;
+ int ctu_width;
+ int ctu_height;
+
+ VAEncSequenceParameterBufferHEVC seq_params;
+ VABufferID seq_params_id;
+
+ VAEncMiscParameterRateControl rc_params;
+ VAEncMiscParameterBuffer rc_params_buffer;
+ VABufferID rc_params_id;
+
+ VAEncPictureParameterBufferHEVC pic_params;
+ VABufferID pic_params_id;
+
+ VAAPIHEVCEncodeMiscSequenceParams misc_params;
+
+ int poc;
+
+ VAAPIHEVCEncodeFrame dpb[MAX_DPB_PICS];
+ int current_frame;
+ int previous_frame;
+
+ struct {
+ const char *profile;
+ const char *level;
+ int qp;
+ int idr_interval;
+ } options;
+
+} VAAPIHEVCEncodeContext;
+
+
+// Set to 1 to log a full trace of all bitstream output (debugging only).
+#if 0
+static void trace_hevc_write_u(PutBitContext *s, unsigned int width,
+ unsigned int value, const char *name)
+{
+ av_log(0, AV_LOG_INFO, "H.265 bitstream [%3d]: %4u u(%u) / %s\n",
+ put_bits_count(s), value, width, name);
+ put_bits(s, width, value);
+}
+static void trace_hevc_write_ue(PutBitContext *s,
+ unsigned int value, const char *name)
+{
+ av_log(0, AV_LOG_INFO, "H.265 bitstream [%3d]: %4u ue(v) / %s\n",
+ put_bits_count(s), value, name);
+ set_ue_golomb(s, value);
+}
+static void trace_hevc_write_se(PutBitContext *s,
+ int value, const char *name)
+{
+ av_log(0, AV_LOG_INFO, "H.265 bitstream [%3d]: %+4d se(v) / %s\n",
+ put_bits_count(s), value, name);
+ set_se_golomb(s, value);
+}
+
+#define hevc_write_u(pbc, width, value, name) \
+ trace_hevc_write_u(pbc, width, value, #name)
+#define hevc_write_ue(pbc, value, name) \
+ trace_hevc_write_ue(pbc, value, #name)
+#define hevc_write_se(pbc, value, name) \
+ trace_hevc_write_se(pbc, value, #name)
+#else
+#define hevc_write_u(pbc, width, value, name) put_bits(pbc, width, value)
+#define hevc_write_ue(pbc, value, name) set_ue_golomb(pbc, value)
+#define hevc_write_se(pbc, value, name) set_se_golomb(pbc, value)
+#endif
+
+#define u(width, ...) hevc_write_u(s, width, __VA_ARGS__)
+#define ue(...) hevc_write_ue(s, __VA_ARGS__)
+#define se(...) hevc_write_se(s, __VA_ARGS__)
+
+#define seq_var(name) seq->name, name
+#define seq_field(name) seq->seq_fields.bits.name, name
+#define pic_var(name) pic->name, name
+#define pic_field(name) pic->pic_fields.bits.name, name
+#define slice_var(name) slice->name, name
+#define slice_field(name) slice->slice_fields.bits.name, name
+#define misc_var(name) misc->name, name
+#define miscs_var(name) miscs->name, name
+
+static void vaapi_hevc_write_nal_unit_header(PutBitContext *s,
+ int nal_unit_type)
+{
+ u(1, 0, forbidden_zero_bit);
+ u(6, nal_unit_type, nal_unit_type);
+ u(6, 0, nuh_layer_id);
+ u(3, 1, nuh_temporal_id_plus1);
+}
+
+static void vaapi_hevc_write_rbsp_trailing_bits(PutBitContext *s)
+{
+ u(1, 1, rbsp_stop_one_bit);
+ while(put_bits_count(s) & 7)
+ u(1, 0, rbsp_alignment_zero_bit);
+}
+
+static void vaapi_hevc_write_profile_tier_level(PutBitContext *s,
+ VAAPIHEVCEncodeContext *ctx)
+{
+ VAEncSequenceParameterBufferHEVC *seq = &ctx->seq_params;
+ VAAPIHEVCEncodeMiscSequenceParams *misc = &ctx->misc_params;
+ int j;
+
+ if(1) {
+ u(2, 0, general_profile_space);
+ u(1, seq->general_tier_flag, general_tier_flag);
+ u(5, seq->general_profile_idc, general_profile_idc);
+
+ for(j = 0; j < 32; j++) {
+ u(1, misc_var(general_profile_compatibility_flag[j]));
+ }
+
+ u(1, misc_var(general_progressive_source_flag));
+ u(1, misc_var(general_interlaced_source_flag));
+ u(1, misc_var(general_non_packed_constraint_flag));
+ u(1, misc_var(general_frame_only_constraint_flag));
+
+ if(0) {
+ // Not main profile.
+ // Lots of extra constraint flags.
+ } else {
+ // put_bits only handles up to 31 bits.
+ u(23, 0, general_reserved_zero_43bits);
+ u(20, 0, general_reserved_zero_43bits);
+ }
+
+ if(seq->general_profile_idc >= 1 && seq->general_profile_idc <= 5) {
+ u(1, misc_var(general_inbld_flag));
+ } else {
+ u(1, 0, general_reserved_zero_bit);
+ }
+ }
+
+ u(8, seq->general_level_idc, general_level_idc);
+
+ // No sublayers.
+}
+
+static void vaapi_hevc_write_vps(PutBitContext *s,
+ VAAPIHEVCEncodeContext *ctx)
+{
+ VAAPIHEVCEncodeMiscSequenceParams *misc = &ctx->misc_params;
+ int i, j;
+
+ vaapi_hevc_write_nal_unit_header(s, NAL_VPS);
+
+ u(4, misc->video_parameter_set_id, vps_video_parameter_set_id);
+
+ u(1, 1, vps_base_layer_internal_flag);
+ u(1, 1, vps_base_layer_available_flag);
+ u(6, misc_var(vps_max_layers_minus1));
+ u(3, misc_var(vps_max_sub_layers_minus1));
+ u(1, misc_var(vps_temporal_id_nesting_flag));
+
+ u(16, 0xffff, vps_reserved_0xffff_16bits);
+
+ vaapi_hevc_write_profile_tier_level(s, ctx);
+
+ u(1, misc_var(vps_sub_layer_ordering_info_present_flag));
+ for(i = (misc->vps_sub_layer_ordering_info_present_flag ?
+ 0 : misc->vps_max_sub_layers_minus1);
+ i <= misc->vps_max_sub_layers_minus1; i++) {
+ ue(misc_var(vps_max_dec_pic_buffering_minus1[i]));
+ ue(misc_var(vps_max_num_reorder_pics[i]));
+ ue(misc_var(vps_max_latency_increase_plus1[i]));
+ }
+
+ u(6, misc_var(vps_max_layer_id));
+ ue(misc_var(vps_num_layer_sets_minus1));
+ for(i = 1; i <= misc->vps_num_layer_sets_minus1; i++) {
+ for(j = 0; j < misc->vps_max_layer_id; j++)
+ u(1, misc_var(layer_id_included_flag[i][j]));
+ }
+
+ u(1, misc_var(vps_timing_info_present_flag));
+ if(misc->vps_timing_info_present_flag) {
+ u(1, 0, put_bits_hack_zero_bit);
+ u(31, misc_var(vps_num_units_in_tick));
+ u(1, 0, put_bits_hack_zero_bit);
+ u(31, misc_var(vps_time_scale));
+ u(1, misc_var(vps_poc_proportional_to_timing_flag));
+ if(misc->vps_poc_proportional_to_timing_flag) {
+ ue(misc_var(vps_num_ticks_poc_diff_minus1));
+ }
+ ue(0, vps_num_hrd_parameters);
+ }
+
+ u(1, 0, vps_extension_flag);
+
+ vaapi_hevc_write_rbsp_trailing_bits(s);
+}
+
+static void vaapi_hevc_write_st_ref_pic_set(PutBitContext *s,
+ VAAPIHEVCEncodeContext *ctx,
+ int st_rps_idx)
+{
+ VAAPIHEVCEncodeMiscSequenceParams *misc = &ctx->misc_params;
+#define strps_var(name) misc->st_ref_pic_set[st_rps_idx].name, name
+ int i;
+
+ if(st_rps_idx != 0)
+ u(1, 0, inter_ref_pic_set_prediction_flag);
+
+ if(0) {
+ // Inter ref pic set prediction.
+ } else {
+ ue(strps_var(num_negative_pics));
+ ue(strps_var(num_positive_pics));
+
+ for(i = 0; i <
+ misc->st_ref_pic_set[st_rps_idx].num_negative_pics; i++) {
+ ue(strps_var(delta_poc_s0_minus1[i]));
+ u(1, strps_var(used_by_curr_pic_s0_flag[i]));
+ }
+ for(i = 0; i <
+ misc->st_ref_pic_set[st_rps_idx].num_positive_pics; i++) {
+ ue(strps_var(delta_poc_s1_minus1[i]));
+ u(1, strps_var(used_by_curr_pic_s1_flag[i]));
+ }
+ }
+}
+
+static void vaapi_hevc_write_vui_parameters(PutBitContext *s,
+ VAAPIHEVCEncodeContext *ctx)
+{
+ VAAPIHEVCEncodeMiscSequenceParams *misc = &ctx->misc_params;
+
+ u(1, misc_var(aspect_ratio_info_present_flag));
+ if(misc->aspect_ratio_info_present_flag) {
+ u(8, misc_var(aspect_ratio_idc));
+ if(misc->aspect_ratio_idc == 255) {
+ u(16, misc_var(sar_width));
+ u(16, misc_var(sar_height));
+ }
+ }
+
+ u(1, 0, overscan_info_present_flag);
+
+ u(1, misc_var(video_signal_type_present_flag));
+ if(misc->video_signal_type_present_flag) {
+ u(3, misc_var(video_format));
+ u(1, misc_var(video_full_range_flag));
+ u(1, misc_var(colour_description_present_flag));
+ if(misc->colour_description_present_flag) {
+ u(8, misc_var(colour_primaries));
+ u(8, misc_var(transfer_characteristics));
+ u(8, misc_var(matrix_coeffs));
+ }
+ }
+
+ u(1, 0, chroma_loc_info_present_flag);
+ u(1, 0, neutral_chroma_indication_flag);
+ u(1, 0, field_seq_flag);
+ u(1, 0, frame_field_info_present_flag);
+ u(1, 0, default_display_window_flag);
+ u(1, 0, vui_timing_info_present_flag);
+ u(1, 0, bitstream_restriction_flag_flag);
+}
+
+static void vaapi_hevc_write_sps(PutBitContext *s,
+ VAAPIHEVCEncodeContext *ctx)
+{
+ VAEncSequenceParameterBufferHEVC *seq = &ctx->seq_params;
+ VAAPIHEVCEncodeMiscSequenceParams *misc = &ctx->misc_params;
+ int i;
+
+ vaapi_hevc_write_nal_unit_header(s, NAL_SPS);
+
+ u(4, misc->video_parameter_set_id, sps_video_parameter_set_id);
+
+ u(3, misc_var(sps_max_sub_layers_minus1));
+ u(1, misc_var(sps_temporal_id_nesting_flag));
+
+ vaapi_hevc_write_profile_tier_level(s, ctx);
+
+ ue(misc->seq_parameter_set_id, sps_seq_parameter_set_id);
+ ue(seq_field(chroma_format_idc));
+ if(seq->seq_fields.bits.chroma_format_idc == 3)
+ u(1, 0, separate_colour_plane_flag);
+
+ ue(seq_var(pic_width_in_luma_samples));
+ ue(seq_var(pic_height_in_luma_samples));
+
+ u(1, misc_var(conformance_window_flag));
+ if(misc->conformance_window_flag) {
+ ue(misc_var(conf_win_left_offset));
+ ue(misc_var(conf_win_right_offset));
+ ue(misc_var(conf_win_top_offset));
+ ue(misc_var(conf_win_bottom_offset));
+ }
+
+ ue(seq_field(bit_depth_luma_minus8));
+ ue(seq_field(bit_depth_chroma_minus8));
+
+ ue(misc_var(log2_max_pic_order_cnt_lsb_minus4));
+
+ u(1, misc_var(sps_sub_layer_ordering_info_present_flag));
+ for(i = (misc->sps_sub_layer_ordering_info_present_flag ?
+ 0 : misc->sps_max_sub_layers_minus1);
+ i <= misc->sps_max_sub_layers_minus1; i++) {
+ ue(misc_var(sps_max_dec_pic_buffering_minus1[i]));
+ ue(misc_var(sps_max_num_reorder_pics[i]));
+ ue(misc_var(sps_max_latency_increase_plus1[i]));
+ }
+
+ ue(seq_var(log2_min_luma_coding_block_size_minus3));
+ ue(seq_var(log2_diff_max_min_luma_coding_block_size));
+ ue(seq_var(log2_min_transform_block_size_minus2));
+ ue(seq_var(log2_diff_max_min_transform_block_size));
+ ue(seq_var(max_transform_hierarchy_depth_inter));
+ ue(seq_var(max_transform_hierarchy_depth_intra));
+
+ u(1, seq_field(scaling_list_enabled_flag));
+ if(seq->seq_fields.bits.scaling_list_enabled_flag) {
+ u(1, 0, sps_scaling_list_data_present_flag);
+ }
+
+ u(1, seq_field(amp_enabled_flag));
+ u(1, seq_field(sample_adaptive_offset_enabled_flag));
+
+ u(1, seq_field(pcm_enabled_flag));
+ if(seq->seq_fields.bits.pcm_enabled_flag) {
+ u(4, seq_var(pcm_sample_bit_depth_luma_minus1));
+ u(4, seq_var(pcm_sample_bit_depth_chroma_minus1));
+ ue(seq_var(log2_min_pcm_luma_coding_block_size_minus3));
+ ue(seq->log2_max_pcm_luma_coding_block_size_minus3 -
+ seq->log2_min_pcm_luma_coding_block_size_minus3,
+ log2_diff_max_min_pcm_luma_coding_block_size);
+ u(1, seq_field(pcm_loop_filter_disabled_flag));
+ }
+
+ ue(misc_var(num_short_term_ref_pic_sets));
+ for(i = 0; i < misc->num_short_term_ref_pic_sets; i++)
+ vaapi_hevc_write_st_ref_pic_set(s, ctx, i);
+
+ u(1, misc_var(long_term_ref_pics_present_flag));
+ if(misc->long_term_ref_pics_present_flag) {
+ ue(0, num_long_term_ref_pics_sps);
+ }
+
+ u(1, seq_field(sps_temporal_mvp_enabled_flag));
+ u(1, seq_field(strong_intra_smoothing_enabled_flag));
+
+ u(1, misc_var(vui_parameters_present_flag));
+ if(misc->vui_parameters_present_flag) {
+ vaapi_hevc_write_vui_parameters(s, ctx);
+ }
+
+ u(1, 0, sps_extension_present_flag);
+
+ vaapi_hevc_write_rbsp_trailing_bits(s);
+}
+
+static void vaapi_hevc_write_pps(PutBitContext *s,
+ VAAPIHEVCEncodeContext *ctx)
+{
+ VAEncPictureParameterBufferHEVC *pic = &ctx->pic_params;
+ VAAPIHEVCEncodeMiscSequenceParams *misc = &ctx->misc_params;
+ int i;
+
+ vaapi_hevc_write_nal_unit_header(s, NAL_PPS);
+
+ ue(pic->slice_pic_parameter_set_id, pps_pic_parameter_set_id);
+ ue(misc->seq_parameter_set_id, pps_seq_parameter_set_id);
+
+ u(1, pic_field(dependent_slice_segments_enabled_flag));
+ u(1, misc_var(output_flag_present_flag));
+ u(3, misc_var(num_extra_slice_header_bits));
+ u(1, pic_field(sign_data_hiding_enabled_flag));
+ u(1, misc_var(cabac_init_present_flag));
+
+ ue(pic_var(num_ref_idx_l0_default_active_minus1));
+ ue(pic_var(num_ref_idx_l1_default_active_minus1));
+
+ se(pic->pic_init_qp - 26, init_qp_minus26);
+
+ u(1, pic_field(constrained_intra_pred_flag));
+ u(1, pic_field(transform_skip_enabled_flag));
+
+ u(1, pic_field(cu_qp_delta_enabled_flag));
+ if(pic->pic_fields.bits.cu_qp_delta_enabled_flag)
+ ue(pic_var(diff_cu_qp_delta_depth));
+
+ se(pic_var(pps_cb_qp_offset));
+ se(pic_var(pps_cr_qp_offset));
+
+ u(1, misc_var(pps_slice_chroma_qp_offsets_present_flag));
+ u(1, pic_field(weighted_pred_flag));
+ u(1, pic_field(weighted_bipred_flag));
+ u(1, pic_field(transquant_bypass_enabled_flag));
+ u(1, pic_field(tiles_enabled_flag));
+ u(1, pic_field(entropy_coding_sync_enabled_flag));
+
+ if(pic->pic_fields.bits.tiles_enabled_flag) {
+ ue(pic_var(num_tile_columns_minus1));
+ ue(pic_var(num_tile_rows_minus1));
+ u(1, misc_var(uniform_spacing_flag));
+ if(!misc->uniform_spacing_flag) {
+ for(i = 0; i < pic->num_tile_columns_minus1; i++)
+ ue(pic_var(column_width_minus1[i]));
+ for(i = 0; i < pic->num_tile_rows_minus1; i++)
+ ue(pic_var(row_height_minus1[i]));
+ }
+ u(1, pic_field(loop_filter_across_tiles_enabled_flag));
+ }
+
+ u(1, pic_field(pps_loop_filter_across_slices_enabled_flag));
+ u(1, misc_var(deblocking_filter_control_present_flag));
+ if(misc->deblocking_filter_control_present_flag) {
+ u(1, misc_var(deblocking_filter_override_enabled_flag));
+ u(1, misc_var(pps_deblocking_filter_disabled_flag));
+ if(!misc->pps_deblocking_filter_disabled_flag) {
+ se(misc_var(pps_beta_offset_div2));
+ se(misc_var(pps_tc_offset_div2));
+ }
+ }
+
+ u(1, 0, pps_scaling_list_data_present_flag);
+ // No scaling list data.
+
+ u(1, misc_var(lists_modification_present_flag));
+ ue(pic_var(log2_parallel_merge_level_minus2));
+ u(1, 0, slice_segment_header_extension_present_flag);
+ u(1, 0, pps_extension_present_flag);
+
+ vaapi_hevc_write_rbsp_trailing_bits(s);
+}
+
+static void vaapi_hevc_write_slice_header(PutBitContext *s,
+ VAAPIHEVCEncodeContext *ctx,
+ VAAPIHEVCEncodeFrame *current)
+{
+ VAEncSequenceParameterBufferHEVC *seq = &ctx->seq_params;
+ VAEncPictureParameterBufferHEVC *pic = &current->pic_params;
+ VAAPIHEVCEncodeMiscSequenceParams *misc = &ctx->misc_params;
+ VAEncSliceParameterBufferHEVC *slice = &current->slice_params;
+ VAAPIHEVCEncodeMiscPictureParams *miscs = &current->misc_params;
+ int i;
+
+ vaapi_hevc_write_nal_unit_header(s, pic->nal_unit_type);
+
+ u(1, miscs_var(first_slice_segment_in_pic_flag));
+ if(pic->nal_unit_type >= NAL_BLA_W_LP &&
+ pic->nal_unit_type <= 23)
+ u(1, miscs_var(no_output_of_prior_pics_flag));
+
+ ue(slice_var(slice_pic_parameter_set_id));
+
+ if(!miscs->first_slice_segment_in_pic_flag) {
+ if(pic->pic_fields.bits.dependent_slice_segments_enabled_flag)
+ u(1, slice_field(dependent_slice_segment_flag));
+ u(av_log2((ctx->ctu_width * ctx->ctu_height) - 1) + 1,
+ miscs_var(slice_segment_address));
+ }
+ if(!slice->slice_fields.bits.dependent_slice_segment_flag) {
+ for(i = 0; i < misc->num_extra_slice_header_bits; i++)
+ u(1, miscs_var(slice_reserved_flag[i]));
+
+ ue(slice_var(slice_type));
+ if(misc->output_flag_present_flag)
+ u(1, 1, pic_output_flag);
+ if(seq->seq_fields.bits.separate_colour_plane_flag)
+ u(2, slice_field(colour_plane_id));
+ if(pic->nal_unit_type != NAL_IDR_W_RADL &&
+ pic->nal_unit_type != NAL_IDR_N_LP) {
+ u(4 + misc->log2_max_pic_order_cnt_lsb_minus4,
+ current->poc & ((1 << (misc->log2_max_pic_order_cnt_lsb_minus4 + 4)) - 1),
+ slice_pic_order_cnt_lsb);
+
+ u(1, miscs_var(short_term_ref_pic_set_sps_flag));
+ if(!miscs->short_term_ref_pic_set_sps_flag) {
+ av_assert0(0);
+ // vaapi_hevc_write_st_ref_pic_set(ctx->num_short_term_ref_pic_sets);
+ } else if(misc->num_short_term_ref_pic_sets > 1) {
+ u(av_log2(misc->num_short_term_ref_pic_sets - 1) + 1,
+ miscs_var(short_term_ref_pic_idx));
+ }
+
+ if(misc->long_term_ref_pics_present_flag) {
+ av_assert0(0);
+ }
+
+ if(seq->seq_fields.bits.sps_temporal_mvp_enabled_flag) {
+ u(1, slice_field(slice_temporal_mvp_enabled_flag));
+ }
+
+ if(seq->seq_fields.bits.sample_adaptive_offset_enabled_flag) {
+ u(1, slice_field(slice_sao_luma_flag));
+ if(!seq->seq_fields.bits.separate_colour_plane_flag &&
+ seq->seq_fields.bits.chroma_format_idc != 0) {
+ u(1, slice_field(slice_sao_chroma_flag));
+ }
+ }
+
+ if(slice->slice_type == P_SLICE || slice->slice_type == B_SLICE) {
+ u(1, slice_field(num_ref_idx_active_override_flag));
+ if(slice->slice_fields.bits.num_ref_idx_active_override_flag) {
+ ue(slice_var(num_ref_idx_l0_active_minus1));
+ if(slice->slice_type == B_SLICE) {
+ ue(slice_var(num_ref_idx_l1_active_minus1));
+ }
+ }
+
+ if(misc->lists_modification_present_flag) {
+ av_assert0(0);
+ // ref_pic_lists_modification()
+ }
+ if(slice->slice_type == B_SLICE) {
+ u(1, slice_field(mvd_l1_zero_flag));
+ }
+ if(misc->cabac_init_present_flag) {
+ u(1, slice_field(cabac_init_flag));
+ }
+ if(slice->slice_fields.bits.slice_temporal_mvp_enabled_flag) {
+ if(slice->slice_type == B_SLICE)
+ u(1, slice_field(collocated_from_l0_flag));
+ ue(pic->collocated_ref_pic_index, collocated_ref_idx);
+ }
+ if((pic->pic_fields.bits.weighted_pred_flag &&
+ slice->slice_type == P_SLICE) ||
+ (pic->pic_fields.bits.weighted_bipred_flag &&
+ slice->slice_type == B_SLICE)) {
+ ue(5 - slice->max_num_merge_cand, five_minus_max_num_merge_cand);
+ }
+ }
+
+ se(slice_var(slice_qp_delta));
+ if(misc->pps_slice_chroma_qp_offsets_present_flag) {
+ se(slice_var(slice_cb_qp_offset));
+ se(slice_var(slice_cr_qp_offset));
+ }
+ if(misc->pps_slice_chroma_offset_list_enabled_flag) {
+ u(1, 0, cu_chroma_qp_offset_enabled_flag);
+ }
+ if(misc->deblocking_filter_override_enabled_flag) {
+ u(1, miscs_var(deblocking_filter_override_flag));
+ }
+ if(miscs->deblocking_filter_override_flag) {
+ u(1, slice_field(slice_deblocking_filter_disabled_flag));
+ if(!slice->slice_fields.bits.slice_deblocking_filter_disabled_flag) {
+ se(slice_var(slice_beta_offset_div2));
+ se(slice_var(slice_tc_offset_div2));
+ }
+ }
+ if(pic->pic_fields.bits.pps_loop_filter_across_slices_enabled_flag &&
+ (slice->slice_fields.bits.slice_sao_luma_flag ||
+ slice->slice_fields.bits.slice_sao_chroma_flag ||
+ slice->slice_fields.bits.slice_deblocking_filter_disabled_flag)) {
+ u(1, slice_field(slice_loop_filter_across_slices_enabled_flag));
+ }
+ }
+
+ if(pic->pic_fields.bits.tiles_enabled_flag ||
+ pic->pic_fields.bits.entropy_coding_sync_enabled_flag) {
+ // num_entry_point_offsets
+ }
+
+ if(0) {
+ // slice_segment_header_extension_length
+ }
+ }
+
+ u(1, 1, alignment_bit_equal_to_one);
+ while(put_bits_count(s) & 7)
+ u(1, 0, alignment_bit_equal_to_zero);
+}
+
+static size_t vaapi_hevc_nal_unit_to_byte_stream(uint8_t *dst, uint8_t *src, size_t len)
+{
+ size_t dp, sp;
+ int zero_run = 0;
+
+ // Start code.
+ dst[0] = dst[1] = dst[2] = 0;
+ dst[3] = 1;
+ dp = 4;
+
+ for(sp = 0; sp < len; sp++) {
+ if(zero_run < 2) {
+ if(src[sp] == 0)
+ ++zero_run;
+ else
+ zero_run = 0;
+ } else {
+ if((src[sp] & ~3) == 0) {
+ // emulation_prevention_three_byte
+ dst[dp++] = 3;
+ }
+ zero_run = src[sp] == 0;
+ }
+ dst[dp++] = src[sp];
+ }
+
+ return dp;
+}
+
+static int vaapi_hevc_render_packed_header(VAAPIHEVCEncodeContext *ctx, int type,
+ char *data, size_t bit_len)
+{
+ VAStatus vas;
+ VABufferID id_list[2];
+ VAEncPackedHeaderParameterBuffer buffer = {
+ .type = type,
+ .bit_length = bit_len,
+ .has_emulation_bytes = 1,
+ };
+
+ vas = vaCreateBuffer(ctx->va_instance.display, ctx->va_codec.context_id,
+ VAEncPackedHeaderParameterBufferType,
+ sizeof(&buffer), 1, &buffer, &id_list[0]);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create parameter buffer for packed "
+ "header (type %d): %d (%s).\n", type, vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+
+ vas = vaCreateBuffer(ctx->va_instance.display, ctx->va_codec.context_id,
+ VAEncPackedHeaderDataBufferType,
+ (bit_len + 7) / 8, 1, data, &id_list[1]);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create data buffer for packed "
+ "header (type %d): %d (%s).\n", type, vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+
+ av_log(ctx, AV_LOG_DEBUG, "Packed header buffer (%d) is %#x/%#x "
+ "(%zu bits).\n", type, id_list[0], id_list[1], bit_len);
+
+ vas = vaRenderPicture(ctx->va_instance.display, ctx->va_codec.context_id,
+ id_list, 2);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to render packed "
+ "header (type %d): %d (%s).\n", type, vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+
+ return 0;
+}
+
+static int vaapi_hevc_render_packed_vps_sps(VAAPIHEVCEncodeContext *ctx)
+{
+ PutBitContext pbc, *s = &pbc;
+ uint8_t tmp[256];
+ uint8_t buf[512];
+ size_t byte_len, nal_len;
+
+ init_put_bits(s, tmp, sizeof(tmp));
+ vaapi_hevc_write_vps(s, ctx);
+ nal_len = put_bits_count(s);
+ flush_put_bits(s);
+ byte_len = vaapi_hevc_nal_unit_to_byte_stream(buf, tmp, nal_len / 8);
+
+ init_put_bits(s, tmp, sizeof(tmp));
+ vaapi_hevc_write_sps(s, ctx);
+ nal_len = put_bits_count(s);
+ flush_put_bits(s);
+ byte_len += vaapi_hevc_nal_unit_to_byte_stream(buf + byte_len, tmp, nal_len / 8);
+
+ return vaapi_hevc_render_packed_header(ctx, VAEncPackedHeaderSequence,
+ buf, byte_len * 8);
+}
+
+static int vaapi_hevc_render_packed_pps(VAAPIHEVCEncodeContext *ctx)
+{
+ PutBitContext pbc, *s = &pbc;
+ uint8_t tmp[256];
+ uint8_t buf[512];
+ size_t byte_len, nal_len;
+
+ init_put_bits(s, tmp, sizeof(tmp));
+ vaapi_hevc_write_pps(s, ctx);
+ nal_len = put_bits_count(s);
+ flush_put_bits(s);
+ byte_len = vaapi_hevc_nal_unit_to_byte_stream(buf, tmp, nal_len / 8);
+
+ return vaapi_hevc_render_packed_header(ctx, VAEncPackedHeaderPicture,
+ buf, byte_len * 8);
+}
+
+static int vaapi_hevc_render_packed_slice(VAAPIHEVCEncodeContext *ctx,
+ VAAPIHEVCEncodeFrame *current)
+{
+ PutBitContext pbc, *s = &pbc;
+ uint8_t tmp[256];
+ uint8_t buf[512];
+ size_t byte_len, nal_len;
+
+ init_put_bits(s, tmp, sizeof(tmp));
+ vaapi_hevc_write_slice_header(s, ctx, current);
+ nal_len = put_bits_count(s);
+ flush_put_bits(s);
+ byte_len = vaapi_hevc_nal_unit_to_byte_stream(buf, tmp, nal_len / 8);
+
+ return vaapi_hevc_render_packed_header(ctx, VAEncPackedHeaderSlice,
+ buf, byte_len * 8);
+}
+
+static int vaapi_hevc_render_sequence(VAAPIHEVCEncodeContext *ctx)
+{
+ VAStatus vas;
+ VAEncSequenceParameterBufferHEVC *seq = &ctx->seq_params;
+
+ vas = vaCreateBuffer(ctx->va_instance.display, ctx->va_codec.context_id,
+ VAEncSequenceParameterBufferType,
+ sizeof(*seq), 1, seq, &ctx->seq_params_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create buffer for sequence "
+ "parameters: %d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+ av_log(ctx, AV_LOG_DEBUG, "Sequence parameter buffer is %#x.\n",
+ ctx->seq_params_id);
+
+ vas = vaRenderPicture(ctx->va_instance.display, ctx->va_codec.context_id,
+ &ctx->seq_params_id, 1);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to send sequence parameters: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+
+ return 0;
+}
+
+static int vaapi_hevc_render_picture(VAAPIHEVCEncodeContext *ctx,
+ VAAPIHEVCEncodeFrame *current)
+{
+ VAStatus vas;
+ VAEncPictureParameterBufferHEVC *pic = &current->pic_params;
+
+ vas = vaCreateBuffer(ctx->va_instance.display, ctx->va_codec.context_id,
+ VAEncPictureParameterBufferType,
+ sizeof(*pic), 1, pic, &ctx->pic_params_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create buffer for picture "
+ "parameters: %d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+ av_log(ctx, AV_LOG_DEBUG, "Picture parameter buffer is %#x.\n",
+ ctx->pic_params_id);
+
+ vas = vaRenderPicture(ctx->va_instance.display, ctx->va_codec.context_id,
+ &ctx->pic_params_id, 1);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to send picture parameters: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+
+ return 0;
+}
+
+static int vaapi_hevc_render_slice(VAAPIHEVCEncodeContext *ctx,
+ VAAPIHEVCEncodeFrame *current)
+{
+ VAStatus vas;
+ VAEncSliceParameterBufferHEVC *slice = &current->slice_params;
+
+ vas = vaCreateBuffer(ctx->va_instance.display, ctx->va_codec.context_id,
+ VAEncSliceParameterBufferType,
+ sizeof(*slice), 1, slice, &current->slice_params_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create buffer for slice "
+ "parameters: %d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+ av_log(ctx, AV_LOG_DEBUG, "Slice buffer is %#x.\n", current->slice_params_id);
+
+ vas = vaRenderPicture(ctx->va_instance.display, ctx->va_codec.context_id,
+ &current->slice_params_id, 1);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to send slice parameters: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ return AVERROR_EXTERNAL;
+ }
+
+ return 0;
+}
+
+static av_cold int vaapi_hevc_encode_init_stream(VAAPIHEVCEncodeContext *ctx)
+{
+ VAEncSequenceParameterBufferHEVC *seq = &ctx->seq_params;
+ VAEncPictureParameterBufferHEVC *pic = &ctx->pic_params;
+ VAAPIHEVCEncodeMiscSequenceParams *misc = &ctx->misc_params;
+ int i;
+
+ memset(seq, 0, sizeof(*seq));
+ memset(pic, 0, sizeof(*pic));
+
+ {
+ // general_profile_space == 0.
+ seq->general_profile_idc = 1; // Main profile.
+ seq->general_tier_flag = 0;
+
+ seq->general_level_idc = ctx->level * 3;
+
+ seq->intra_period = 0;
+ seq->intra_idr_period = 0;
+ seq->ip_period = 0;
+
+ seq->pic_width_in_luma_samples = ctx->aligned_width;
+ seq->pic_height_in_luma_samples = ctx->aligned_height;
+
+ seq->seq_fields.bits.chroma_format_idc = 1; // 4:2:0.
+ seq->seq_fields.bits.separate_colour_plane_flag = 0;
+ seq->seq_fields.bits.bit_depth_luma_minus8 = 0; // 8-bit luma.
+ seq->seq_fields.bits.bit_depth_chroma_minus8 = 0; // 8-bit chroma.
+ // Other misc flags all zero.
+
+ // These have to come from the capabilities of the encoder. We have
+ // no way to query it, so just hardcode ones which worked for me...
+ // CTB size from 8x8 to 32x32.
+ seq->log2_min_luma_coding_block_size_minus3 = 0;
+ seq->log2_diff_max_min_luma_coding_block_size = 2;
+ // Transform size from 4x4 to 32x32.
+ seq->log2_min_transform_block_size_minus2 = 0;
+ seq->log2_diff_max_min_transform_block_size = 3;
+ // Full transform hierarchy allowed (2-5).
+ seq->max_transform_hierarchy_depth_inter = 3;
+ seq->max_transform_hierarchy_depth_intra = 3;
+
+ seq->vui_parameters_present_flag = 0;
+ }
+
+ {
+ for(i = 0; i < FF_ARRAY_ELEMS(pic->reference_frames); i++) {
+ pic->reference_frames[i].picture_id = VA_INVALID_ID;
+ pic->reference_frames[i].flags = VA_PICTURE_HEVC_INVALID;
+ }
+
+ pic->collocated_ref_pic_index = 0xff;
+
+ pic->last_picture = 0;
+
+ pic->pic_init_qp = ctx->fixed_qp;
+
+ pic->diff_cu_qp_delta_depth = 0;
+ pic->pps_cb_qp_offset = 0;
+ pic->pps_cr_qp_offset = 0;
+
+ // tiles_enabled_flag == 0, so ignore num_tile_(rows|columns)_minus1.
+
+ pic->log2_parallel_merge_level_minus2 = 0;
+
+ // No limit on size.
+ pic->ctu_max_bitsize_allowed = 0;
+
+ pic->num_ref_idx_l0_default_active_minus1 = 0;
+ pic->num_ref_idx_l1_default_active_minus1 = 0;
+
+ pic->slice_pic_parameter_set_id = 0;
+
+ pic->pic_fields.bits.screen_content_flag = 0;
+ pic->pic_fields.bits.enable_gpu_weighted_prediction = 0;
+
+ //pic->pic_fields.bits.cu_qp_delta_enabled_flag = 1;
+ }
+
+ {
+ misc->video_parameter_set_id = 5;
+ misc->seq_parameter_set_id = 5;
+
+ misc->vps_max_layers_minus1 = 0;
+ misc->vps_max_sub_layers_minus1 = 0;
+ misc->vps_temporal_id_nesting_flag = 1;
+ misc->sps_max_sub_layers_minus1 = 0;
+ misc->sps_temporal_id_nesting_flag = 1;
+
+ for(i = 0; i < 32; i++) {
+ misc->general_profile_compatibility_flag[i] =
+ (i == seq->general_profile_idc);
+ }
+
+ misc->general_progressive_source_flag = 1;
+ misc->general_interlaced_source_flag = 0;
+ misc->general_non_packed_constraint_flag = 0;
+ misc->general_frame_only_constraint_flag = 1;
+ misc->general_inbld_flag = 0;
+
+ misc->log2_max_pic_order_cnt_lsb_minus4 = 4;
+ misc->vps_sub_layer_ordering_info_present_flag = 0;
+ misc->vps_max_dec_pic_buffering_minus1[0] = 0;
+ misc->vps_max_num_reorder_pics[0] = 0;
+ misc->vps_max_latency_increase_plus1[0] = 0;
+ misc->sps_sub_layer_ordering_info_present_flag = 0;
+ misc->sps_max_dec_pic_buffering_minus1[0] = 0;
+ misc->sps_max_num_reorder_pics[0] = 0;
+ misc->sps_max_latency_increase_plus1[0] = 0;
+
+ misc->vps_timing_info_present_flag = 1;
+ misc->vps_num_units_in_tick = ctx->avctx->time_base.num;
+ misc->vps_time_scale = ctx->avctx->time_base.den;
+ misc->vps_poc_proportional_to_timing_flag = 1;
+ misc->vps_num_ticks_poc_diff_minus1 = 0;
+
+ if(ctx->input_width != ctx->aligned_width ||
+ ctx->input_height != ctx->aligned_height) {
+ misc->conformance_window_flag = 1;
+ misc->conf_win_left_offset = 0;
+ misc->conf_win_right_offset =
+ (ctx->aligned_width - ctx->input_width) / 2;
+ misc->conf_win_top_offset = 0;
+ misc->conf_win_bottom_offset =
+ (ctx->aligned_height - ctx->input_height) / 2;
+ } else {
+ misc->conformance_window_flag = 0;
+ }
+
+ misc->num_short_term_ref_pic_sets = 1;
+ misc->st_ref_pic_set[0].num_negative_pics = 1;
+ misc->st_ref_pic_set[0].num_positive_pics = 0;
+ misc->st_ref_pic_set[0].delta_poc_s0_minus1[0] = 0;
+ misc->st_ref_pic_set[0].used_by_curr_pic_s0_flag[0] = 1;
+
+ misc->vui_parameters_present_flag = 1;
+ if(ctx->avctx->sample_aspect_ratio.num != 0) {
+ misc->aspect_ratio_info_present_flag = 1;
+ if(ctx->avctx->sample_aspect_ratio.num ==
+ ctx->avctx->sample_aspect_ratio.den) {
+ misc->aspect_ratio_idc = 1;
+ } else {
+ misc->aspect_ratio_idc = 255; // Extended SAR.
+ misc->sar_width = ctx->avctx->sample_aspect_ratio.num;
+ misc->sar_height = ctx->avctx->sample_aspect_ratio.den;
+ }
+ }
+ if(1) {
+ // Should this be conditional on some of these being set?
+ misc->video_signal_type_present_flag = 1;
+ misc->video_format = 5; // Unspecified.
+ misc->video_full_range_flag = 0;
+ misc->colour_description_present_flag = 1;
+ misc->colour_primaries = ctx->avctx->color_primaries;
+ misc->transfer_characteristics = ctx->avctx->color_trc;
+ misc->matrix_coeffs = ctx->avctx->colorspace;
+ }
+ }
+
+ return 0;
+}
+
+static int vaapi_hevc_encode_init_picture(VAAPIHEVCEncodeContext *ctx,
+ VAAPIHEVCEncodeFrame *current)
+{
+ VAEncPictureParameterBufferHEVC *pic = &current->pic_params;
+ VAEncSliceParameterBufferHEVC *slice = &current->slice_params;
+ VAAPIHEVCEncodeMiscPictureParams *misc = &current->misc_params;
+ int idr = current->type == FRAME_TYPE_I;
+
+ memcpy(pic, &ctx->pic_params, sizeof(*pic));
+ memset(slice, 0, sizeof(*slice));
+ memset(misc, 0, sizeof(*misc));
+
+ {
+ memcpy(&pic->decoded_curr_pic, &current->pic, sizeof(VAPictureHEVC));
+
+ if(current->type != FRAME_TYPE_I) {
+ memcpy(&pic->reference_frames[0],
+ &current->refa->pic, sizeof(VAPictureHEVC));
+ }
+ if(current->type == FRAME_TYPE_B) {
+ memcpy(&pic->reference_frames[1],
+ &current->refb->pic, sizeof(VAPictureHEVC));
+ }
+
+ pic->coded_buf = current->coded_data_id;
+
+ pic->nal_unit_type = (idr ? NAL_IDR_W_RADL : NAL_TRAIL_R);
+
+ pic->pic_fields.bits.idr_pic_flag = (idr ? 1 : 0);
+ pic->pic_fields.bits.coding_type = (idr ? 1 : 2);
+
+ pic->pic_fields.bits.reference_pic_flag = 1;
+ }
+
+ {
+ slice->slice_segment_address = 0;
+ slice->num_ctu_in_slice = ctx->ctu_width * ctx->ctu_height;
+
+ slice->slice_type = current->type;
+ slice->slice_pic_parameter_set_id = 0;
+
+ slice->num_ref_idx_l0_active_minus1 = 0;
+ slice->num_ref_idx_l1_active_minus1 = 0;
+ memcpy(slice->ref_pic_list0, pic->reference_frames, sizeof(pic->reference_frames));
+ memcpy(slice->ref_pic_list1, pic->reference_frames, sizeof(pic->reference_frames));
+
+ slice->max_num_merge_cand = 5;
+ slice->slice_qp_delta = 0;
+
+ slice->slice_fields.bits.last_slice_of_pic_flag = 1;
+ }
+
+ {
+ misc->first_slice_segment_in_pic_flag = 1;
+
+ misc->short_term_ref_pic_set_sps_flag = 1;
+ misc->short_term_ref_pic_idx = 0;
+ }
+
+ return 0;
+}
+
+static int vaapi_hevc_encode_picture(AVCodecContext *avctx, AVPacket *pkt,
+ const AVFrame *pic, int *got_packet)
+{
+ VAAPIHEVCEncodeContext *ctx = avctx->priv_data;
+ AVVAAPISurface *input, *recon;
+ VAAPIHEVCEncodeFrame *current;
+ AVFrame *input_image, *recon_image;
+ VACodedBufferSegment *buf_list, *buf;
+ VAStatus vas;
+ int err;
+
+ av_log(ctx, AV_LOG_DEBUG, "New frame: format %s, size %ux%u.\n",
+ av_get_pix_fmt_name(pic->format), pic->width, pic->height);
+
+ av_vaapi_instance_lock(&ctx->va_instance);
+
+ if(pic->format == AV_PIX_FMT_VAAPI) {
+ input_image = 0;
+ input = (AVVAAPISurface*)pic->buf[0]->data;
+
+ } else {
+ input_image = av_frame_alloc();
+
+ err = av_vaapi_get_input_surface(&ctx->va_codec, input_image);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to allocate surface to "
+ "copy input frame: %d (%s).\n", err, av_err2str(err));
+ goto fail;
+ }
+
+ input = (AVVAAPISurface*)input_image->buf[0]->data;
+
+ err = av_vaapi_map_surface(input, 0);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to map input surface: "
+ "%d (%s).\n", err, av_err2str(err));
+ goto fail;
+ }
+
+ err = av_vaapi_copy_to_surface(pic, input);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to copy to input surface: "
+ "%d (%s).\n", err, av_err2str(err));
+ goto fail;
+ }
+
+ err = av_vaapi_unmap_surface(input, 1);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to unmap input surface: "
+ "%d (%s).\n", err, av_err2str(err));
+ goto fail;
+ }
+ }
+ av_log(ctx, AV_LOG_DEBUG, "Using surface %#x for input image.\n",
+ input->id);
+
+ recon_image = av_frame_alloc();
+
+ err = av_vaapi_get_output_surface(&ctx->va_codec, recon_image);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to allocate surface for "
+ "reconstructed frame: %d (%s).\n", err, av_err2str(err));
+ goto fail;
+ }
+ recon = (AVVAAPISurface*)recon_image->buf[0]->data;
+ av_log(ctx, AV_LOG_DEBUG, "Using surface %#x for reconstructed image.\n",
+ recon->id);
+
+ if(ctx->previous_frame != ctx->current_frame) {
+ av_frame_unref(&ctx->dpb[ctx->previous_frame].avframe);
+ }
+
+ ctx->previous_frame = ctx->current_frame;
+ ctx->current_frame = (ctx->current_frame + 1) % MAX_DPB_PICS;
+ {
+ current = &ctx->dpb[ctx->current_frame];
+
+ if(ctx->poc < 0 ||
+ ctx->poc == ctx->options.idr_interval)
+ current->type = FRAME_TYPE_I;
+ else
+ current->type = FRAME_TYPE_P;
+
+ if(current->type == FRAME_TYPE_I)
+ ctx->poc = 0;
+ else
+ ++ctx->poc;
+ current->poc = ctx->poc;
+
+ if(current->type == FRAME_TYPE_I) {
+ current->refa = 0;
+ current->refb = 0;
+ } else if(current->type == FRAME_TYPE_P) {
+ current->refa = &ctx->dpb[ctx->previous_frame];
+ current->refb = 0;
+ } else {
+ av_assert0(0);
+ }
+
+ memset(&current->pic, 0, sizeof(VAPictureHEVC));
+ current->pic.picture_id = recon->id;
+ current->pic.pic_order_cnt = current->poc;
+
+ memcpy(&current->avframe, recon_image, sizeof(AVFrame));
+ }
+ av_log(ctx, AV_LOG_DEBUG, "Encoding as frame as %s (%d).\n",
+ current->type == FRAME_TYPE_I ? "I" :
+ current->type == FRAME_TYPE_P ? "P" : "B", current->poc);
+
+ vas = vaBeginPicture(ctx->va_instance.display, ctx->va_codec.context_id,
+ input->id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to attach new picture: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ err = AVERROR_EXTERNAL;
+ goto fail;
+ }
+
+ vaapi_hevc_encode_init_picture(ctx, current);
+
+ if(current->type == FRAME_TYPE_I) {
+ err = vaapi_hevc_render_sequence(ctx);
+ if(err) goto fail;
+ }
+
+ err = vaapi_hevc_render_picture(ctx, current);
+ if(err) goto fail;
+
+ if(current->type == FRAME_TYPE_I) {
+ err = vaapi_hevc_render_packed_vps_sps(ctx);
+ if(err) goto fail;
+
+ err = vaapi_hevc_render_packed_pps(ctx);
+ if(err) goto fail;
+ }
+
+ err = vaapi_hevc_render_packed_slice(ctx, current);
+ if(err) goto fail;
+
+ err = vaapi_hevc_render_slice(ctx, current);
+ if(err) goto fail;
+
+ vas = vaEndPicture(ctx->va_instance.display, ctx->va_codec.context_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to start picture processing: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ err = AVERROR_EXTERNAL;
+ goto fail;
+ }
+
+ vas = vaSyncSurface(ctx->va_instance.display, input->id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to sync to picture completion: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ err = AVERROR_EXTERNAL;
+ goto fail;
+ }
+
+ buf_list = 0;
+ vas = vaMapBuffer(ctx->va_instance.display, current->coded_data_id,
+ (void**)&buf_list);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to map output buffers: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ err = AVERROR_EXTERNAL;
+ goto fail;
+ }
+
+ for(buf = buf_list; buf; buf = buf->next) {
+ av_log(ctx, AV_LOG_DEBUG, "Output buffer: %u bytes.\n", buf->size);
+
+ err = av_new_packet(pkt, buf->size);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to make output buffer "
+ "(%u bytes).\n", buf->size);
+ goto fail;
+ }
+
+ memcpy(pkt->data, buf->buf, buf->size);
+
+ if(current->type == FRAME_TYPE_I)
+ pkt->flags |= AV_PKT_FLAG_KEY;
+
+ pkt->pts = pic->pts;
+
+ *got_packet = 1;
+ }
+
+ vas = vaUnmapBuffer(ctx->va_instance.display, current->coded_data_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to unmap output buffers: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ err = AVERROR_EXTERNAL;
+ goto fail;
+ }
+
+ if(pic->format != AV_PIX_FMT_VAAPI)
+ av_frame_free(&input_image);
+
+ err = 0;
+ fail:
+ av_vaapi_instance_unlock(&ctx->va_instance);
+ return err;
+}
+
+static VAConfigAttrib config_attributes[] = {
+ { .type = VAConfigAttribRTFormat,
+ .value = VA_RT_FORMAT_YUV420 },
+ { .type = VAConfigAttribRateControl,
+ .value = VA_RC_CQP },
+ { .type = VAConfigAttribEncPackedHeaders,
+ .value = 0 },
+};
+
+static av_cold int vaapi_hevc_encode_init(AVCodecContext *avctx)
+{
+ VAAPIHEVCEncodeContext *ctx = avctx->priv_data;
+ VAStatus vas;
+ int i, err;
+
+ ctx->avctx = avctx;
+
+ ctx->va_profile = VAProfileHEVCMain;
+ ctx->level = -1;
+ if(sscanf(ctx->options.level, "%d", &ctx->level) <= 0 ||
+ ctx->level < 0 || ctx->level > 63) {
+ av_log(ctx, AV_LOG_ERROR, "Invaid level '%s'.\n", ctx->options.level);
+ return AVERROR(EINVAL);
+ }
+
+ if(ctx->options.qp >= 0) {
+ ctx->rc_mode = VA_RC_CQP;
+ } else {
+ // Default to fixed-QP 26.
+ ctx->rc_mode = VA_RC_CQP;
+ ctx->options.qp = 26;
+ }
+ av_log(ctx, AV_LOG_INFO, "Using constant-QP mode at %d.\n",
+ ctx->options.qp);
+
+ err = av_vaapi_instance_init(&ctx->va_instance, 0);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "No VAAPI instance.\n");
+ return err;
+ }
+
+ ctx->input_width = avctx->width;
+ ctx->input_height = avctx->height;
+
+ ctx->aligned_width = (ctx->input_width + 15) / 16 * 16;
+ ctx->aligned_height = (ctx->input_height + 15) / 16 * 16;
+ ctx->ctu_width = (ctx->aligned_width + 31) / 32;
+ ctx->ctu_height = (ctx->aligned_height + 31) / 32;
+
+ ctx->fixed_qp = ctx->options.qp;
+
+ ctx->poc = -1;
+
+ {
+ AVVAAPIPipelineConfig *config = &ctx->va_config;
+
+ config->profile = ctx->va_profile;
+ config->entrypoint = VAEntrypointEncSlice;
+
+ config->attribute_count = FF_ARRAY_ELEMS(config_attributes);
+ config->attributes = config_attributes;
+ }
+
+ {
+ AVVAAPISurfaceConfig *config = &ctx->output_config;
+
+ config->rt_format = VA_RT_FORMAT_YUV420;
+ config->av_format = AV_PIX_FMT_VAAPI;
+
+ config->image_format.fourcc = VA_FOURCC_NV12;
+ config->image_format.bits_per_pixel = 12;
+
+ config->count = MAX_DPB_PICS;
+ config->width = ctx->aligned_width;
+ config->height = ctx->aligned_height;
+
+ config->attribute_count = 0;
+ }
+
+ if(avctx->pix_fmt == AV_PIX_FMT_VAAPI) {
+ // Just use the input surfaces directly.
+ ctx->input_is_vaapi = 1;
+
+ } else {
+ AVVAAPISurfaceConfig *config = &ctx->input_config;
+
+ config->rt_format = VA_RT_FORMAT_YUV420;
+ config->rt_format = VA_RT_FORMAT_YUV420;
+ config->av_format = AV_PIX_FMT_VAAPI;
+
+ config->image_format.fourcc = VA_FOURCC_NV12;
+ config->image_format.bits_per_pixel = 12;
+
+ config->count = INPUT_PICS;
+ config->width = ctx->aligned_width;
+ config->height = ctx->aligned_height;
+
+ config->attribute_count = 0;
+
+ ctx->input_is_vaapi = 0;
+ }
+
+ av_vaapi_instance_lock(&ctx->va_instance);
+
+ err = av_vaapi_pipeline_init(&ctx->va_codec, &ctx->va_instance,
+ &ctx->va_config,
+ ctx->input_is_vaapi ? 0 : &ctx->input_config,
+ &ctx->output_config);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create codec: %d (%s).\n",
+ err, av_err2str(err));
+ goto fail;
+ }
+
+ for(i = 0; i < MAX_DPB_PICS; i++) {
+ vas = vaCreateBuffer(ctx->va_instance.display,
+ ctx->va_codec.context_id,
+ VAEncCodedBufferType,
+ 1048576, 1, 0, &ctx->dpb[i].coded_data_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create buffer for "
+ "coded data: %d (%s).\n", vas, vaErrorStr(vas));
+ err = AVERROR_EXTERNAL;
+ goto fail;
+ }
+ av_log(ctx, AV_LOG_TRACE, "Coded data buffer %d is %#x.\n",
+ i, ctx->dpb[i].coded_data_id);
+ }
+
+ av_vaapi_instance_unlock(&ctx->va_instance);
+
+ av_log(ctx, AV_LOG_INFO, "Started VAAPI H.265 encoder.\n");
+
+ vaapi_hevc_encode_init_stream(ctx);
+
+ return 0;
+
+ fail:
+ av_vaapi_instance_unlock(&ctx->va_instance);
+ return err;
+}
+
+static av_cold int vaapi_hevc_encode_close(AVCodecContext *avctx)
+{
+ VAAPIHEVCEncodeContext *ctx = avctx->priv_data;
+ int err;
+
+ av_vaapi_instance_lock(&ctx->va_instance);
+
+ err = av_vaapi_pipeline_uninit(&ctx->va_codec);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to destroy codec: %d (%s).\n",
+ err, av_err2str(err));
+ }
+
+ av_vaapi_instance_unlock(&ctx->va_instance);
+
+ err = av_vaapi_instance_uninit(&ctx->va_instance);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to uninitialised VAAPI "
+ "instance: %d (%s).\n",
+ err, av_err2str(err));
+ }
+
+ return 0;
+}
+
+#define OFFSET(member) offsetof(VAAPIHEVCEncodeContext, options.member)
+#define FLAGS (AV_OPT_FLAG_VIDEO_PARAM | AV_OPT_FLAG_ENCODING_PARAM)
+static const AVOption vaapi_hevc_options[] = {
+ { "level", "Set H.265 level",
+ OFFSET(level), AV_OPT_TYPE_STRING,
+ { .str = "52" }, 0, 0, FLAGS },
+ { "qp", "Use constant quantisation parameter",
+ OFFSET(qp), AV_OPT_TYPE_INT,
+ { .i64 = -1 }, -1, MAX_QP, FLAGS },
+ { "idr_interval", "Number of frames between IDR frames (0 = all intra)",
+ OFFSET(idr_interval), AV_OPT_TYPE_INT,
+ { .i64 = -1 }, -1, INT_MAX, FLAGS },
+ { 0 }
+};
+
+static const AVClass vaapi_hevc_class = {
+ .class_name = "VAAPI/H.265",
+ .item_name = av_default_item_name,
+ .option = vaapi_hevc_options,
+ .version = LIBAVUTIL_VERSION_INT,
+};
+
+AVCodec ff_hevc_vaapi_encoder = {
+ .name = "vaapi_hevc",
+ .long_name = NULL_IF_CONFIG_SMALL("H.265 (VAAPI)"),
+ .type = AVMEDIA_TYPE_VIDEO,
+ .id = AV_CODEC_ID_HEVC,
+ .priv_data_size = sizeof(VAAPIHEVCEncodeContext),
+ .init = &vaapi_hevc_encode_init,
+ .encode2 = &vaapi_hevc_encode_picture,
+ .close = &vaapi_hevc_encode_close,
+ .priv_class = &vaapi_hevc_class,
+ .pix_fmts = (const enum AVPixelFormat[]) {
+ AV_PIX_FMT_VAAPI,
+ AV_PIX_FMT_NV12,
+ AV_PIX_FMT_NONE,
+ },
+};
--
2.6.4
Mark Thompson
2016-01-17 22:49:47 UTC
Permalink
From d1ddb63818c6ee04c7a25c5223fda9c50e19f4f4 Mon Sep 17 00:00:00 2001
From: Mark Thompson <***@jkqxz.net>
Date: Sun, 17 Jan 2016 22:16:13 +0000
Subject: [PATCH 5/5] libavfilter: add VAAPI surface converter

---
configure | 1 +
libavfilter/Makefile | 1 +
libavfilter/allfilters.c | 1 +
libavfilter/vf_vaapi_conv.c | 473 ++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 476 insertions(+)
create mode 100644 libavfilter/vf_vaapi_conv.c

diff --git a/configure b/configure
index 9da8e8b..71c0bc0 100755
--- a/configure
+++ b/configure
@@ -2913,6 +2913,7 @@ stereo3d_filter_deps="gpl"
subtitles_filter_deps="avformat avcodec libass"
super2xsai_filter_deps="gpl"
tinterlace_filter_deps="gpl"
+vaapi_conv_filter_deps="vaapi"
vidstabdetect_filter_deps="libvidstab"
vidstabtransform_filter_deps="libvidstab"
pixfmts_super2xsai_test_deps="super2xsai_filter"
diff --git a/libavfilter/Makefile b/libavfilter/Makefile
index e3e3561..9a4ca12 100644
--- a/libavfilter/Makefile
+++ b/libavfilter/Makefile
@@ -246,6 +246,7 @@ OBJS-$(CONFIG_TRANSPOSE_FILTER) += vf_transpose.o
OBJS-$(CONFIG_TRIM_FILTER) += trim.o
OBJS-$(CONFIG_UNSHARP_FILTER) += vf_unsharp.o
OBJS-$(CONFIG_USPP_FILTER) += vf_uspp.o
+OBJS-$(CONFIG_VAAPI) += vf_vaapi_conv.o
OBJS-$(CONFIG_VECTORSCOPE_FILTER) += vf_vectorscope.o
OBJS-$(CONFIG_VFLIP_FILTER) += vf_vflip.o
OBJS-$(CONFIG_VIDSTABDETECT_FILTER) += vidstabutils.o vf_vidstabdetect.o
diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
index 1faf393..cfbfdca 100644
--- a/libavfilter/allfilters.c
+++ b/libavfilter/allfilters.c
@@ -266,6 +266,7 @@ void avfilter_register_all(void)
REGISTER_FILTER(TRIM, trim, vf);
REGISTER_FILTER(UNSHARP, unsharp, vf);
REGISTER_FILTER(USPP, uspp, vf);
+ REGISTER_FILTER(VAAPI_CONV, vaapi_conv, vf);
REGISTER_FILTER(VECTORSCOPE, vectorscope, vf);
REGISTER_FILTER(VFLIP, vflip, vf);
REGISTER_FILTER(VIDSTABDETECT, vidstabdetect, vf);
diff --git a/libavfilter/vf_vaapi_conv.c b/libavfilter/vf_vaapi_conv.c
new file mode 100644
index 0000000..79b5be0
--- /dev/null
+++ b/libavfilter/vf_vaapi_conv.c
@@ -0,0 +1,473 @@
+/*
+ * VAAPI converter (scaling and colour conversion).
+ *
+ * Copyright (C) 2016 Mark Thompson <***@jkqxz.net>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "avfilter.h"
+#include "formats.h"
+#include "internal.h"
+
+#include "libavutil/avassert.h"
+#include "libavutil/opt.h"
+#include "libavutil/pixdesc.h"
+#include "libavutil/vaapi.h"
+
+typedef struct VAAPIConvContext {
+ const AVClass *class;
+
+ AVVAAPIInstance va_instance;
+ AVVAAPIPipelineConfig va_config;
+ AVVAAPIPipelineContext va_context;
+ int pipeline_initialised;
+
+ int input_is_vaapi;
+ AVVAAPISurfaceConfig input_config;
+ AVVAAPISurfaceConfig output_config;
+
+ int output_width;
+ int output_height;
+
+ struct {
+ int output_size[2];
+ } options;
+
+} VAAPIConvContext;
+
+
+static int vaapi_conv_query_formats(AVFilterContext *avctx)
+{
+ VAAPIConvContext *ctx = avctx->priv;
+ VAStatus vas;
+ VAConfigAttrib rt_format = {
+ .type = VAConfigAttribRTFormat
+ };
+ enum AVPixelFormat pix_fmt_list[16] = {
+ AV_PIX_FMT_VAAPI,
+ };
+ int pix_fmt_count = 1, err;
+
+#if 0
+ // The Intel driver doesn't return anything useful here - it only
+ // declares support for YUV 4:2:0 formats, despite working perfectly
+ // with 32-bit RGB ones. Given another usable platform, this will
+ // need to be updated.
+ vas = vaGetConfigAttributes(ctx->va_instance.display,
+ VAProfileNone, VAEntrypointVideoProc,
+ &rt_format, 1);
+#else
+ vas = VA_STATUS_SUCCESS;
+ rt_format.value = VA_RT_FORMAT_YUV420 | VA_RT_FORMAT_RGB32;
+#endif
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to get config attributes: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ } else {
+ if(rt_format.value & VA_RT_FORMAT_YUV420) {
+ av_log(ctx, AV_LOG_DEBUG, "YUV420 formats supported.\n");
+ pix_fmt_list[pix_fmt_count++] = AV_PIX_FMT_YUV420P;
+ pix_fmt_list[pix_fmt_count++] = AV_PIX_FMT_NV12;
+ }
+ if(rt_format.value & VA_RT_FORMAT_YUV422) {
+ av_log(ctx, AV_LOG_DEBUG, "YUV422 formats supported.\n");
+ pix_fmt_list[pix_fmt_count++] = AV_PIX_FMT_YUV422P;
+ pix_fmt_list[pix_fmt_count++] = AV_PIX_FMT_YUYV422;
+ }
+ if(rt_format.value & VA_RT_FORMAT_YUV444) {
+ av_log(ctx, AV_LOG_DEBUG, "YUV444 formats supported.\n");
+ pix_fmt_list[pix_fmt_count++] = AV_PIX_FMT_YUV444P;
+ }
+ if(rt_format.value & VA_RT_FORMAT_YUV400) {
+ av_log(ctx, AV_LOG_DEBUG, "Grayscale formats supported.\n");
+ pix_fmt_list[pix_fmt_count++] = AV_PIX_FMT_GRAY8;
+ }
+ if(rt_format.value & VA_RT_FORMAT_RGB32) {
+ av_log(ctx, AV_LOG_DEBUG, "RGB32 formats supported.\n");
+ pix_fmt_list[pix_fmt_count++] = AV_PIX_FMT_RGBA;
+ pix_fmt_list[pix_fmt_count++] = AV_PIX_FMT_BGRA;
+ pix_fmt_list[pix_fmt_count++] = AV_PIX_FMT_RGB0;
+ pix_fmt_list[pix_fmt_count++] = AV_PIX_FMT_BGR0;
+ }
+ }
+
+ pix_fmt_list[pix_fmt_count] = AV_PIX_FMT_NONE;
+
+ if(avctx->inputs[0]) {
+ err = ff_formats_ref(ff_make_format_list(pix_fmt_list),
+ &avctx->inputs[0]->out_formats);
+ if(err < 0)
+ return err;
+ }
+
+ if(avctx->outputs[0]) {
+ // Truncate the list: no support for normal output yet.
+ pix_fmt_list[1] = AV_PIX_FMT_NONE;
+
+ err = ff_formats_ref(ff_make_format_list(pix_fmt_list),
+ &avctx->outputs[0]->in_formats);
+ if(err < 0)
+ return err;
+ }
+
+ return 0;
+}
+
+static int vaapi_conv_config_pipeline(VAAPIConvContext *ctx)
+{
+ AVVAAPIPipelineConfig *config = &ctx->va_config;
+ int err;
+
+ config->profile = VAProfileNone;
+ config->entrypoint = VAEntrypointVideoProc;
+
+ config->attribute_count = 0;
+
+ av_vaapi_instance_lock(&ctx->va_instance);
+
+ err = av_vaapi_pipeline_init(&ctx->va_context, &ctx->va_instance,
+ &ctx->va_config, &ctx->input_config,
+ &ctx->output_config);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create video processing "
+ "pipeline: " "%d (%s).\n", err, av_err2str(err));
+ }
+
+ av_vaapi_instance_unlock(&ctx->va_instance);
+
+ return err;
+}
+
+static int vaapi_conv_config_input(AVFilterLink *inlink)
+{
+ AVFilterContext *avctx = inlink->dst;
+ VAAPIConvContext *ctx = avctx->priv;
+ AVVAAPISurfaceConfig *config = &ctx->input_config;
+
+ if(inlink->format == AV_PIX_FMT_VAAPI) {
+ av_log(ctx, AV_LOG_INFO, "Input is VAAPI (using incoming surfaces).\n");
+ ctx->input_is_vaapi = 1;
+ return 0;
+ }
+ ctx->input_is_vaapi = 0;
+
+ config->rt_format = VA_RT_FORMAT_YUV420;
+ config->av_format = AV_PIX_FMT_VAAPI;
+
+ switch(inlink->format) {
+ case AV_PIX_FMT_BGR0:
+ case AV_PIX_FMT_BGRA:
+ config->image_format.fourcc = VA_FOURCC_BGRX;
+ config->image_format.byte_order = VA_LSB_FIRST;
+ config->image_format.bits_per_pixel = 32;
+ config->image_format.depth = 8;
+ config->image_format.red_mask = 0x00ff0000;
+ config->image_format.green_mask = 0x0000ff00;
+ config->image_format.blue_mask = 0x000000ff;
+ config->image_format.alpha_mask = 0x00000000;
+ break;
+
+ case AV_PIX_FMT_RGB0:
+ case AV_PIX_FMT_RGBA:
+ config->image_format.fourcc = VA_FOURCC_RGBX;
+ config->image_format.byte_order = VA_LSB_FIRST;
+ config->image_format.bits_per_pixel = 32;
+ config->image_format.depth = 8;
+ config->image_format.red_mask = 0x000000ff;
+ config->image_format.green_mask = 0x0000ff00;
+ config->image_format.blue_mask = 0x00ff0000;
+ config->image_format.alpha_mask = 0x00000000;
+ break;
+
+ case AV_PIX_FMT_NV12:
+ config->image_format.fourcc = VA_FOURCC_NV12;
+ config->image_format.bits_per_pixel = 12;
+ break;
+ case AV_PIX_FMT_YUV420P:
+ config->image_format.fourcc = VA_FOURCC_YV12;
+ config->image_format.bits_per_pixel = 12;
+ break;
+
+ default:
+ av_log(ctx, AV_LOG_ERROR, "Tried to configure with invalid input "
+ "format %s.\n", av_get_pix_fmt_name(inlink->format));
+ return AVERROR(EINVAL);
+ }
+
+ config->count = 4;
+ config->width = inlink->w;
+ config->height = inlink->h;
+
+ config->attribute_count = 0;
+
+ if(ctx->output_width == 0)
+ ctx->output_width = inlink->w;
+ if(ctx->output_height == 0)
+ ctx->output_height = inlink->h;
+
+ return 0;
+}
+
+static int vaapi_conv_config_output(AVFilterLink *outlink)
+{
+ AVFilterContext *avctx = outlink->src;
+ VAAPIConvContext *ctx = avctx->priv;
+ AVVAAPISurfaceConfig *config = &ctx->output_config;
+
+ av_assert0(outlink->format == AV_PIX_FMT_VAAPI);
+ outlink->w = ctx->output_width;
+ outlink->h = ctx->output_height;
+
+ config->rt_format = VA_RT_FORMAT_YUV420;
+ config->av_format = AV_PIX_FMT_VAAPI;
+
+ config->image_format.fourcc = VA_FOURCC_NV12;
+ config->image_format.bits_per_pixel = 12;
+
+ config->count = 4;
+ config->width = outlink->w;
+ config->height = outlink->h;
+
+ config->attribute_count = 0;
+
+ return vaapi_conv_config_pipeline(ctx);
+}
+
+static int vaapi_conv_filter_frame(AVFilterLink *inlink, AVFrame *pic)
+{
+ AVFilterContext *avctx = inlink->dst;
+ AVFilterLink *outlink = avctx->outputs[0];
+ VAAPIConvContext *ctx = avctx->priv;
+ AVVAAPISurface *input, *output;
+ AVFrame *input_image, *output_image;
+ VAProcPipelineParameterBuffer params;
+ VABufferID params_id;
+ VAStatus vas;
+ int err;
+
+ av_log(ctx, AV_LOG_DEBUG, "Filter frame: %s, %ux%u.\n",
+ av_get_pix_fmt_name(pic->format), pic->width, pic->height);
+
+ av_vaapi_instance_lock(&ctx->va_instance);
+
+ if(pic->data[3]) {
+ input_image = pic;
+ input = (AVVAAPISurface*)pic->buf[0]->data;
+
+ } else {
+ input_image = av_frame_alloc();
+
+ err = av_vaapi_get_input_surface(&ctx->va_context, input_image);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to allocate surface to "
+ "copy input frame: %d (%s).\n", err, av_err2str(err));
+ goto fail;
+ }
+
+ input = (AVVAAPISurface*)input_image->buf[0]->data;
+
+ err = av_vaapi_map_surface(input, 0);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to map input surface: "
+ "%d (%s).\n", err, av_err2str(err));
+ goto fail;
+ }
+
+ err = av_vaapi_copy_to_surface(pic, input);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to copy to input surface: "
+ "%d (%s).\n", err, av_err2str(err));
+ goto fail;
+ }
+
+ err = av_vaapi_unmap_surface(input, 1);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to unmap input surface: "
+ "%d (%s).\n", err, av_err2str(err));
+ goto fail;
+ }
+ }
+ av_log(ctx, AV_LOG_DEBUG, "Using surface %#x for input image.\n",
+ input->id);
+
+ output_image = av_frame_alloc();
+ if(!output_image) {
+ err = AVERROR(ENOMEM);
+ goto fail;
+ }
+ av_frame_copy_props(output_image, pic);
+
+ err = av_vaapi_get_output_surface(&ctx->va_context, output_image);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to allocate surface for "
+ "output frame: %d (%s).\n", err, av_err2str(err));
+ goto fail;
+ }
+ output = (AVVAAPISurface*)output_image->buf[0]->data;
+ av_log(ctx, AV_LOG_DEBUG, "Using surface %#x for output image.\n",
+ output->id);
+
+ memset(&params, 0, sizeof(params));
+
+ params.surface = input->id;
+ params.surface_region = 0;
+ params.surface_color_standard = VAProcColorStandardNone;
+
+ params.output_region = 0;
+ params.output_background_color = 0xff000000;
+ params.output_color_standard = VAProcColorStandardNone;
+
+ params.pipeline_flags = 0;
+ params.filter_flags = VA_FILTER_SCALING_HQ;
+
+ vas = vaBeginPicture(ctx->va_instance.display, ctx->va_context.context_id,
+ output->id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to attach new picture: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ err = AVERROR_EXTERNAL;
+ goto fail;
+ }
+
+ vas = vaCreateBuffer(ctx->va_instance.display, ctx->va_context.context_id,
+ VAProcPipelineParameterBufferType,
+ sizeof(params), 1, &params, &params_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to create parameter buffer: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ err = AVERROR_EXTERNAL;
+ goto fail;
+ }
+ av_log(ctx, AV_LOG_DEBUG, "Pipeline parameter buffer is %#x.\n",
+ params_id);
+
+ vas = vaRenderPicture(ctx->va_instance.display, ctx->va_context.context_id,
+ &params_id, 1);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to render parameter buffer: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ err = AVERROR_EXTERNAL;
+ goto fail;
+ }
+
+ vas = vaEndPicture(ctx->va_instance.display, ctx->va_context.context_id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to start picture processing: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ err = AVERROR_EXTERNAL;
+ goto fail;
+ }
+
+ vas = vaSyncSurface(ctx->va_instance.display, output->id);
+ if(vas != VA_STATUS_SUCCESS) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to sync picture completion: "
+ "%d (%s).\n", vas, vaErrorStr(vas));
+ err = AVERROR_EXTERNAL;
+ goto fail;
+ }
+
+ av_frame_free(&input_image);
+ if(pic->format != AV_PIX_FMT_VAAPI)
+ av_frame_free(&pic);
+
+ av_vaapi_instance_unlock(&ctx->va_instance);
+
+ return ff_filter_frame(outlink, output_image);
+
+ fail:
+ av_vaapi_instance_unlock(&ctx->va_instance);
+ return err;
+}
+
+static av_cold int vaapi_conv_init(AVFilterContext *avctx)
+{
+ VAAPIConvContext *ctx = avctx->priv;
+ int err;
+
+ err = av_vaapi_instance_init(&ctx->va_instance, 0);
+ if(err) {
+ av_log(ctx, AV_LOG_ERROR, "No VAAPI instance.\n");
+ return err;
+ }
+
+ ctx->output_width = ctx->options.output_size[0];
+ ctx->output_height = ctx->options.output_size[1];
+
+ return 0;
+}
+
+static av_cold void vaapi_conv_uninit(AVFilterContext *avctx)
+{
+ VAAPIConvContext *ctx = avctx->priv;
+
+ if(ctx->pipeline_initialised) {
+ av_vaapi_instance_lock(&ctx->va_instance);
+ av_vaapi_pipeline_uninit(&ctx->va_context);
+ av_vaapi_instance_unlock(&ctx->va_instance);
+ }
+
+ av_vaapi_instance_uninit(&ctx->va_instance);
+}
+
+
+#define OFFSET(member) offsetof(VAAPIConvContext, options.member)
+#define FLAGS (AV_OPT_FLAG_VIDEO_PARAM | AV_OPT_FLAG_FILTERING_PARAM)
+static const AVOption vaapi_conv_options[] = {
+ { "size", "Set output size",
+ OFFSET(output_size), AV_OPT_TYPE_IMAGE_SIZE,
+ { 0 }, 0, 0, FLAGS },
+ { 0 },
+};
+
+static const AVClass vaapi_conv_class = {
+ .class_name = "VAAPI/conv",
+ .item_name = av_default_item_name,
+ .option = vaapi_conv_options,
+ .version = LIBAVUTIL_VERSION_INT,
+};
+
+static const AVFilterPad vaapi_conv_inputs[] = {
+ {
+ .name = "default",
+ .type = AVMEDIA_TYPE_VIDEO,
+ .filter_frame = &vaapi_conv_filter_frame,
+ .config_props = &vaapi_conv_config_input,
+ },
+ { 0 }
+};
+
+static const AVFilterPad vaapi_conv_outputs[] = {
+ {
+ .name = "default",
+ .type = AVMEDIA_TYPE_VIDEO,
+ .config_props = &vaapi_conv_config_output,
+ },
+ { 0 }
+};
+
+AVFilter ff_vf_vaapi_conv = {
+ .name = "vaapi_conv",
+ .description = NULL_IF_CONFIG_SMALL("Convert to/from VAAPI surfaces."),
+ .priv_size = sizeof(VAAPIConvContext),
+ .init = &vaapi_conv_init,
+ .uninit = &vaapi_conv_uninit,
+ .query_formats = &vaapi_conv_query_formats,
+ .inputs = vaapi_conv_inputs,
+ .outputs = vaapi_conv_outputs,
+ .priv_class = &vaapi_conv_class,
+};
--
2.6.4
Continue reading on narkive:
Loading...