[FFmpeg-devel] [PATCH] nvenc: Compensate for hardware trying to mess with aspect ratio of DVD content.

Thanks for testing.

Post by Philip Langdale
---
libavcodec/nvenc.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/libavcodec/nvenc.c b/libavcodec/nvenc.c
index efa3f04..a51ada2 100644
--- a/libavcodec/nvenc.c
+++ b/libavcodec/nvenc.c
@@ -587,6 +587,13 @@ static av_cold int nvenc_encode_init(AVCodecContext *avctx)
ctx->init_encode_params.darWidth = avctx->width;
}
+ // Compensate for hardware assuming playback will be at 704 pixel width.

Nit: de-compensate for buggy encoders that compensate for...

Post by Philip Langdale
+ if (avctx->width == 720 &&
+ (avctx->height == 480 || avctx->height == 576)) {
+ ctx->init_encode_params.darWidth *= 44;
+ ctx->init_encode_params.darHeight *= 45;
+ }

Here, and also in Timo's patch, I would advice to use lavu's rational
operators: they will handle overflow better and reduce the result.

Post by Philip Langdale
+
ctx->init_encode_params.frameRateNum = avctx->time_base.den;
ctx->init_encode_params.frameRateDen = avctx->time_base.num * avctx->ticks_per_frame;

Regards,

--
Nicolas George

Philip Langdale

2015-01-17 20:01:49 UTC

There is a long sad story behind all this, but it's somewhat ambiguous as to
whether DVD content should be treated as 720 pixels wide or 704 pixels, with
16 pixels cut off. If you decide is should be 704 pixels wide, you need to
adjust the sample aspect ratio to keep the final display aspect ratio correct.

For reasons we are not privy too, nvidia decided that the nvenc encoder should
apply this aspect correction, whether you want it to or not. (I guess there
might be a flag for it, but if there is it's not documented). So, if you want
to transcode DVD content at the original size, you need to adjust the aspect
ratio information you pass to the encoder to compensate for their 'correction'.

This 'correction' is only applied to 720x480 and 720x576 content - and it does
so regardless of the input aspect ratio.
---
libavcodec/nvenc.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/libavcodec/nvenc.c b/libavcodec/nvenc.c
index efa3f04..8a0d584 100644
--- a/libavcodec/nvenc.c
+++ b/libavcodec/nvenc.c
@@ -587,6 +587,18 @@ static av_cold int nvenc_encode_init(AVCodecContext *avctx)
ctx->init_encode_params.darWidth = avctx->width;
}

+ // De-compensate for hardware, dubiously, trying to compensate for
+ // playback at 704 pixel width.
+ if (avctx->width == 720 &&
+ (avctx->height == 480 || avctx->height == 576)) {
+ av_reduce(&dw, &dh,
+ ctx->init_encode_params.darWidth * 44,
+ ctx->init_encode_params.darHeight * 45,
+ 1024 * 1204);
+ ctx->init_encode_params.darHeight = dh;
+ ctx->init_encode_params.darWidth = dw;
+ }
+
ctx->init_encode_params.frameRateNum = avctx->time_base.den;
ctx->init_encode_params.frameRateDen = avctx->time_base.num * avctx->ticks_per_frame;

--
2.1.0

Kieran Kunhya

2015-01-17 20:11:33 UTC

BT601 makes this very clear. The active picture is 702 pixels.

Kieran

Nicolas George

2015-01-17 20:42:43 UTC

Post by Kieran Kunhya
BT601 makes this very clear. The active picture is 702 pixels.

There are two fundamental flaws with your reasoning:

First, BT601 only applies to a some kind of videos. Wikipedia tells me it
applies to "encoding interlaced analog video signals in digital video form";
do you think it is wrong? For all other kind of videos, all this nonsense is
just irrelevant.

Second, and that is more important, the SARâDAR conversion is not a matter
of conforming to a certain video standard, it is a matter of basic
arithmetic. Arithmetic has been around a lot longer than BT601, or any
industrial standard for video. If you have a "BT601 720ishÃ576 16/9 video",
then SAR must be (16/9) / (702/576) = 512/351, nothing else; reversely, if
the SAR is 64/45, that means the video is really 720Ã576 for 16/9 without
any nonsense.

Regards,

--
Nicolas George

Kieran Kunhya

2015-01-17 22:49:26 UTC

Post by Kieran Kunhya
BT601 makes this very clear. The active picture is 702 pixels.

First, BT601 only applies to a some kind of videos. Wikipedia tells me it
applies to "encoding interlaced analog video signals in digital video form";
do you think it is wrong? For all other kind of videos, all this nonsense is
just irrelevant.

And the digital signals that are recorded by my camera need to be
backwards compatible with the hundreds of millions of analogue TVs.
This forms almost the entirety of the set of 720x480 or 720x576 videos.

I don't make the standards and frankly whether you dislike them is
your problem but they exist and need to work correctly.
Instead you wish to break things based off an artificial test pattern
and your own beliefs.

Wikipedia seems to be wrong in a number of areas.

Kieran

Nicolas George

2015-01-17 23:00:26 UTC

Post by Kieran Kunhya
I don't make the standards and frankly whether you dislike them is
your problem but they exist and need to work correctly.
Instead you wish to break things based off an artificial test pattern
and your own beliefs.

I do not wish to break things, I wish to fix them correctly.

The so-called fix in nvenc is completely wrong, because it happens at the
wrong place: it introduces exceptional cases, it violates the rules of
arithmetic, it is not documented, it affects cases that should be left alone
and misses cases it should handle, it behaved differently than other parts
of the code. Is it enough, or do you need more reasons?

If the aspect ratio needs to be fixed, it must happen as soon as the video
enters FFmpeg, i.e. at the demuxer or decoder level. Any other place is
wrong.

As is, Philip's patch fixes nvenc's misbehaviour, making it consistent with
arithmetic and other encoders: the latest version should be applied soon.
You, or anyone else who knows what they are doing, can file another patch to
properly fix the issue, if it happens there is anything to fix.

Regards,

--
Nicolas George

Kieran Kunhya

2015-01-17 23:30:38 UTC

The behaviour of the Nvidia code I believe is correct.
As far as I understand it corrects SAR for 720-width content to comply
with BT601.

Kieran

Nicolas George

2015-01-17 23:38:38 UTC

Post by Kieran Kunhya
The behaviour of the Nvidia code I believe is correct.
As far as I understand it corrects SAR for 720-width content to comply
with BT601.

That is just not true.

Basic principle: THE COMPETENT USER IS RIGHT.

If someone knows BT601, and knowing it decide to produce a video that does
not conform to it, for example writing "testsrc=s=720x576,setsar=64/45",
then the software must comply.

There is a very simple way of flagging content that is supposed to comply
with BT601: the SAR is 512/351. If SAR is 64/45, that means someone before
nvenc decided that the video is not expected to conform with BT601, and
nvenc has no right to decide otherwise.

Regards,

--
Nicolas George

Kieran Kunhya

2015-01-17 23:46:21 UTC

Post by Kieran Kunhya
The behaviour of the Nvidia code I believe is correct.
As far as I understand it corrects SAR for 720-width content to comply
with BT601.

No, the user who doesn't want BT601 should override it.

You don't realise this but there are many orders of magnitude more
BT601 content than any other content with a 720x480 or 720x576
resolution. This is what the vast majority of users are processing
with FFmpeg.

Post by Nicolas George
There is a very simple way of flagging content that is supposed to comply
with BT601: the SAR is 512/351. If SAR is 64/45, that means someone before
nvenc decided that the video is not expected to conform with BT601, and
nvenc has no right to decide otherwise.

You fail to realise this but SARs are not necessarily arithmetic for
historical reasons.

Kieran

Kieran Kunhya

2015-01-17 23:48:31 UTC

Post by Kieran Kunhya
The behaviour of the Nvidia code I believe is correct.
As far as I understand it corrects SAR for 720-width content to comply
with BT601.

Oops I misunderstood, you mean the software must comply with the users wishes.
Anyway my argument is that BT601 should be the default for the these
resolutions.

Kieran

Nicolas George

2015-01-18 00:02:10 UTC

Post by Kieran Kunhya
Oops I misunderstood, you mean the software must comply with the users wishes.

Yes.

Post by Kieran Kunhya
Anyway my argument is that BT601 should be the default for the these
resolutions.

If you want, but that must happen immediately when the contents enters into
FFmpeg's data structure, not at a random point in the processing chain.

In other words, it should happen immediately when FFmpeg reads contents that
is "probably" BT601:

Input stream #0:0: Video: mpeg2video (Main), yuv420p(tv),
720x576 [SAR 64:45 DAR 16:9]

should be fixed into:

Input stream #0:0: Video: mpeg2video (Main), yuv420p(tv),
720x576 [SAR 512:351 DAR 640:351]

That way, the laws of geometry and arithmetic are respected, the size of the
pixels is correct, and it will work will all encoders, not just nvenc.

Regards,

--
Nicolas George

Hendrik Leppkes

2015-01-18 01:52:59 UTC

Post by Kieran Kunhya
Oops I misunderstood, you mean the software must comply with the users

wishes.
Yes.

Post by Kieran Kunhya
Anyway my argument is that BT601 should be the default for the these
resolutions.

If you want, but that must happen immediately when the contents enters into
FFmpeg's data structure, not at a random point in the processing chain.

Without arguing for or against BT.601 behaviour...

nvenc should behave the same as libx264, or any other video encoder, if
this patch makes it do that, then it should be applied.

If bt601 needs special handling not yet present in avcodec, it should be
implemented in such a way that all encoders can benefit from it.

- Hendrik

Nicolas George

2015-01-18 09:40:24 UTC

Post by Hendrik Leppkes
nvenc should behave the same as libx264, or any other video encoder, if
this patch makes it do that, then it should be applied.
If bt601 needs special handling not yet present in avcodec, it should be
implemented in such a way that all encoders can benefit from it.

That was one of the points I was trying to make, thanks for explaining it
more clearly.

Two additional points:

According to Philip's tests, nvenc adds a 45/44 = 720/704 factor, not
720/702 = 40/39, so it does not conform to BT.601 anyway.

I have downloaded BT.601 from <URL: http://www.itu.int/rec/R-REC-BT.601/ >,
both the most recent and the oldest versions, and I did not find any
explicit reference of this 702 business, nor could I decipher indirect
references, so I still wonder where it comes from exactly.

(I had previously assumed it would be insanely expensive like the MPEG
standards, but it is free and short. On the other hand, I do not expect
people who do not bother to properly configure their character encodings to
produce good standards.)

Regards,

--
Nicolas George

Kieran Kunhya

2015-01-18 11:16:04 UTC

That was one of the points I was trying to make, thanks for explaining it
more clearly.
According to Philip's tests, nvenc adds a 45/44 = 720/704 factor, not
720/702 = 40/39, so it does not conform to BT.601 anyway.

Why is it so hard for you to realise that the SARs are not
arithmetically derived?

Post by Nicolas George
I have downloaded BT.601 from <URL: http://www.itu.int/rec/R-REC-BT.601/ >,
both the most recent and the oldest versions, and I did not find any
explicit reference of this 702 business, nor could I decipher indirect
references, so I still wonder where it comes from exactly.

You can make the appropriate calculations *and simplifications* for
BT601. There are plenty of books with the appropriate derivations.

Kieran

Michael Niedermayer

2015-01-18 12:31:48 UTC

Why is it so hard for you to realise that the SARs are not
arithmetically derived?

Rec. ITU-T H.264 (04/2013)

3.133 sample aspect ratio: Specifies, for assisting the display process, which is not specified in this
Recommendation | International Standard, the ratio between the intended horizontal distance between the
columns and the intended vertical distance between the rows of the luma sample array in a frame. Sample
aspect ratio is expressed as h:v, where h is horizontal width and v is vertical height (in arbitrary units of spatial
distance).

3.62 informative: A term used to refer to content provided in this Recommendation | International Standard that is
not an integral part of this Recommendation | International Standard. Informative content does not establish
any mandatory requirements for conformance to this Recommendation | International Standard.

aspect_ratio_idc specifies the value of the sample aspect ratio of the luma samples. Table E-1 shows the meaning of the
code. When aspect_ratio_idc indicates Extended_SAR, the sample aspect ratio is represented by sar_width : sar_height.
When the aspect_ratio_idc syntax element is not present, aspect_ratio_idc value shall be inferred to be equal to 0.
Table E-1 Meaning of sample aspect ratio indicator
aspect_ratio_idc Sample aspect (informative)
ratio Examples of use
0 Unspecified
1 1:1 7680x4320 16:9 frame without horizontal overscan
("square") 3840x2160 16:9 frame without horizontal overscan
1280x720 16:9 frame without horizontal overscan
1920x1080 16:9 frame without horizontal overscan (cropped from 1920x1088)
640x480 4:3 frame without horizontal overscan
2 12:11 720x576 4:3 frame with horizontal overscan
352x288 4:3 frame without horizontal overscan
3 10:11 720x480 4:3 frame with horizontal overscan
352x240 4:3 frame without horizontal overscan
4 16:11 720x576 16:9 frame with horizontal overscan
528x576 4:3 frame without horizontal overscan
5 40:33 720x480 16:9 frame with horizontal overscan
528x480 4:3 frame without horizontal overscan
6 24:11 352x576 4:3 frame without horizontal overscan
480x576 16:9 frame with horizontal overscan
7 20:11 352x480 4:3 frame without horizontal overscan
480x480 16:9 frame with horizontal overscan
8 32:11 352x576 16:9 frame without horizontal overscan
9 80:33 352x480 16:9 frame without horizontal overscan
10 18:11 480x576 4:3 frame with horizontal overscan
11 15:11 480x480 4:3 frame with horizontal overscan
12 64:33 528x576 16:9 frame without horizontal overscan
13 160:99 528x480 16:9 frame without horizontal overscan
14 4:3 1440x1080 16:9 frame without horizontal overscan
15 3:2 1280x1080 16:9 frame without horizontal overscan
16 2:1 960x1080 16:9 frame without horizontal overscan
17..254 Reserved
255 Extended_SAR
NOTE 1 For the examples in Table E-1, the term "without horizontal overscan" refers to display processes in which the display
area matches the area of the cropped decoded pictures and the term "with horizontal overscan" refers to display processes in which
some parts near the left and/or right border of the cropped decoded pictures are not visible in the display area. As an example, the
entry "720x576 4:3 frame with horizontal overscan" for aspect_ratio_idc equal to 2 refers to having an area of 704x576 luma
samples (which has an aspect ratio of 4:3) of the cropped decoded frame (720x576 luma samples) that is visible in the display
area.
sar_width indicates the horizontal size of the sample aspect ratio (in arbitrary units).
sar_height indicates the vertical size of the sample aspect ratio (in the same arbitrary units as sar_width).
sar_width and sar_height shall be relatively prime or equal to 0. When aspect_ratio_idc is equal to 0 or sar_width is
equal to 0 or sar_height is equal to 0, the sample aspect ratio shall be considered unspecified by this Recommendation |
International Standard.

overscan_info_present_flag equal to 1 specifies that the overscan_appropriate_flag is present. When
overscan_info_present_flag is equal to 0 or is not present, the preferred display method for the video signal is
unspecified.
overscan_appropriate_flag equal to 1 indicates that the cropped decoded pictures output are suitable for display using
overscan. overscan_appropriate_flag equal to 0 indicates that the cropped decoded pictures output contain visually
important information in the entire region out to the edges of the cropping rectangle of the picture, such that the cropped
decoded pictures output should not be displayed using overscan. Instead, they should be displayed using either an exact
match between the display area and the cropping rectangle, or using underscan. As used in this paragraph, the term
"overscan" refers to display processes in which some parts near the borders of the cropped decoded pictures are not
visible in the display area. The term "underscan" describes display processes in which the entire cropped decoded
pictures are visible in the display area, but they do not cover the entire display area. For display processes that neither use
overscan nor underscan, the display area exactly matches the area of the cropped decoded pictures.
NOTE 2 For example, overscan_appropriate_flag equal to 1 might be used for entertainment television programming, or for a
live view of people in a videoconference, and overscan_appropriate_flag equal to 0 might be used for computer screen capture or
security camera content.

[...]

--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

There will always be a question for which you do not know the correct answer.

compn

2015-01-18 13:50:53 UTC

On Sun, 18 Jan 2015 10:40:24 +0100

http://www.itu.int/rec/R-REC-BT.601/ >, both the most recent and the
oldest versions, and I did not find any explicit reference of this
702 business, nor could I decipher indirect references, so I still
wonder where it comes from exactly.

i've seen the "SD content is bt.601 and HD content is bt.709" edict
before, but i do not know where it comes from either.

-compn

Michael Niedermayer

2015-01-18 12:22:43 UTC

Post by Kieran Kunhya
Oops I misunderstood, you mean the software must comply with the users

wishes.
Yes.

Post by Kieran Kunhya
Anyway my argument is that BT601 should be the default for the these
resolutions.

If you want, but that must happen immediately when the contents enters into
FFmpeg's data structure, not at a random point in the processing chain.

Without arguing for or against BT.601 behaviour...
nvenc should behave the same as libx264, or any other video encoder, if
this patch makes it do that, then it should be applied.

Yes, and AFAICS an earlier mail from philip indicates this is the case
"
Stream #0:0: Video: rawvideo (RGB[24] / 0x18424752), rgb24, 720x576
[SAR 64:45 DAR 16:9], 25 tbr, 25 tbn, 25 tbc

With libx264, I get the same SAR and DAR out.

With nvenc, I get:

sample_aspect_ratio=16:11
display_aspect_ratio=20:11
"

In above example the user asks for a NON BT.601 format 720x576 at
SAR 64:45 and a DAR related to 720x576 NOT 702 or 704x576 of 16:9
yet nvenc gives her something else

[...]

--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The bravest are surely those who have the clearest vision
of what is before them, glory and danger alike, and yet
notwithstanding go out to meet it. -- Thucydides

Kieran Kunhya

2015-01-18 13:28:36 UTC

Post by Kieran Kunhya
Oops I misunderstood, you mean the software must comply with the users

wishes.
Yes.

Post by Kieran Kunhya
Anyway my argument is that BT601 should be the default for the these
resolutions.

If you want, but that must happen immediately when the contents enters into
FFmpeg's data structure, not at a random point in the processing chain.

Without arguing for or against BT.601 behaviour...
nvenc should behave the same as libx264, or any other video encoder, if
this patch makes it do that, then it should be applied.

Yes, and AFAICS an earlier mail from philip indicates this is the case
"
Stream #0:0: Video: rawvideo (RGB[24] / 0x18424752), rgb24, 720x576
[SAR 64:45 DAR 16:9], 25 tbr, 25 tbn, 25 tbc
With libx264, I get the same SAR and DAR out.
sample_aspect_ratio=16:11
display_aspect_ratio=20:11
"
In above example the user asks for a NON BT.601 format 720x576 at
SAR 64:45 and a DAR related to 720x576 NOT 702 or 704x576 of 16:9
yet nvenc gives her something else

Here's the point where I give up and start my own fork where things
are done properly.

Michael Niedermayer

2015-01-18 13:55:01 UTC

Post by Kieran Kunhya
Oops I misunderstood, you mean the software must comply with the users

wishes.
Yes.

Post by Kieran Kunhya
Anyway my argument is that BT601 should be the default for the these
resolutions.

If you want, but that must happen immediately when the contents enters into
FFmpeg's data structure, not at a random point in the processing chain.

Without arguing for or against BT.601 behaviour...
nvenc should behave the same as libx264, or any other video encoder, if
this patch makes it do that, then it should be applied.

Yes, and AFAICS an earlier mail from philip indicates this is the case
"
Stream #0:0: Video: rawvideo (RGB[24] / 0x18424752), rgb24, 720x576
[SAR 64:45 DAR 16:9], 25 tbr, 25 tbn, 25 tbc
With libx264, I get the same SAR and DAR out.
sample_aspect_ratio=16:11
display_aspect_ratio=20:11
"
In above example the user asks for a NON BT.601 format 720x576 at
SAR 64:45 and a DAR related to 720x576 NOT 702 or 704x576 of 16:9
yet nvenc gives her something else

Here's the point where I give up and start my own fork where things
are done properly.

please elaborate about where you disagree

I think theres a misunderstanding between people on what some of the
aspect ratios mean and what they refer to

[...]

--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Let us carefully observe those good qualities wherein our enemies excel us
and endeavor to excel them, by avoiding what is faulty, and imitating what
is excellent in them. -- Plutarch

Kieran Kunhya

2015-01-18 14:13:00 UTC

Post by Kieran Kunhya
Oops I misunderstood, you mean the software must comply with the users

wishes.
Yes.

Post by Kieran Kunhya
Anyway my argument is that BT601 should be the default for the these
resolutions.

If you want, but that must happen immediately when the contents enters into
FFmpeg's data structure, not at a random point in the processing chain.

Without arguing for or against BT.601 behaviour...
nvenc should behave the same as libx264, or any other video encoder, if
this patch makes it do that, then it should be applied.

Yes, and AFAICS an earlier mail from philip indicates this is the case
"
Stream #0:0: Video: rawvideo (RGB[24] / 0x18424752), rgb24, 720x576
[SAR 64:45 DAR 16:9], 25 tbr, 25 tbn, 25 tbc
With libx264, I get the same SAR and DAR out.
sample_aspect_ratio=16:11
display_aspect_ratio=20:11
"
In above example the user asks for a NON BT.601 format 720x576 at
SAR 64:45 and a DAR related to 720x576 NOT 702 or 704x576 of 16:9
yet nvenc gives her something else

Here's the point where I give up and start my own fork where things
are done properly.

please elaborate about where you disagree
I think theres a misunderstanding between people on what some of the
aspect ratios mean and what they refer to

Read
https://web.archive.org/web/20130328121216/http://lipas.uwasa.fi/~f76998/video/conversion/

and ETSI TS 101 154 (specifically the section on h264 aspect_ratio_idc):
www.etsi.org/deliver/etsi_ts/101100_101199/101154/01.11.01_60/ts_101154v011101p.pdf

Kieran

Nicolas George

2015-01-17 23:52:45 UTC

You fail to realise this but SARs are not necessarily arithmetic for
historical reasons.

Care to explain how historical reasons can change the laws of geometry?

If a visible surface with a physical aspect ratio of 16/9 is cut into
702×576 identical rectangular pixels, then the aspect ratio of each pixel is
(16/9) / (702/576) = 512/351. Can you give one good reason for FFmpeg to
use, internally, any other value?

Regards,

--
Nicolas George

tim nicholson

2015-01-19 15:16:01 UTC

You fail to realise this but SARs are not necessarily arithmetic for
historical reasons.

Care to explain how historical reasons can change the laws of geometry?
If a visible surface with a physical aspect ratio of 16/9 is cut into
702×576 identical rectangular pixels, then the aspect ratio of each pixel is
(16/9) / (702/576) = 512/351. Can you give one good reason for FFmpeg to
use, internally, any other value?

I am struggling to follow some of the arguments, which seem far to
heated to be as useful as they could be.

However, as I understand the issue, whilst the above maths sound fine,
the visible surface mentioned above is framed within an actual surface
of 720x576 pixels which includes the 'digital overscan'

Taking the Sample/pixel aspect ratio (SAR) calculated above and feeding
that back into calculating a display aspect ratio *for the whole frame*
leads to a value that is not 16:9 (or 4:3), and it appears to be this
figure which is then reported as the DAR, rather than that of the active
picture. This is I think the disjoint between DAR and SAR that Kieran
refers to, there is only a mathematical relationship if you include the
overscan.

It would appear that in some cases the SAR is calculated back from a
specified DAR of 16/9 being applied as the aspect ratio of the whole
frame and this seems to be the usual cause of issues (Adobe had to admit
they had got it wrong for years as they had assumed that 720x576 was 4/3
aspect ratio).

I am not sure what is currently wrong, and where, but please let us not
'fix' one scenario with a dirty hack that breaks hoers.

Post by Nicolas George
Regards,

- --
Tim.
Key Fingerprint 38CF DB09 3ED0 F607 8B67 6CED 0C0B FC44 8B0B FC83

Nicolas George

2015-01-19 15:18:31 UTC

Post by tim nicholson
I am struggling to follow some of the arguments, which seem far to
heated to be as useful as they could be.

Please see my other mail, I summarize it more extensively.

Post by tim nicholson
I am not sure what is currently wrong, and where, but please let us not
'fix' one scenario with a dirty hack that breaks hoers.

That is exactly what nvidia is doing and the patch is trying to un-do.

Regards,

--
Nicolas George

Kieran Kunhya

2015-01-19 23:58:05 UTC

Post by tim nicholson
Taking the Sample/pixel aspect ratio (SAR) calculated above and feeding
that back into calculating a display aspect ratio *for the whole frame*
leads to a value that is not 16:9 (or 4:3), and it appears to be this
figure which is then reported as the DAR, rather than that of the active
picture. This is I think the disjoint between DAR and SAR that Kieran
refers to, there is only a mathematical relationship if you include the
overscan.

At least for the DVB/DVD/Blu-ray mandated aspect ratios (e.g 12:11 and
16:11 for PAL) you have to use further approximations (which is what
the nvidia driver is doing).

Kieran

Agatha Hu

2015-01-21 06:17:09 UTC

Post by Philip Langdale
There is a long sad story behind all this, but it's somewhat ambiguous as to
whether DVD content should be treated as 720 pixels wide or 704 pixels, with
16 pixels cut off. If you decide is should be 704 pixels wide, you need to
adjust the sample aspect ratio to keep the final display aspect ratio correct.
For reasons we are not privy too, nvidia decided that the nvenc encoder should
apply this aspect correction, whether you want it to or not. (I guess there
might be a flag for it, but if there is it's not documented). So, if you want
to transcode DVD content at the original size, you need to adjust the aspect
ratio information you pass to the encoder to compensate for their 'correction'.
This 'correction' is only applied to 720x480 and 720x576 content - and it does
so regardless of the input aspect ratio.
---
libavcodec/nvenc.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/libavcodec/nvenc.c b/libavcodec/nvenc.c
index efa3f04..8a0d584 100644
--- a/libavcodec/nvenc.c
+++ b/libavcodec/nvenc.c
@@ -587,6 +587,18 @@ static av_cold int nvenc_encode_init(AVCodecContext *avctx)
ctx->init_encode_params.darWidth = avctx->width;
}
+ // De-compensate for hardware, dubiously, trying to compensate for
+ // playback at 704 pixel width.
+ if (avctx->width == 720 &&
+ (avctx->height == 480 || avctx->height == 576)) {
+ av_reduce(&dw, &dh,
+ ctx->init_encode_params.darWidth * 44,
+ ctx->init_encode_params.darHeight * 45,
+ 1024 * 1204);
+ ctx->init_encode_params.darHeight = dh;
+ ctx->init_encode_params.darWidth = dw;
+ }
+
ctx->init_encode_params.frameRateNum = avctx->time_base.den;
ctx->init_encode_params.frameRateDen = avctx->time_base.num * avctx->ticks_per_frame;

Here's the reply from NVENC engineers
It is not same thing as SAR. It is the display aspect ratio i.e
width/height = DAR/SAR

The calculation below should be like

If (avctx->sample_aspect_ratio.num > 0 &&
avctx->sample_aspect_ratio.den > 0 )
av_reduce(&dw, &dh, avctx->sample_aspect_ratio.num * avctx->width,
avctx->sample_aspect_ratio.den * avctx->height, INT_MAX);

else
av_reduce(&dw, &dh, avctx->width, avctx->height, INT_MAX);

ctx->init_encode_params.darHeight = dw;
ctx->init_encode_params.darWidth = dh;

Agatha Hu

Reimar Döffinger

2015-01-21 08:00:36 UTC

Post by Agatha Hu

Agatha Hu

2015-01-22 04:15:44 UTC

Post by Reimar DÃ¶ffinger

Post by Agatha Hu
Here's the reply from NVENC engineers
It is not same thing as SAR. It is the display aspect ratio i.e width/height = DAR/SAR
The calculation below should be like
If (avctx->sample_aspect_ratio.num > 0 && avctx->sample_aspect_ratio.den > 0 )
av_reduce(&dw, &dh, avctx->sample_aspect_ratio.num * avctx->width, avctx->sample_aspect_ratio.den * avctx->height, INT_MAX);

That doesn't answer anything.
Doing that, to keep the problem description short, creates different results from libx264 and any other encoder we have for DVD resolution which is highly undesirable for consistency and compatibility reasons if nothing else.
_______________________________________________
ffmpeg-devel mailing list
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

We will fix the issue in driver, overscan compensation will be applied
to input DAR *only* if the DAR is 4:3 or 16:9, otherwise won't
unnecessarily modify the aspect ratio for resolutions like 720x480 and
720x576.
The fix will be on future driver, I'll let you know when the branch is
released.

Agatha Hu

Philip Langdale

2015-01-22 04:29:25 UTC

On Thu, 22 Jan 2015 12:15:44 +0800

Post by Agatha Hu
We will fix the issue in driver, overscan compensation will be
applied to input DAR *only* if the DAR is 4:3 or 16:9, otherwise
won't unnecessarily modify the aspect ratio for resolutions like
720x480 and 720x576.
The fix will be on future driver, I'll let you know when the branch
is released.

Please, no. We do not want overscan compensation to be done by nvenc at
all. The compensation makes nvenc inconsistent with *all* other
encoders in ffmpeg. For every other encoder, if my input is 720x576 at
[64/45], then my output is the same. It does not do anyone any favours
for nvenc to implicitly alter the output DAR.

There is a separate discussion about whether DAR 'compensation' is
desirable in an encoder or not, but even if that discussion results in
a decision to apply DAR compensation at encode time, it will be done
globally in ffmpeg for all encoders - ie: at a layer above any
individual codec.

If nvidia feels it is important to offer this compensation 'feature',
then I would ask for a configuration option to turn it off - otherwise
we will need the 'de-compensation' logic in the patch I posted.

Thanks,

--phil

Agatha Hu

2015-01-22 05:12:59 UTC

Post by Philip Langdale
On Thu, 22 Jan 2015 12:15:44 +0800

Please, no. We do not want overscan compensation to be done by nvenc at
all. The compensation makes nvenc inconsistent with *all* other
encoders in ffmpeg. For every other encoder, if my input is 720x576 at
[64/45], then my output is the same. It does not do anyone any favours
for nvenc to implicitly alter the output DAR.
There is a separate discussion about whether DAR 'compensation' is
desirable in an encoder or not, but even if that discussion results in
a decision to apply DAR compensation at encode time, it will be done
globally in ffmpeg for all encoders - ie: at a layer above any
individual codec.
If nvidia feels it is important to offer this compensation 'feature',
then I would ask for a configuration option to turn it off - otherwise
we will need the 'de-compensation' logic in the patch I posted.
Thanks,
--phil

Disabling compensation is the best result, adding a flag is acceptable,
but NVENC5.0 has just released, such option won't be available until 6.0
is available, so...
Still waiting for nvenc teams feedback, the worst case is adding the
'de-compensation' logic for 4:3 and 16:9

Agatha Hu

Philip Langdale

2015-01-22 17:22:32 UTC

On Thu, 22 Jan 2015 13:12:59 +0800

Post by Agatha Hu
Disabling compensation is the best result, adding a flag is
acceptable, but NVENC5.0 has just released, such option won't be
available until 6.0 is available, so...
Still waiting for nvenc teams feedback, the worst case is adding the
'de-compensation' logic for 4:3 and 16:9

If they can get a flag in eventually, that's still a good improvement.
We'll need the de-compensation logic in the short term, regardless of
anything else.

Thanks,

--phil

Reimar Döffinger

2015-01-22 04:41:11 UTC

Post by Reimar DÃ¶ffinger

While that will make the issue more rare, my problem description description above and the issue we have with it will not be fixed by that. In fact it will just make the work-around more messy.
nvenc would still be the only encoder to modify DAR in this way and DAR will change when people switch between x264 (or any other codec) and nvenc.
As a result, nvenc cannot be used as a drop-in replacement for x264 without a hack to undo your DAR modifications.
As said: it doesn't even matter whether what you do is correct or not, the fact that you are doing it while nobody else does is an issue in itself.
(if everyone agreed that it was clearly without question the right thing to do then changing all other encoder implementations would be an option, however that isn't the case and it would be quite some code duplication to do it in the encoders instead of a level or two above)

Reimar Döffinger

2015-01-18 17:41:09 UTC

Oh my god, this is so horribly broken I vote for doing a "return
-EBROKENAPI" and tell them we'll enable this when NVidia fixes their
stuff.
IMHO some features are not worth the hacks necessary.

compn

2015-01-19 00:16:55 UTC

On Sun, 18 Jan 2015 18:41:09 +0100

Post by Reimar DÃ¶ffinger
Oh my god, this is so horribly broken I vote for doing a "return
-EBROKENAPI" and tell them we'll enable this when NVidia fixes their
stuff.
IMHO some features are not worth the hacks necessary.

[08:51] <compn> just let nvidia change aspect, its their crap, let them
fix it

[08:57] <kierank> compn: the nvidia behaviour is correct
[08:57] <kierank> so no patches are necessary

-compn

Reimar Döffinger

2015-01-19 06:48:24 UTC

Post by compn
On Sun, 18 Jan 2015 18:41:09 +0100

[08:51] <compn> just let nvidia change aspect, its their crap, let them
fix it
[08:57] <kierank> compn: the nvidia behaviour is correct
[08:57] <kierank> so no patches are necessary

I don't really consider it "correct" if it doesn't do what a user expects, plus as I understood the discussion neither ffplay nor MPlayer nor VLC and probably WMP, flash player etc. will not show the video correctly if encoding e.g. a BluRay to DVD resolution via nvenc and playing it back.
If they do I'll have to reconsider, but if they don't it isn't working, "correct" or not.

Philip Langdale

2015-01-19 07:10:40 UTC

Post by Reimar DÃ¶ffinger

On Mon, 19 Jan 2015 07:48:24 +0100

Post by compn
On Sun, 18 Jan 2015 18:41:09 +0100

Post by Reimar DÃ¶ffinger
Oh my god, this is so horribly broken I vote for doing a "return
-EBROKENAPI" and tell them we'll enable this when NVidia fixes
their stuff.
IMHO some features are not worth the hacks necessary.

[08:51] <compn> just let nvidia change aspect, its their crap, let
them fix it
[08:57] <kierank> compn: the nvidia behaviour is correct
[08:57] <kierank> so no patches are necessary

I don't really consider it "correct" if it doesn't do what a user
expects, plus as I understood the discussion neither ffplay nor
MPlayer nor VLC and probably WMP, flash player etc. will not show the
video correctly if encoding e.g. a BluRay to DVD resolution via nvenc
and playing it back. If they do I'll have to reconsider, but if they
don't it isn't working, "correct" or not.

Right. And it's obviously wrong if you consider that trying to encode
any square pixel content at this size results in non-square output.

I'd also argue that bt.601 semantics come into play when you're
authoring the DVD, and any transcoding you do after that should
preserve the input characteristics - and that means the encoded SAR is
based on a 720 pixel width. Why on earth my transcoded file should have
a different SAR from the original when doing a digital->digital
conversion is beyond me.

--phil

Nicolas George

2015-01-19 13:42:31 UTC

Post by Philip Langdale
Right. And it's obviously wrong if you consider that trying to encode
any square pixel content at this size results in non-square output.
I'd also argue that bt.601 semantics come into play when you're
authoring the DVD, and any transcoding you do after that should
preserve the input characteristics - and that means the encoded SAR is
based on a 720 pixel width. Why on earth my transcoded file should have
a different SAR from the original when doing a digital->digital
conversion is beyond me.

I believe part of Kieran's point is: you do not encode to anamorphic 720×576
unless you are actually authoring a DVD, and in that case it should conform
to BT.601.

This makes sense: if you are starting with nice square pixels from HD
contents, why on Earth would you stretch them to an absurd resolution?

On the other hand, if you are authoring a DVD, you are encoding to MPEG-2,
not H.264, and therefore you are not using nvenc, so this is still absurd.

And of course, the fact that is behaves differently than every single other
encoder makes it a braindead design, even if the result is, in some regards,
considered correct.

Regards,

--
Nicolas George

Nicolas George

2015-01-19 13:33:52 UTC