Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add restriction on 'clap' property #232

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

Conversation

leo-barnes
Copy link
Collaborator

@leo-barnes leo-barnes commented Sep 11, 2024

This closes #188


Preview | Diff

Copy link
Collaborator

@y-guyon y-guyon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea.

It should be part of the "Changes" section.

index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated

The semantics of the clean aperture property ('clap') as defined in [[!HEIF]] apply. In addition to the restrictions on transformative item property ordering specified in [[!MIAF]], the following restriction also applies:

<assert>The origin of the 'clap' item property shall be anchored to 0,0 (top-left) unless the full, un-cropped image item is included as a secondary item in an image collection.</assert>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternative: essential shall be equal to 0 for a 'clap' item property. Readers shall treat a clap item property as non-essential even if essential is 1.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would mean it's fine for a parser to always ignore the clap, even when the item is an input to some other derived item.

It would also mean it's fine for a parser to ignore a top-left anchored clap, which would lead to padding being shown for images that need the clap to deal with HW limitations.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I was thinking about implementations that parse but ignore the clap values and that do not expose the secondary non-hidden item.

As an example there is the libavif API that decodes to an avifImage which has a clap field. That clap field is ignored by avifdec which always outputs a PNG containing the full, uncropped image. I guess this is not compliant but I can imagine others doing the same, just accessing the pixels and ignoring the multiple obscure metadata chunks.

So I would have liked to adapt the spec to just accept the current state of things, at least for items that do not have further transformations, but that is beyond this PR I guess.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. I see your point.

I think the real fix would be for avifdec to apply the crop when writing to PNG though. From what I understand from the conversation where Chrome resurrected support for clap, there was a real use-case where it was needed for some HW that had size limitations. The current behaviour from avifdec will decode and show the padding for those files which I don't think is the correct behaviour.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is Chrome applying clap now? AOMediaCodec/libavif#1177 can be closed then.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is Chrome applying clap now? AOMediaCodec/libavif#1177 can be closed then.

I think so at least:
chromium/chromium@3a13337

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

index.bs Outdated

The semantics of the clean aperture property ('clap') as defined in [[!HEIF]] apply. In addition to the restrictions on transformative item property ordering specified in [[!MIAF]], the following restriction also applies:

<assert>The origin of the 'clap' item property shall be anchored to 0,0 (top-left) unless the full, un-cropped image item is included as a secondary item in an image collection.</assert>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As said during the call, I would have liked stronger constraints such as dimensions but this is too complicated.

Copy link
Collaborator

@wantehchang wantehchang Sep 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confirm that Chrome requires the clean aperture to be anchored at (0, 0) (top left) of the coded input image, and does not impose other requirements on the clean aperture.

We can modify Chrome to require that less than 64 pixels are cropped from the width or height of the coded input image. This will support encoders that require width or height to be a multiple of 64. After gaining experience with this constraint in Chrome, we can then add it to the AVIF spec.

Copy link
Collaborator Author

@leo-barnes leo-barnes Sep 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wantehchang

We can modify Chrome to require that less than 64 pixels are cropped from the width or height of the coded input image. This will support encoders that require width or height to be a multiple of 64. After gaining experience with this constraint in Chrome, we can then add it to the AVIF spec.

I understand the want for having stronger constraints, but we've had stuff like that come back and bite us multiple times in the past. We currently know of no AV1 HW that requires larger alignment than 64, but for HEVC we have had HW that requires horizontal alignments of:

  • 64
  • 128
  • 192
  • 256

And vertical alignments of:

  • 64
  • 96
  • 128

Assuming that all future AV1 HW will stick to 64x64 sounds like a dangerous assumption to me.


Threat:
The main thing we want to protect against is some kind of editing service that promises to do lossless cropping of your image and that simply applies a clap to the file without the user knowing that the full image is still present.

Valid use cases we want to be able to support:

  • Dealing with HW limitations
  • Useful tool for more advanced capture concepts

From my memory of the discussion, everyone could agree on two mitigations that would more or less completely mitigate the threat while still supporting the valid use-cases:

  • Limit the clap to be anchored to the top-left corner. (This makes it useless as a general purpose cropping tool)

unless

  • The full un-cropped image is also present and visible in the file. (This ensures that if used for generic cropping, the user is made aware that the content is still present.)

@leo-barnes
Copy link
Collaborator Author

From meeting: Maybe we need to specify that the anchoring is top-left of input rather than output


The semantics of the clean aperture property ('clap') as defined in [[!HEIF]] apply. In addition to the restrictions on transformative item property ordering specified in [[!MIAF]], the following restriction also applies:

<assert>The origin of the 'clap' item property shall be anchored to 0,0 (top-left) of the input image unless the full, un-cropped image item is included as a secondary non-hidden image item.</assert>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leo, Yannis: Is it straightforward to check the second condition? It would be good to try implementing this check (at least mentally) to see how hard it is for compliant applications to enforce this requirement.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.

From HEIF:

A hidden image item has (flags & 1) equal to 1 in its ItemInfoEntry.

So that it is easy to check.
The presence of a clap property and whether it covers the full image should be straightforward too.

However an image item can be hidden or not, and cropped or not, but these are properties tied to the item, so I do not see how to have a cropped version and a non-cropped version of the same image item. Is it why collections were mentioned at first? Can one crop a collection? I guess it could be done through iden for example but checking if the full image is available through derived image items sounds a bit of work.
Should it be "two image items referring to the same coded image bytes"?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it could be done through iden for example but checking if the full image is available through derived image items sounds a bit of work.
Should it be "two image items referring to the same coded image bytes"?

The way I envision someone doing this is per NOTE 2 in HEIF section 6.6.2.1 Identity derivation:

NOTE 2 A derived image of type 'iden' can be used, for example, when it is desirable to
have both the original version of a coded image item and a cropped version of the same
coded image item (obtained through the 'clap' transformative item property) as non-
hidden image items.

There are other ways it could be done, but this feels like the most straight forward and is relatively easy for a parser to check.

If someone uses an alternative scheme to do it, I think that should be allowed by the spec. It might not necessarily be implemented in all parsers, but that is something that can later be implemented if it turns out to be useful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add restrictions on 'clap' item property
3 participants