Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Insufficient specification for CodecDelay and Cues #849

Open
mjbshaw opened this issue Oct 15, 2024 · 3 comments · May be fixed by #946
Open

Insufficient specification for CodecDelay and Cues #849

mjbshaw opened this issue Oct 15, 2024 · 3 comments · May be fixed by #946
Labels
clarifications errata-rfc9559 spec_main Main Matroska spec document target

Comments

@mjbshaw
Copy link
Contributor

mjbshaw commented Oct 15, 2024

CodecDelay states "This timestamp value MUST be subtracted from each frame timestamp in order to get the timestamp that will be actually played."

CueTime just says "Absolute timestamp of the seek point." But "absolute timestamp" can be interpreted in different ways. The interaction of these two elements is unclear and underspecified. Questions:

  1. Should CueTime's absolute timestamp value be Cluster.Timestamp + Block.Timestamp, or should it be Cluster.Timestamp + Block.Timestamp - CodecDelay? If the latter, should it also account for any potential DiscardPadding in a block group?
  2. Should the duration of the cue account for CodecDelay? If so, should it also account for any potential DiscardPadding in a block group?

In my experience cues have not accounted for CodecDelay or any DiscardPadding. Assuming this is correct, it may also be worth calling out that applications should account for CodecDelay (and DiscardPadding if feasible, though that's harder) when using CueTimes because it's pretty easy to accidentally skip it.

For example, let's say an Opus track is saved with 5 segments with each being 10 seconds long (ignoring any codec delay or padding) with CueTimes of 0s, 10s, 20s, 30s, and 40s. It would be technically incorrect for the application to blindly take these CueTimes and use them or present them to the user. The actual media times for these segments are be [0, 9.9935] for the first segment, [9.9935, 19.9935] for the second segment, etc. until [39.9935, 49.9935] for the final segment.

@robUx4 robUx4 added clarifications spec_main Main Matroska spec document target labels Oct 21, 2024
@robUx4
Copy link
Contributor

robUx4 commented Oct 21, 2024

Indeed, CueTime is underspecified. I'm not sure which one it should be. CodecDelay is very specific to Opus, as other codecs don't bleed their internals like that. It would be interresting to check how common muxers (mkvmerge and libavformat) generate the CueTime values. Because anything we may theorize here will end up having to be (backward) compatible with them. Hopefully they work the same way...

Without looking at the code my guess is that these codes were designed before CodecDelay was added and they don't account for it. Same thing for DiscardPadding.
There's also the possibility they don't generate Cues for audio tracks or maybe once per Cluster for audio-only files.
If we decide that the CodecDelay should be in the CueTime when it is not, some players may fail to look for a Block with the timestamp given by the CuePoint.

Also CodecDelay is there because if you want to play the audio at time t you need to start decoding the audio stream at t - CodecDelay. In Matroska we store the value of t in the Block. To which CodecDelay will be substracted on playback.

The proper seeking would be to a Block with a timestamp of t - CodecDelay. But there might not be such a Block because the duration of (Opus) Blocks/frames may not be a multiple of CodecDelay. So the seeking method needs to be aware of the CodecDelay to seek to the appropriate Block (at t - CodecDelay - some extra time). And because of that we don't need to store the CodecDelay in the CueTime.

So I think we reached the same conclusion:

In my experience cues have not accounted for CodecDelay or any DiscardPadding. Assuming this is correct, it may also be worth calling out that applications should account for CodecDelay (and DiscardPadding if feasible, though that's harder) when using CueTimes because it's pretty easy to accidentally skip it.

The same thing happens with SeekPreRoll which, as I understand it, is like a CodecDelay but in the opposite direction. Given the name it also needs to be taken care of when seeking.

We should probably add a paragraph in the Cues section to mention how CodecDelay, SeekPreRoll and DiscardPadding might influence the seeking.

@robUx4
Copy link
Contributor

robUx4 commented Jan 26, 2025

Looking at existing code:

  • VLC doesn't support CodecDelay or DiscardPadding
  • mkvtoolnix doesn't use CodecDelay or DiscardPadding (just copy on the output if they exist)
  • libavformat supports CodecDelay and removes the value from all timestamps of the track, it is removed from the cue values as done in this commit which is added through mkv_add_cuepoint(). But the timestamp where ts_offset is added the value was subtracted from the timestamp when reading the block. So it acts as if the CodecDelay didn't exist when writing the Cue entry (cc @mkver). It also doesn't use the CodecDelay when reading the Cue entries.

So it seems we all agree that it should not be used in Cues. Now we have to make proper text to make it explicit.

@mbunkus
Copy link
Contributor

mbunkus commented Jan 26, 2025

mkvtoolnix doesn't use CodecDelay or DiscardPadding (just copy on the output if they exist)

It'll copy existing values from Matroska files but also sets them for Opus tracks. There are bug reports that request I also use them for other codecs (mostly DiscardPadding).

robUx4 added a commit to robUx4/matroska-specification that referenced this issue Jan 26, 2025
No CodecDelay, DiscardPadding or SeekPreRoll.

Fixes ietf-wg-cellar#849

Once merged an errata should be sent to https://www.rfc-editor.org/errata.php#reportnew
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clarifications errata-rfc9559 spec_main Main Matroska spec document target
Projects
Status: No status
Development

Successfully merging a pull request may close this issue.

3 participants