Decode itu35 closed captioning data #534
OMG. That is cool as hell. Replying here, rather than on original ticket, as requested. Even your quick and dirty is genuinely super-cool. I hardly expected a reply, let alone a proof of concept. [ And I am flattered that you have taken the time to reply. I'm genuinely blown away by
and while they are all worthy projects, Anyway, I digress. And enough flattery. Back on topic. I'm on Mac, and I have never compiled I have a very cool file with EIA-608 CC1/2/3/4 and EIA-708 Service 1/2, but without giving away what it is publicly, it is a reference file. Perhaps you could add me to a private repo and I can upload it to you, then you can nuke the repo? The file just can't leak out to the public 'net. Here's a mediainfo output samplesei.mediainfo.json.txt And here's the raw, untruncated json output from fq, extracted using:
I did run
and here is the dump, for what it is worth. I suspect that the first caption may not have much info in there. I have taken a look at your PR, and while I can't code myself, I believe that I can see what you have done. On your journey around various formats out there, you may find that ITU T.35 could end up useful as a generic lookup, irrelevant of Video or Closed Captions, but /formats/mpeg seems like a reasonable play-pen. I think your externalized naming convention is solid. I'll try to get a sample in all of:
Anyway, raw, untruncated json output from fq is attached, as well as a comparison with mediainfo. Meanwhile, I'll try to learn how to compile |
👍
Thanks for the kind words 😊 it's for me a very interesting and challenging (also frustrating at times :) ) project to work on and it has also been what i used to learn about things that i need for work and for hobby projects.
I think you summarized quite well part of the reason fq exists, the other part is just that i'm interested in programming languages (jq especially!) and all kinds of text and binary formats and encodings. Also i work as a software engineer in a team doing ingestion and transcoding services at a quite big media streaming service originating from sweden (starts with an s), so we have to deal with lots of broken, problematic and "challenging" media files all day long. So fq is extremely useful :) we of course use ffprobe, mediainfo and lots of other tools also when debugging and trying to understand things, but as you say it's very useful to see exact details. Also with fq you can quite quickly "re-implement" things in jq that ffmpeg does just to verify you understand or see how some of its heuristics work. You can see some snippets on the wiki https://github.com/wader/fq/wiki. Have some vague idea to create a mp4.jq-project with various isobmff, dash etc specific things.
😁
Once you have golang installed (maybe also git?) it should be more or less just one command to build and install your own version.
Could you put it on google drive etc and share a link privately? my email is [email protected]
Don't think there is, i guess the most standard would be to produce an annexb stream with only SEI data? something like this might work:
avc_annexb decodes to an array with the sync header separate so i use chunk(2) to create an array with [[sync,nalu], ...] pairs. fq 0.1.0 unfortunately uses truncated base64 strings for binary values (JSON can't safely encode binary data :( ), in the next release it will be changed to just a non-truncated string (still not safe). But! you can add
Will have a look
Yeah i try to keep dividing things into separate formats as much as possible, without making it absurd, as they become reusable and also possible to use with -d or as jq functions. I'm a bit confused about the relationship between all these standard organizations... half the work is sometimes just to understand all the different aliases and find where in the spec the good stuff is.
Would be great. Ok to add them or part of them as test files to the fq repo?
👍 let me know how it goes btw there are some presentations (with video and slides) about fq that might be useful and also shows how i work with https://github.com/wader/fq#presentations |
I used Google Drive to share a file to your gmail, containing both H.262/mpeg2 (A/53 picture user data) and H.264/avc (SCTE-128 sei side data). Obviously, I'm more interested in H.264, but it will all make sense when you see the files.
I do not think you are alone. Following on from your PR, I did try to do some research on trying to find an authoritative reference for the US scoped So I did promise that I would compile/install and try to replicate. Install was as easy as you said...
$ brew install golang
$ GOPROXY=direct go install github.com/wader/fq@sei-itu-t35
$ "$(go env GOPATH)"/bin/fq -v
Ok, this is really cool... I'm now gonna run your code against this same very simple file containing just 608-CC1
OMG... That is cool. Lets try something fancy with a
That gives...
You can almost see the magic text
In terms of the decode table, simple 608 captions are always encoded as 2 byte words, so you could save some vertical real estate by putting
You could do:
That is so cool - and while this file contains only 608 CC1 data in the DTVCC payload (the 708 Service1 format is more complex than the legacy 608 format). I do need to learn how to do 7 bit parity with jq and work out how to group stuff together in 2-byte words when I'm using But congrats, man!!! You just wrote a EIA-608 CC1 closed captioning analyzer, at least for a simple CC1 file!!! I sent you that really complex file containing CC1/2/3/4 and Service1/2. One suggestion... Now I see it with my own eyes, I think I'm still in awe of your work. I'll not bug you for more unless you want, but if you want to take this even further into the depths of decoding 608 to text, I would be more than happy to help you, since you have now given me a really cool tool. I want to be respectful of your voluntary and generous time. But you can now tell your Swedish bosses that your awesome tool could now be used for debugging 608 data in H.264 in HLS and DASH, once the annex_b is extracted with FFmpeg. My mind is blown. tak, tak, tak! |
Thanks, not so great was that i had a typo in the email, should be [email protected] (now double checked), sorry about that.
Ok thanks, will have look also. I usually try to make the symbolic value lowercase and snake_case (the thing called
🥳 (btw i'm working on a mpeg_ts decoder but it's very rough currently, can see it here https://github.com/wader/fq/tree/mpeg_ts_wip, build with @mpeg_ts_wip if you want to try)
Aha will have look at those, think i've seen this before and wondered how it works. Try this:
Also maybe good to know that
Currently not possible to do that but we will figure something out. I'm thinking if this will produce a huge decode tree for "real" files maybe it should be optional or maybe there should be an option for how to decode the "cc" data.
See use of With some code to keep track of pts for the samples one could nearly write them out as SRT or something :)
Ah yes will have a look and clean that up, thanks
Same for me, i'm in awe of what jq can do and how damn well it seems to fit, i knew it was nice... but this nice? and the combination of doing the bit-stream decoding in go and using jq for the more flexible and fancy stuff is very nice.
I'm up for it, maybe open a new issue if you want to dump some specs and ideas.
No problem, glad someone else is as passionate to understand these things as me :) Maybe i will answer a bit more sporadically the coming week(s), heading home to parents and whatnot... but i usually end up coding anyway. |
Got a little curious how 608 works:
The wikipedia article about it seems quite good https://en.wikipedia.org/wiki/EIA-608 but would be nice to get hands on the spec. As i read it 0x14 is load into CC1 (caption channel 1?), so load "/" then load "/", ....., but then 0x11 comes which seem to mean load following bytes as caption text until... something, next command byte? once we figure this i guess it would not be that hard (might regret this) to write some basic 608 decode in jq.. possibly could have it in go also, will see. |
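As a rough illustration of the parity and 2-byte-word questions raised above, here is a minimal Python sketch (not fq or jq code). It assumes each EIA-608 byte carries 7 data bits plus an odd-parity bit in bit 7, and that a word whose first 7-bit byte falls in 0x10..0x1F is a control code while values from 0x20 upward are plain text. The function names are made up for this example, and special or extended characters are not handled.

```python
def has_odd_parity(byte: int) -> bool:
    """True when the 8-bit value has an odd number of set bits."""
    return bin(byte & 0xFF).count("1") % 2 == 1


def strip_parity(byte: int) -> int:
    """Drop the parity bit (bit 7), leaving the 7-bit value."""
    return byte & 0x7F


def words_from_hex(text: str):
    """Turn '9420 9140 ...' into a list of (first_byte, second_byte) tuples."""
    return [(int(w[0:2], 16), int(w[2:4], 16)) for w in text.split()]


def decode_words(words):
    """Classify each 2-byte word as control code, text, or parity error.

    Assumption: first byte 0x10..0x1F (after parity strip) means control
    code; anything else >= 0x20 is treated as a printable character.
    """
    out = []
    for first, second in words:
        if not (has_odd_parity(first) and has_odd_parity(second)):
            out.append(("parity_error", f"{first:02x}{second:02x}"))
            continue
        a, b = strip_parity(first), strip_parity(second)
        if 0x10 <= a <= 0x1F:
            out.append(("control", f"{a:02x}{b:02x}"))
        else:
            text = "".join(chr(c) for c in (a, b) if c >= 0x20)
            if text:
                out.append(("text", text))
    return out


# Words taken from the SCC-style dump that appears later in this thread.
sample = ("9420 9140 5468 e973 20e9 7320 6120 e361 "
          "70f4 e9ef 6e20 e96e 2043 4331 942f")
decoded = decode_words(words_from_hex(sample))
print("".join(value for kind, value in decoded if kind == "text"))
# This is a caption in CC1
```

Keeping the control words as raw hex rather than naming them is deliberate: mapping them to command names would need the actual spec tables.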
Fixed. I updated the user_data_type_code mapping to show what i mean by symbolic and description. Now you for example can do |
Hey, did you send a new email with link to the file? Haven't seen anything yet |
I have re-shared the link.
I had a think about this, and while SRT is a very readable format, it is then starting to get into the domain of conversion, rather than pure analysis. (A generic PTS > SMPTE timecode decode module could be cool for all kinds of formats though, not just packets, but timecode can be tricky). From a DTVCC / SCTE-128 / EIA-608 perspective, the file format that is used to represent EIA-608 is SCC, which looks like:
Anyone who deals with US Closed Captions will recognize this format of 2 byte words. There are three projects that extract 608>SCC data:
The challenge is that all of these have a habit of also interpreting the data during conversion from 608>SCC, and all three give different results on the same source. They all have a developer-focused debug mode, which is a little more absolute, but their end-user facing conversion is aimed at producing a usable SCC file, rather than displaying what is in there. And this was why I reached out for the SEI T.35 stuff in
But I thought I would share what the common SCC syntax looks like, because media professionals will recognize the 2-byte words format - and that may help at presentation level. Now that I have shared that reference file in google drive (sorry about that, totally missed it), you'll see how complex it gets with CC1/2/3/4 and Service1/2, all blended in together. That is kinda like a "worst case scenario" file. It is worth keeping as a reference, but will also help determine how best to present info to a user. The dream would be a filter that could display cc1/2/3/4 using My initial requirement was that I was encoding 608 data with libcaption, and they were not getting identified by mediainfo and were not getting displayed by VLC, but I knew they were in there because mpv could play 'em. So I wanted to see what libcaption was doing differently, hence the request for the format. It was a perfect example of the need for a ground-truth SEI decoder. (The conclusion is that libcaption is not perfect, mediainfo was looking for a particular data rate and vlc is vlc). While I have been shouting the virtues of |
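On the PTS to SMPTE timecode idea mentioned above, the simplest case can be sketched in Python as below. This assumes a 90 kHz PTS clock (as used in MPEG transport streams), an integer frame rate and non-drop-frame counting. Drop-frame timecode at 29.97 fps and streams whose PTS does not start at zero need extra handling that is not shown, which is part of why timecode is tricky. The function name is invented for this example.

```python
def pts_to_timecode(pts: int, fps: int = 25, clock_hz: int = 90_000) -> str:
    """Convert a PTS in clock ticks to non-drop-frame HH:MM:SS:FF.

    Assumptions: integer frame rate, PTS counted from zero, no wraparound.
    """
    total_frames = pts * fps // clock_hz   # whole frames elapsed
    frames = total_frames % fps            # frame number within the second
    total_seconds = total_frames // fps
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"


# One hour plus 12 frames at 25 fps (one frame is 3600 ticks at 90 kHz).
print(pts_to_timecode(90_000 * 3600 + 3600 * 12))
# 01:00:00:12
```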
I see, and looking at the EIA-608 spec it seems a bit more complicated than i expected, but we will see. I usually try to make the go decoder "present" the formats in as neutral a way as possible and you can use jq to massage things into other formats, ex something like this will take the cc bytes and produce the hex byte pairs format above (ignoring how the timestamp would be extracted):
$ ffmpeg -loglevel warning -hide_banner -i ~/Downloads/testsrc.with608captions.ts -map '0:v:0' -codec:v 'copy' -bsf:v 'h264_mp4toannexb' -f 'h264' 'pipe:1' | go run . -r --decode avc_annexb 'grep_by(.nal_unit_type=="sei" and .sei.payload_type=="user_data_registered_itu_t_t35") | .sei.data.user_structure.user_data_type_structure.cc | map(.cc_data_1, .cc_data_2) | tobytes | to_hex | chunk(4) | join(" ") as $pairs | "01:02:53:14 \($pairs)\n"'
01:02:53:14 942f 942f 94ae 94ae 942c 942c
01:02:53:14 94ae 9420 9140 5468 e973 20e9 7320 6120 e361 70f4 e9ef 6e20 e96e 2043 4331 942f
01:02:53:14 942f 942f 94ae 94ae 942c 942c
01:02:53:14 94ae 9420 9140 c120 73e5 e3ef 6e64 20e3 6170 f4e9 ef6e 20e9 6e20 4343 31ae 942f
01:02:53:14 942f 942f 94ae 94ae 942c 942c
So in some future one could put such function(s) in a scc.jq file etc and do
The files will be very useful and we will see what makes sense to do as a go decoder or as jq programs.
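For readers who do not use jq, the pairing step at the end of the pipeline above (bytes to hex, split into 4-character chunks, join with spaces) can be sketched in Python as follows. This is only an illustration of the idea, not how fq implements `chunk` or `to_hex`. The helper names are invented here, and the timecode is a fixed placeholder just like in the thread.

```python
def chunk(seq, size):
    """Split a sequence into consecutive pieces of at most `size` items."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def hex_words(cc_bytes: bytes) -> str:
    """Render caption bytes as space-separated 2-byte hex words."""
    return " ".join(chunk(cc_bytes.hex(), 4))


def scc_like_line(timecode: str, cc_bytes: bytes) -> str:
    """Prefix the hex words with a timecode (placeholder, not derived from PTS)."""
    return f"{timecode} {hex_words(cc_bytes)}"


# Bytes matching the first output line shown in the thread.
cc = bytes.fromhex("942f942f94ae94ae942c942c")
print(scc_like_line("01:02:53:14", cc))
# 01:02:53:14 942f 942f 94ae 94ae 942c 942c
```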
This was also one of the reasons i really wanted jq (and JSON) as i've noticed that i wrote lots of scripts to turn different debug outputs from other tools into JSON and then used jq to combine and query. BTW fq has a
# -n tells fq to not automatically read/decode input files, will have to use input/inputs explicitly
# define a function f that finds the sps (assumes there is only one)
# use diff with input|f as both arguments, reads/decodes next file and finds sps (note ";" is the argument separator in jq)
# object with a/b value is shown in the structure where things differ
fq -n 'def f: grep_by(format=="avc_sps"); diff(input|f;input|f)' file1.mp4 file2.mp4
{
"frame_cropping_flag": {
"a": true,
"b": false
},
"level_idc": {
"a": "3",
"b": "3.1"
}
} |
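The shape of that output (an a/b pair only where the two structures differ) can be mimicked with a small recursive function. The Python sketch below is an assumption about the general idea, not fq's actual `diff` implementation: it treats lists and scalars as leaves and represents a missing key as `None`. The `profile_idc` value in the sample is made up for illustration.

```python
import json


def diff(a, b):
    """Return None when equal, otherwise a structure holding only differences.

    Dicts are compared key by key; everything else is compared as a whole
    and reported as {"a": ..., "b": ...}.
    """
    if a == b:
        return None
    if isinstance(a, dict) and isinstance(b, dict):
        out = {}
        for key in sorted(set(a) | set(b)):
            d = diff(a.get(key), b.get(key))  # missing keys show up as None
            if d is not None:
                out[key] = d
        return out
    return {"a": a, "b": b}


# Hypothetical, heavily trimmed SPS-like objects.
sps_a = {"frame_cropping_flag": True, "level_idc": "3", "profile_idc": "high"}
sps_b = {"frame_cropping_flag": False, "level_idc": "3.1", "profile_idc": "high"}
print(json.dumps(diff(sps_a, sps_b), indent=2))
```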
That command is genius, irrelevant of the timecode. Thank-you. It does exactly what I need - display the words in a format that I can use to validate the output of ccextractor, FFmpeg and caption-inspector to arbitrate what was actually transmitted in the 608. Wow.
And that diff is pretty cool. You have created an awesome tool for EIA-608 debugging, irrelevant of the timecode. I had to use tohex rather than to_hex, but that was irrelevant. You did it! I genuinely think that you have created a unique tool that allows a user to validate the output of the other 608 caption tools. I'll leave you alone and I'll go play with my captions files. Thank-you, sir.
Great! what is ${infile} in this case mp4 or ts? if it's mp4 i think you can already decode directly with fq if you want to skip going thru ffmpeg, maybe something like this:
or maybe do
Aha sorry about that,
No problem, happy playing around! let me know how it goes and feel free to ask jq questions also, want to spread knowledge about it. And i do have lots to thank jq (and gojq that fq uses a modified version of) for making all this possible, i sometimes feel like i nearly accidentally happened to make it fit together with a bit stream decoder... but yeah it was quite a lot of work and thinking to make it happen :)
Hehe it's a good summary of what fq is about :) and i hope more ppl will find it useful, and i think use cases like yours show very well what it's capable of. |
@bbgdzxng1 sorry the progress on this stalled, got stuck in other things. hope i will get back to this and mpeg ts, but i will try keep this PR rebased on master from time to time |
@wader. Mattias - You were kind enough to help me 18 months ago with this branch. I just wanted to thank you again - I've been using fq pretty much every few weeks as the need arises. The work that you did on this branch was super-human, and having been given your guidance, I have found main-branch fq to be sufficient when inspecting annexb files containing SEI side data. With some of your tricks and commands, you have steered me in the right direction with my limited needs and limited skills. Of course, if you ever do feel the desire to add this DTVCC-parsing branch to mainstream, that would not be unwelcome, but I appreciate that you may not want branches like this hanging around indefinitely. If so, feel free to archive - I have been able to get along sufficiently for my needs with each new stable release of fq. If you ever want to get back to H.264 and media files, you know where to find me. It is great to see the fq project getting stronger and being recognized for the powerful tool that it is. I hope you remain well, Mattias. |
Nice to hear and nice to hear from you!
I would love to get parts of or all of these WIP branches merged somehow. Maybe the sei-itu-t35 stuff could be merged after some small polishing? the mpeg ts stuff is a bit more complex but maybe that could also be split into some more mergeable parts? think i got a bit stuck on how to model things and also how to handle corrupt streams. BTW i just rebased both branches on top of master, seems to work fine.
Same! |