A brief context: I watch videos from BaidUTube (name slightly altered just because) and my preferred way to do that is to grab video files with youtube-dl in 720p quality so I can watch them later at my leisure, in the way I like (i.e. without a browser), and re-watch them even if they get taken down. It works fine but in recent weeks I’ve noticed that some of the downloaded videos are unplayable. Of course this can be fixed by downloading the video again in a slightly different form (separate video and audio streams muxed locally, youtube-dl can do that) but today I was annoyed enough to look at the problem.
In case it’s not obvious, I’m talking about MP4 files encoded and muxed at BaidUTube, without any modifications by youtube-dl, which merely downloaded them. So, what’s the problem?
Essentially an MP4 file contains a header with metadata telling at which offset and of which size the frames for each track are, and the actual data is stored in the mdat atom. Not here. First you have lots of 12-byte sequences of the form 90 00 00 00 00 0X XX XX 00 02 XX XX, then a moof atom (used in fragmented MP4) and then another mdat. And another. I’ve tried to avoid streaming stuff but even to me it looks like somebody put all the fragments prepared for HLS streaming into a single MP4 file, making an unplayable mess.
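If you want to check a downloaded file for this without opening a hex editor, here is a minimal sketch in Python that walks the top-level atoms and reports whether any of them is moof. It assumes the usual MP4 atom layout (a 32-bit big-endian size followed by a four-character type, with size 1 meaning a 64-bit size follows and size 0 meaning "until end of file"). The function names and the make_atom helper are mine and purely illustrative, and this is a rough detector, not a repair tool.

```python
import struct

def list_top_level_atoms(data: bytes):
    """Return (type, offset, size) for each top-level atom in an MP4 byte string."""
    atoms = []
    pos = 0
    while pos + 8 <= len(data):
        size, kind = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A 64-bit size is stored right after the type field.
            if pos + 16 > len(data):
                break
            size = struct.unpack_from(">Q", data, pos + 8)[0]
            header = 16
        elif size == 0:
            # The atom extends to the end of the file.
            size = len(data) - pos
        if size < header:
            break  # not a sane atom header; stop instead of looping forever
        atoms.append((kind.decode("latin-1"), pos, size))
        pos += size
    return atoms

def looks_fragmented(data: bytes) -> bool:
    """True if any top-level atom is 'moof', i.e. the file uses MP4 fragments."""
    return any(kind == "moof" for kind, _, _ in list_top_level_atoms(data))

def make_atom(kind: bytes, payload: bytes = b"") -> bytes:
    """Hypothetical helper: build a tiny atom, only for trying the code out."""
    return struct.pack(">I4s", 8 + len(payload), kind) + payload
```

For a real file you would read it with open(path, "rb").read() and pass the bytes in (or adapt the loop to seek instead of loading everything into memory). Seeing moof at the top level only says the file is fragmented; whether a particular player copes with that is a separate question.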
Overall this happens only on a few random videos and probably most browsers would not pick this format (since VP9 or VP10 in WebMKV is the suggested one) so I don’t expect it to be fixed. My theory is that they decided to roll out a new version of the encoding software with a broken muxer library or muxing mode. And if you ask “What were they thinking? You should run at least some tests to see if it encodes properly.”, one wise guy has an answer for you: they weren’t thinking about that, they were thinking about how long it is until the lunch break and then until it’s time to go home. This is the state of enterprise software and I have no reason to believe the situation will ever improve.
And there’s a fact that may be related to it. Random files starting from maybe 2019 also show the marker “x264 - core 155 r2901 7d0ff22” in the encoded frames while most of the files have no markers at all. While I don’t think they violate the license, it still looks strange that a company known for not admitting that it uses open-source projects (“for their own protection” as it was explained once) lets such a marker slip through.
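Looking for that marker is easy to script. The sketch below is a crude byte search, assuming the encoder writes the string as plain ASCII with ordinary hyphens, in the form "x264 - core", then the build number, then an optional revision and commit hash. The regular expression and the function name are my own guesses at a reasonable pattern, not anything taken from x264 itself.

```python
import re

# Assumed marker layout: "x264 - core <build> [r<revision>] [<short hash>]".
X264_MARKER = re.compile(rb"x264 - core (\d+)(?: r(\d+))?(?: ([0-9a-f]+))?")

def find_x264_marker(data: bytes):
    """Return the first x264 version marker found in the data, or None."""
    match = X264_MARKER.search(data)
    return match.group(0).decode("ascii") if match else None
```

Calling find_x264_marker(open(path, "rb").read()) on a downloaded file returns the marker text or None. It does not parse the bitstream at all, so a hit only means those bytes occur somewhere in the file.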
Well, that was an even more useless rant than usual.