Since I feel I’ve done enough (for the moment) on na_game_tool, I decided to work on the rather (always) neglected NihAV, and specifically on the nihav-encoder tool.
As you can guess from the name, it is a tool for re-encoding input file(s) into something else (or something the same). Since I don’t have much need to transcode anything, I use it occasionally to test encoders (especially for collecting information by running encoding with different parameters on a selected sample set) or a muxer. Essentially, as long as the video was encoded correctly that was enough; audio de-sync or even drop-outs (because not all encoded packets were actually muxed) were another matter. Finally I’ve decided to improve the situation and make the tool more usable.
First of all, I’ve introduced muxer quirks, a set of muxer peculiarities that may affect what a muxer expects of its input. Currently there are three such quirks defined: constant framerate mode (i.e. the muxer expects frames of each stream to have the same duration, which is common for old formats), fixed duration (i.e. the muxer should know the output duration beforehand because it e.g. stores a seek table for each frame before the actual data starts) and unsynchronised mode (i.e. while AVI would normally expect streams to be synchronised, more complex formats like MOV or WebMKV may group stream data into larger clusters; and the null muxer does not need to care about synchronisation at all). Maybe it makes sense to add more quirks in the future, but for now I can’t think of any.
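To illustrate the idea, here is a minimal sketch of how such quirks could be represented as a bitmask; the type and flag names are hypothetical and not NihAV’s actual API:

```rust
// Hypothetical representation of muxer quirks as a bitmask (not NihAV's real code).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct MuxerQuirks(u32);

impl MuxerQuirks {
    pub const NONE: Self = Self(0);
    /// Muxer expects every frame of a stream to have the same duration.
    pub const CONST_FRAMERATE: Self = Self(1);
    /// Muxer must know the total duration before any data is written.
    pub const FIXED_DURATION: Self = Self(2);
    /// Muxer does not require interleaved/synchronised input streams.
    pub const UNSYNC: Self = Self(4);

    pub fn has(self, flag: Self) -> bool {
        self.0 & flag.0 != 0
    }
    pub fn with(self, flag: Self) -> Self {
        Self(self.0 | flag.0)
    }
}

fn main() {
    // e.g. a Bink muxer could report both constant framerate and fixed duration
    let quirks = MuxerQuirks::NONE
        .with(MuxerQuirks::CONST_FRAMERATE)
        .with(MuxerQuirks::FIXED_DURATION);
    assert!(quirks.has(MuxerQuirks::CONST_FRAMERATE));
    assert!(quirks.has(MuxerQuirks::FIXED_DURATION));
    assert!(!quirks.has(MuxerQuirks::UNSYNC));
}
```

The encoder tool can then query the reported quirks and adapt its behaviour per stream instead of requiring command-line switches.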
This helped me to make some things in the encoder tool easier. For instance, I don’t have to pass --calc-len for Bink-b encoding any more, as the muxer reports that quirk and the duration is calculated automatically (in the simplest way possible, which can’t work properly with variable framerate input, but I’ll fix that when I really need it). More importantly, the encoder tool learns about the constant framerate requirement from the muxer and converts the input material’s framerate accordingly. Additionally, audio formats that allow different block lengths (like PCM or various ADPCM formats) can now be told to set a block length corresponding to the video framerate.
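The “simplest way possible” duration estimate can be sketched like this (a hypothetical helper, valid only under the constant-framerate assumption mentioned above, which is exactly why it breaks on variable framerate input):

```rust
// Hypothetical sketch: total duration for a fixed-duration muxer, assuming
// every frame lasts exactly frame_dur_num/frame_dur_den seconds.
fn duration_ms(num_frames: u64, frame_dur_num: u64, frame_dur_den: u64) -> u64 {
    // duration = number of frames * per-frame duration, in milliseconds
    num_frames * frame_dur_num * 1000 / frame_dur_den
}

fn main() {
    // 250 frames at 25 fps (frame duration 1/25 s) -> 10 seconds
    assert_eq!(duration_ms(250, 1, 25), 10_000);
    // 300 frames at 15 fps -> 20 seconds
    assert_eq!(duration_ms(300, 1, 15), 20_000);
}
```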
And that is the second improvement: proper (or at least better) synchronisation, especially with PCM audio. One of the nasty pitfalls of AVI is that you are normally expected to report just the sampling rate as the PCM timebase and send PCM packets of the same length (though the last one may be shorter). That is why I need to cut the audio into equal-sized blocks whose size corresponds to the video framerate; otherwise players will complain about a badly-interleaved AVI and audio will be screwed.
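The arithmetic behind this is simple; here is a sketch with hypothetical helper names (not the actual tool code), splitting PCM into per-video-frame blocks with a possibly shorter last block:

```rust
// Hypothetical helpers illustrating PCM blocking against the video framerate.

/// Number of audio samples covering one video frame
/// (a frame at fps_num/fps_den lasts fps_den/fps_num seconds).
fn samples_per_video_frame(sample_rate: u32, fps_num: u32, fps_den: u32) -> u32 {
    sample_rate * fps_den / fps_num
}

/// Split a total sample count into equal-sized blocks; the last one may be shorter.
fn split_pcm(total_samples: u32, block: u32) -> Vec<u32> {
    let mut blocks = Vec::new();
    let mut left = total_samples;
    while left > 0 {
        let cur = left.min(block);
        blocks.push(cur);
        left -= cur;
    }
    blocks
}

fn main() {
    // 44100 Hz audio against 25 fps video -> 1764 samples per block
    let spf = samples_per_video_frame(44100, 25, 1);
    assert_eq!(spf, 1764);
    // one second of audio plus a 100-sample tail -> 25 full blocks + 1 short one
    let blocks = split_pcm(44100 + 100, spf);
    assert_eq!(blocks.len(), 26);
    assert_eq!(*blocks.last().unwrap(), 100);
}
```

With such blocking every audio chunk lines up with one video frame, which keeps the AVI interleaving sane.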
Then there’s another quality-of-life improvement: encoding profiles. Mind you, those are not profiles for the codec but rather for the whole output. But this is not a conventional output profile like “encode to DVD format” either. I define those per muxer, so I can invoke e.g. nihav-encoder -i infile -o outfile.avi --profile lossless and it will encode video stream(s) with the ZMBV codec using settings for fast encoding while audio stream(s) will be encoded as PCM; the ms-lossy profile will encode video stream(s) using the MS Video 1 encoder and audio streams using the MS ADPCM 4 encoder. And theoretically a lossless profile for FLV would encode to Flash Screen Video plus PCM. And of course it has its usual weird defaults: instead of declaring a randomly-chosen default encoder in each muxer, nihav-encoder will try to encode the input into the expected output format if the muxer declares that it supports only a certain codec (like GIF or FLAC); if that fails it will try to copy the stream, and otherwise it will print an error and terminate, expecting me to provide the desired encoder (IMO that’s still better than encoding with a random codec using random parameters).
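The selection order just described can be sketched as a small decision function; this is a hypothetical simplification, not the real nihav-encoder code:

```rust
// Hypothetical sketch of the per-stream setup order: if the muxer demands a
// specific codec and we have an encoder for it, encode; failing that, try a
// stream copy; otherwise report an error and let the user pick an encoder.
#[derive(Debug, PartialEq)]
enum Action {
    Encode(&'static str),
    Copy,
    Fail,
}

fn pick_action(
    required_codec: Option<&'static str>, // codec the muxer insists on, if any
    have_encoder: bool,                   // do we have an encoder for it?
    can_copy: bool,                       // is the input codec accepted as-is?
) -> Action {
    match (required_codec, have_encoder, can_copy) {
        (Some(codec), true, _) => Action::Encode(codec),
        (_, _, true) => Action::Copy,
        _ => Action::Fail,
    }
}

fn main() {
    assert_eq!(pick_action(Some("gif"), true, false), Action::Encode("gif"));
    assert_eq!(pick_action(Some("flac"), false, true), Action::Copy);
    assert_eq!(pick_action(None, false, false), Action::Fail);
}
```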
Now all that is left to do (at least in the medium-term perspective) is to backport OpenDML support from the AVI muxer in na_game_tool. The two are far from identical, as NihAV has merely an AVI muxer while the AVI writer in na_game_tool not merely combines packets but forms output frames as well. Of course it’s easier there, since I control all the decoders and can make them output perfectly interleaved audio and video with proper durations (in NihAV, decoders simply output what they are fed, which may lead to e.g. audio packets of uneven length).
Anyway, here’s a question with which you, a hypothetical reader of this post, can help me. I’m looking for a reasonable output format combination for interchange purposes (i.e. a good combination of lossless or moderately lossy formats to have). With audio I can always resort to PCM, but what to use for video (and as a container)? I’d like to have various formats supported (paletted and 15- or 24-bit RGB is a must, YUV is optional but nice to have). Currently I have ZMBV in AVI (and RLE in AVI might work fine as well), but if somebody can suggest anything better I’m all ears. It’s just that most other containers can’t deal with palette changes (or, in the case of MOV, it can, but nothing nowadays supports video tracks with multiple sample descriptors), or they are so obscure that you’d have to search for a tool to convert them to something better anyway.
Suggestions are welcome.
MKV + FLAC (for audio) + SCPR (for RGB video), version 1 or 2.
I never got the time to write a ScreenPressor (the SCPR above) video encoder, and it has apparently become more or less open source by now.
That’s an interesting proposal. And if you switch the video codec to FFV1 you can even argue it’s fully open-source and covered by a specification. In general though I’d prefer something more widespread than SCPR and with a native pal8 mode.