Archive for August, 2015

Random Thoughts on Format Design Process

Wednesday, August 12th, 2015

From my experience a lot of codecs have some wrong things in them and those things are usually introduced during codec creation. As for containers, I’ve expressed my opinion before.

It is very bad when some codec is being developed and then suddenly it’s declared released. You’re left with a pile of code that has somehow evolved into current shape and probably even the author has no idea how it works. Two examples — Snow and Speex. The first one is wavelet-based codec that performed quite well back in DiVX 3/4 days, the other one is a speech codec that also gained some popularity and was even included as one of Flash audio codecs. So the codecs by themselves should not be that bad but there’s only one implementation and no specification. There were several attempts to make Snow developer write a specification for it (for money!) but he always refused. FFV1 is faring somewhat better since it has some rudimentary specification and hopefully standardisation efforts will bring us independent implementations and full specification (yes, I’m an idiot optimist). What would be a proper way to design a codec in my opinion? Create test version, play with it till you achieve good result or release a known beta, write specification, throw away old code and reimplement version 1 from scratch. Repeat for version 2 etc.

I think I’ve complained before that this situation is very common with proprietary codecs too. They have inhouse encoder and decoder implementation with encoder bugs compensated in the decoder. Stupid motion compensation in RealVideo 4 is one of those “features”. Or pre-RTM WMV9 with its block pattern coding though it’s supposed to be beta anyway.

There is even worse case — when codec author decides to embed all development history into decoder maintaining backward compatibility. The worst offender is Monkey Audio with its subtle bitstream changes at every version and having two dozen versions. Another “good” example is HEVC with its ever-changing bitstream format. Different major versions of reference software introduce serious bitstream changes, like HM8 -> HM10 transition remapped all NAL IDs. IIRC superseded version of ITU H.265 was for 4:2:0 subsampling only. Honestly, I shan’t cry if this codec dies because of idiotic licensing terms (and maybe it should really be contained only in FLV). Speaking of HEVC idiocies, VP9 got new features in new profiles including 4:4:0 subsampling. In my opinion one should kill this creeping featurism especially if you don’t have proper profiling/versioning system and even them introduce new features sparingly.

At least there’s still hope for Daala to be developed properly.

NihAV — Guidelines

Saturday, August 8th, 2015

The weather here remains hellish so there’s little I can do besides suffering from it.

And yet I’ve spent two hours this morning to implement the main part of NihAV — set implementation. The other crucial part is options handling but I’ll postpone it for later since I can write proof of concept code without it.

Here’s a list of NihAV design guidelines I plan to follow:

  • Naming: component functions should be called na_component_create(), na_component_destroy(), na_component_do_something().
  • More generally, prefixing: public functions are prefixed with na_, public headers start with na as well e.g. libnadata/naset.h. The rest is private to the library. Even other NihAV libraries should use only public interfaces, otherwise you get ff_something and avpriv_that called from outside and in result you have MPlayer.
  • NihAV-specific error codes to be used everywhere. Having AVERROR(EWHATEVER) and AVERROR_WHATEVER together is ridiculous. Especially when you have to deal with some error codes being missing from some platform and other being nonportable (what if on nihOS they decided to map ENOMEM to 42? Can you trust error code returned by service run on some remote platform then?).

And here’s how actual set interface looks like:
[sourcecode language=”c”]
#ifndef NA_DATA_SET_H
#define NA_DATA_SET_H

#include “nacommon.h”

struct NASet *na_set_create(NALibrary *lib);
void na_set_destroy(struct NASet *set);
int na_set_add(struct NASet *set, const char *key, void *data);
void* na_set_get(struct NASet *set, const char *key);
void na_set_del(struct NASet *set, const char *key);

struct NASetIterator;

typedef struct NASetEntry {
const char *key;
void *data;
} NASetEntry;

struct NASetIterator* na_set_iterator_get(struct NASet *set);
void na_set_iterator_destroy(struct NASetIterator* it);
int na_set_iterator_next(struct NASetIterator* it, NASetEntry *entry);
void na_set_iterator_reset(struct NASetIterator* it);

#endif
[/sourcecode]

As you can see, it’s nothing special, just basic set (well, it’s really dictionary but NIH terminology applies to this project) manipulation functions plus an iterator to scan through it — quite useful for e.g. showing all options or invoking all registered parsers. Implementation wise it’s simple hash table with djb2 hash.