Archive for the ‘Useless Rants’ Category

Some notes on codecs I’d like to RE but don’t have time to do so

Saturday, August 27th, 2011

There are some codecs that I’d like to RE (mostly for completeness’ sake) but I don’t have time for that.

Intel Audio Codec

This one seems to be a lot like its predecessor IMC (Intel Music Codec); it even codes coefficients the same way but with different codebooks. I’ve tried to hack the IMC decoder to make it use the proper tables but it still decodes garbage.

Along with an Indeo4 decoder it would make our Intel codec family complete, but unfortunately we have a decoder for neither.

ClearVideo

The codec that was present in AVI, QT and RealMedia. My investigations showed that it was a not-so-fractal codec: it still codes blocks with DCT and even does that in a simpler fashion than H.263. Though a patent assigned to Iterated Systems describes what may be the basis of this codec: a DCT-based codec that uses fractal search to determine the best code for the current block, or something like this. Maybe that’s the reason why there are no Huffman tables in the decoder while it obviously uses some.

RALF

That’s a rather special lossless codec that stands apart from other RealMedia codecs: the file format was altered for that codec (so far I’ve seen only standalone RALF files, not, say, RV40+RALF).

It looks like the codec is rather simple and employs context-dependent codes instead of generic ones. I remember finding about eight hundred static Huffman tables in the decoder for that purpose.

Also the codec developers were very grateful to their source of inspiration, that’s why the codec ID is “LSD:”.

WMA Lossless

Nothing much to say about it. As I remember, it uses infinite impulse response filters for compression and the least squares method for finding (and maybe updating) filter coefficients. It should not be that hard to RE but nobody has bothered so far.

M$ Screen 1 and 2 (aka WM Screen)

I’ve dabbled in REing MSS1, not MSS2 (which was later relabeled as WM Screen) but they should be related.

MSS1 was a rather simple screen codec based on classic arithmetic coding (with adaptive models too, IIRC) and binary partitioning. So the decoding process was simple: get the point for subframe division (horizontal and vertical) and the modes for decoding those partitions (fill, skip, subdivide).
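
The fill/skip/subdivide scheme can be pictured roughly as below. This is only a toy sketch and not the real MSS1 bitstream: the `symbols` iterator stands in for the arithmetic decoder, and the mode numbering, the order of values in the stream and the `decode_region` helper are all made up for illustration.

```python
FILL, SKIP, SPLIT_H, SPLIT_V = range(4)  # hypothetical mode numbering


def decode_region(symbols, frame, x, y, w, h):
    """Recursively decode one rectangle of the frame.

    `symbols` is an iterator standing in for the arithmetic decoder:
    every next() call returns the next already-decoded value.
    """
    mode = next(symbols)
    if mode == SKIP:
        return  # keep the pixels of the previous frame
    if mode == FILL:
        colour = next(symbols)
        for row in range(y, y + h):
            for col in range(x, x + w):
                frame[row][col] = colour
        return
    split = next(symbols)  # division point inside the rectangle
    if mode == SPLIT_H:  # horizontal cut: top part, then bottom part
        decode_region(symbols, frame, x, y, w, split)
        decode_region(symbols, frame, x, y + split, w, h - split)
    else:  # SPLIT_V, vertical cut: left part, then right part
        decode_region(symbols, frame, x, y, split, h)
        decode_region(symbols, frame, x + split, y, w - split, h)


def demo():
    frame = [[0] * 4 for _ in range(4)]
    stream = iter([
        SPLIT_V, 2,  # cut the 4x4 frame into left and right halves
        FILL, 7,     # left half: fill with colour 7
        SPLIT_H, 1,  # right half: cut off the top row
        SKIP,        # top row of the right half stays untouched
        FILL, 9,     # the rest of the right half: fill with colour 9
    ])
    decode_region(stream, frame, 0, 0, 4, 4)
    return frame
```

The point of the sketch is only that the whole frame is described by a short tree of decisions, which is why such a codec suits screen content with large flat areas.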

VoxWare MetaSound

This codec is obviously based on TwinVQ: it even has similar huge tables for different sample rates and bitrates, and I found almost the same header reading code.


In conclusion I want to say that if somebody wants to RE those codecs he’ll be more than welcome (especially for Apple ProRes but I don’t care about it much).

A bit about soft drinks

Saturday, June 11th, 2011

I’m a rather picky person, so I don’t drink alcohol, try to avoid drinking Coca-Cola or Pepsi and hate still water (especially the Danish one). So here’s my review of what I could drink in different countries.

Ukraine

There are many different soft drinks, mostly of mediocre quality, but some are quite good. There are some good mineral waters too (the best one is hard to find outside the region where I lived, sometimes it’s hard to find there as well).

And usually drinks are sold in all varieties of bottles, from 0.5l to 2l.

Germany

What you mostly get here is a very good selection of mineral water, Apfelschorle and variations of Spezi (aka tyska oriktiga Trocadero). There are also some strange flavours like cherry+ginseng or bitter lemon (tastes mostly like lemon skin). Oh, and the 0.5l bottles for those drinks made by the Coca-Cola company (it seems Germans prefer local inventions to its main product) look like they were designed by a Norwegian: the bottle is made from too thick plastic and maybe it was designed in a 3D CAD without Bezier curve support.

Switzerland

Mostly the same as in Germany but with more pathos and higher prices (while in Austria Apfelschorle is called something like “sprudel Apfelsaft”, in Switzerland I’ve seen “Shorley”). That goes for most Swiss products anyway.

Denmark

Those people seem to hate carbonated mineral water (usually Danish mineral water with gas has only two bubbles to distinguish it from still water) and the only time I tasted their drink it was too sugary.

Belgium

Looks like it’s better to buy orange juice there instead.

Finland

Very good selection of drinks.

Norway

Limited but rather decent selection of drinks. The bottles look like they were made from a single piece of plastic, mostly with an axe.

Sweden

One of the reasons I love Sweden. Excellent selection of drinks, including special seasonal ones (Julmust and Påskmust). Here’s an example of Påskmust:

img_7247

And of course, there’s the ultimate drink (IMO):
The Trollcadero

There are about eight different breweries producing it; I have tried them all except for two.
And I have tried all but one of the soft drinks from Vasa Bryggeri. Probably I should go to Norrland again.

As for mineral water, they have Ramlösa, good water from Bergslagen region and even from the tap in many regions (it’s drinkable everywhere in Sweden, but tastes especially good in some places).

What happened to FFmpeg

Wednesday, March 30th, 2011

This is my view of what happened but I’ll try to remain objective.

A bit about me (in the very unlikely case you don’t know already and care). I learned about FFmpeg in 2004 or so, when I just saw it along with other packages in a Mandrake release. For several months I downloaded source snapshots at a cybercafe (even dial-up was impossible then). I had a long-standing interest in general data compression methods and some interest in codecs sparked by XAnim and a desire to play M$ ADPCM files on Linux, and FFmpeg got new decoders every week or so (mostly for packed YUV formats but nevertheless quite useful).

One day I tried to reverse-engineer some codec (just for fun), looked at a sample produced by TechSmith Camtasia and realized that it’s packed with zlib, and after some time guessed correctly that they use M$ RLE. In order to test it I wrote a decoder and hacked it into FFmpeg. Eventually it worked and I sent my decoder to Mike Melanson. On the 14th of August 2004 it was committed to the FFmpeg codebase and it made me proud of my work for a week (those were the times!). After another decoder or two I learned about the ffmpeg-devel mailing list and started to read it (it was on SourceForget then). I think I started submitting my patches there with the Indeo2 decoder or so.

After a while I was offered CVS commit access, which I refused because of technical limitations. Finally in March 2006 I got a display for my MacMini and I was ready for more active development. Google Summers of Code gave me the opportunity to dedicate a bit more time to FFmpeg since I could say “hey, I’m paid for it!”. I still try to contribute even if I’m no longer a student, have a job and too little free time.

And now to the business.

As you may know, the most active group of developers had disagreements on how FFmpeg was managed. First that resulted in an attempt to move old-style development elsewhere and to establish new development under the old name. Since Fabrice was in favour of the old group and controls the ffmpeg.org DNS entry, the new-model development group was forced out and is now known under the name of Libav.

But what is the root of the disagreements? The “legendary” leader, Michael Niedermayer. Legendary in the sense that it’s a legend and not a reality.

At least since 2004 (when I joined the project) FFmpeg was rather a self-organizing community of developers, each with his own goals. Somebody wanted to play movie trailers encoded with QuickTime (hi there, Mike!), somebody wanted to play obscure game formats, somebody just wanted to support anything that he could reverse engineer (that’s me and probably Mike and other people as well). Diego Biurrun tried to bring project in shape by introducing formatting conventions (in early days nobody cared about style much), he and later Måns Rullgård made FFmpeg build system almost perfect, also Måns and Baptiste Coudurier (and many other people) worked on improving or introducing support for common formats.

Later when FFmpeg started participating in GSoCs, at first it was handled by Mike and now by Ronald Bultje. Our test system — FATE started as Mike’s experiment for automated testing regressions for many parts. Later it was completely redesigned and rewritten by Måns who also used a lot of his own hardware to provide test results so FFmpeg was tested on variety of platforms and compilers (most non-x86 things at our FATE are because of his work).

Bug tracking system was set up by Luca and he also found a hosting for it. A lot of services for FFmpeg were run on hardware of Attila Kinali (and even bandwidth and hosting for main server was his achievement). And recent Subversion -> Git transition with merging history from SwScaler is mostly done by Janne Grunau.

So, what’s the role of leader in FFmpeg? None! Almost every significant action was done by somebody else. Were they following some roadmap devised by him? There is no such thing either. Maybe it’s his social skills that kept community together? Wrong again, he caused some people to leave project (and not only the last year, Baptiste would serve good example) and different service maintainers too — by forcing his idiosyncrasies on project (like long-standing DTS guessing issue) or ordering service maintainers around.

And his role as lead developer has been diminishing probably since 2004. I can’t deny he did outstanding work on optimising H.263-based encoders and decoders and writing H.264 decoder, writing and developing some other stuff and providing reviews for patches. But what does he do in recent time? I can’t name anything significant. And from technical point he can’t serve as example: he never cared much about architectures beside x86 nor about his code being easily understandable.

Thus, some developers had had enough and forked. It’s still a self-organized community with people contributing to what they deem important and nobody to order them around (and not that much stalling on patch reviewing like in the times of designated maintainers either).

This fork seems to have stirred murky waters and some trolls (mostly from the MPlayer project, who have no relation to FFmpeg at all) reappeared after a long time; I cannot directly blame Michael for it but it seems suspicious to me. And the messages I’ve read on ffmpeg-devel between the forking and the creation of Libav left me mostly disgusted, so I’ve unsubscribed from the FFmpeg mailing lists and don’t participate in FFmpeg anymore. What goes on there is not my concern anymore and I’m happy with Libav.

P.S. Also since most of the new things in FFmpeg were introduced despite him (like the Git transition and releases), I can’t forget one historical analogy. In German “the leader” is “der Führer”, but that word is rarely used nowadays because there was another Austrian who completely spoiled its meaning.

Politically incorrect sayings about some European countries

Sunday, February 13th, 2011

I have a few harsh words about the countries I’ve visited and I think it’s a good time to present them (well, not worse than any other time).

Germany

Generally it’s a nice country except for big (by German standards) cities. Though I’d like their trains to be more punctual (yes, people abroad think all German trains are always on time while in reality even bahn.de has a special mark for trains they expect to be punctual). And I apologize to Turks who live in Germany but I believe the biggest problem with immigrants in Germany is with people from ex-USSR countries.

Switzerland

People there have a bigger ego than Argentinians. The first thing I noticed is their national symbols on locomotives; I don’t know any other country that does it. And their products often have special ingredients like “Swiss milk” or “Swiss beef” (but they are equally expensive even without them). Too bad that no Swiss chocolate contains cocoa grown in Switzerland.

Speaking about products, their prices make me think rösti was invented by/for hundred foreigners who could afford only one frying pan of potatoes together.

Also I find that all Swiss products are overhyped — is any of them much better than elsewhere? Clocks – you have good ones in Japan or Germany (and even cuckoo clock was not invented in Switzerland). Cheeses — again, there are much better ones in other countries, notably Netherlands and Sweden (the latter just doesn’t produce or export much). Chocolate — French, Italian or Swedish chocolate is not worse.

Norway

I’ve been only to Oslo but from what I’ve seen people there should invest the money they get from oil into inviting people with a sense of taste. All nice buildings in Oslo (yes, all twelve or so) were built by Swedish or Danish architects. And please make sculpting a major criminal offense (at least before you learn the meaning of “lagom” from your neighbours).

France

It doesn’t matter what I think about them because French people (and their ticket vending machines) seem to completely ignore anybody not speaking French (that includes me).

Belgium

From what I saw I conclude they are trying to build another Ukraine (and they are succeeding at it). It’s not hard, just forget that you should maintain buildings, clean streets or that plain concrete is not the best decorating material. Brussels would probably be better had it been built by architects from Oslo.

Douglas Adams was absolutely right.

Sweden

When I heard Russian pop in Skärholmen mall I realized that Skärholmen is, indeed, not a part of Sweden anymore (I was told so before but found it hard to believe, not anymore).

But my main concern is that there’s no proper E4 rail equivalent so I could not travel from Sundsvall to Umeå or Luleå directly (and I’d like less travelling time on route Stockholm-Sundsvall too).

And I wonder what’s better, lutefisk in winter or surströmming in summer?

FOSDEM

Wednesday, February 9th, 2011

I’ve attended FOSDEM-2011 solely because of friends. I thought that Belgium is a sad country and Brussels is shitty but it turned out to be even worse than Ukraine in perception so I’m not sure if I’ll visit it again (Berlin is much nicer by comparison even if it still has Ukrainishy feeling).

av500 advertises his new 15'' laptop


Even if I spent not so much time there it’s always nice to see fellow FFmpeg developers and other related people — VLC people, BeagleBoard people, the small guy on the photo.

And shortly after that I’ve managed to get a new decoder accepted (the one used in Wing Commander IV). It was in RE since 2002 or 2003 but people were too lazy to actually complete it. But now it may serve as a memory that while I was at FOSDEM it was reviewed. Hopefully we’ll get more codecs to come even if not from me.

Again about my favourite country

Sunday, January 9th, 2011

There’s a Russian saying, “You’ll live a year in the way you meet it”, so just in case I tried to get to my favourite country. Mostly to check how it is in winter time.

If somebody still thinks I’m normal, here’s a picture of what I mostly drank during my stay there:

julmust

(those are different bottles, I’ve tried a few more kinds too). In addition to that I’ve tried a few more drinks from a certain brewery in Sundsvall. And I’ve finally tried Norrland’s national drink in solid form. Somehow these trips always get full support from my stomach.

P.S. Cheese, pickled herring, gravad lax, meatballs and tunnbröd with Christmas ham are lagom good. And I have elk sausage to try.

P.P.S. Trains there seem to be as punctual as in Germany.

The biggest curse in codec design

Sunday, November 28th, 2010

This post is an answer to the comment by Alex Converse on my previous post:

It’s interesting how quickly you dismiss SLS for being a hybrid coder with AAC. From a pure lossless standpoint that is a weakness but from a broader perspective it allows for a lossy layer that is widely compatible with existing hardware.

Let’s see why scalable coding is a weakness from lossless coding standpoint.

There are a few hybrid lossy+lossless codecs out there which use the lossy part in lossless reconstruction, e.g. MPEG-4 SLS, DTS-HD MA and WavPack. The first two use two different coding techniques: MDCT or QMF for core coding and the usual lossless coding for the difference. In WavPack both parts are coded in the same way and the correction data is stored in a different block. For DCT-based codecs there are many ways of performing DCT (from trivial matrix multiplication to FFT-based implementation to decomposing DCT into smaller size DCTs) which may lead to slightly different output depending on the method chosen. Thus, you should have a reference way (i.e. not the fastest one) of doing the lossy stuff or you can’t guarantee truly lossless reconstruction. Also the residue (the difference between the original and the lossy coded signal) tends to be more chaotic in this case and thus less compressible.
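
Here is a tiny illustration of why the lossy part has to be bit-exact. It is a sketch under loud assumptions: the “lossy codec” is just a uniform quantiser, not MDCT or QMF, and `encode`, `decode` and the `rounding_error` parameter are names invented for this example. The `rounding_error` imitates a decoder whose transform rounds a bit differently from the one the encoder used.

```python
def encode(samples, step):
    """Split samples into a lossy layer and a correction layer."""
    indices = [(s + step // 2) // step for s in samples]  # lossy layer
    lossy = [i * step for i in indices]                   # what the encoder expects the decoder to get
    residue = [s - l for s, l in zip(samples, lossy)]     # correction layer
    return indices, residue


def decode(indices, residue, step, rounding_error=0):
    """Rebuild samples as lossy reconstruction plus residue.

    rounding_error stands in for a non-reference lossy decoder
    that is off by a tiny amount on every sample.
    """
    lossy = [i * step + rounding_error for i in indices]
    return [l + r for l, r in zip(lossy, residue)]


samples = [3, -7, 12, 100, -55]
indices, residue = encode(samples, 8)
exact = decode(indices, residue, 8)                       # reference decoder
drifted = decode(indices, residue, 8, rounding_error=1)   # "fast" decoder
```

With the reference decoder `exact` equals `samples`; with the slightly different one every sample of `drifted` is off by one, so the result is not lossless anymore even though the lossy layer alone would sound the same.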

Another issue is what to do with correction data. If you put it into a separate file, you will have more troubles since you have to manage two files; if you put it all into single file, its size will be bigger than pure lossless coded file (unless you have very stupid method of lossless coding).

And now comes an argument that I really hate: “but it allows legacy players to handle those files”. That, in my opinion, is the curse from this post’s title. Making it backward compatible just cripples it. In that case you need to implement new (and sometimes completely different) features within the old limits while relying on new technology. So in some cases it just degrades quality and/or forces you to encode something twice: for the old feature set and for its replacement. Another reason is that it just delays adoption of that codec: an old player can play it, so why should I bother about supporting this new codec? I suspect this was the reason why we have MLP but no DTS-HD support.

The worst offender here is MP3. This codec sucks by design. It uses 36-point (or three 12-point) MDCTs, which are not trivial to speed up unlike power-of-two transforms, and the output of the MDCTs is used as input to the QMF used in MPEG Audio layers I&II, as it is claimed, “to be compatible with them”. As claimed here, MP3 would perform better without that, and since it comes from one of the leading LAME developers, I believe it. And of course MP3Pro. Most players in existence just ignore the extension part and play a crippled version of the sound. Someone may argue that’s because it’s proprietary. Okay, look at HE-AAC, where SBR is documented at least: it may still cause some confusion since it may be detected only when decoding an audio frame.

In my opinion both implementing new codec support and special codec extension in general case is just single-time action with comparable effort (hacking existing code for new extension support and detection may be not that easy). And thus, adding a new codec should be preferred. MPEG-2 introduced both AAC (please look how it was called back then) and multichannel extensions to layers I-III. Guess which one works better?

Why Lossless Audio Codecs generally suck

Saturday, November 27th, 2010

Why are there so many lossless audio codecs? Mike, obviously, had his thoughts on that subject and I agree with another friend of mine who said: “it’s just too easy to create a lossless audio codec, that’s why everybody creates his own”.

Well, the theory is simple: you remove redundancy from samples by predicting their values and code the residue. Coding is usually done with Rice codes or some combination of Rice codes and an additional coder (for zero runs or for finer coding of Rice codes). Prediction may be done in two major ways: FIR filters (some fixed prediction filters or LPC) or IIR filters (personally I call those “CPU eaters” for a certain property of codecs using them). And of course they always invent their own container (I think in most cases that’s because they are too stupid to implement even minimal support for some existing container or even to think how to fit it into one).
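
To show how little is needed, here is a minimal sketch of the predict-then-Rice-code scheme. It is not taken from any real codec: the predictor is the most trivial fixed one (previous sample), the Rice parameter `k` is fixed instead of adaptive, bits are kept in a plain list, and all function names are made up for the example.

```python
def zigzag(v):
    """Map signed residue to unsigned: 0, -1, 1, -2, ... -> 0, 1, 2, 3, ..."""
    return 2 * v if v >= 0 else -2 * v - 1


def unzigzag(u):
    return u >> 1 if u % 2 == 0 else -((u + 1) >> 1)


def rice_encode(values, k):
    """Rice code: unary quotient, a zero terminator, then k remainder bits."""
    bits = []
    for v in values:
        u = zigzag(v)
        bits.extend([1] * (u >> k))
        bits.append(0)
        bits.extend((u >> i) & 1 for i in range(k - 1, -1, -1))
    return bits


def rice_decode(bits, count, k):
    pos, out = 0, []
    for _ in range(count):
        q = 0
        while bits[pos] == 1:  # unary part
            q += 1
            pos += 1
        pos += 1               # skip the terminating zero
        r = 0
        for _ in range(k):     # fixed-length remainder
            r = (r << 1) | bits[pos]
            pos += 1
        out.append(unzigzag((q << k) | r))
    return out


def predict_residue(samples):
    """Fixed first-order predictor: each sample is predicted by the previous one."""
    prev, res = 0, []
    for s in samples:
        res.append(s - prev)
        prev = s
    return res


def restore(residue):
    prev, out = 0, []
    for r in residue:
        prev += r
        out.append(prev)
    return out


samples = [100, 102, 101, 105, 110, 108]
residue = predict_residue(samples)
bits = rice_encode(residue, 2)
decoded = restore(rice_decode(bits, len(residue), 2))
```

Everything beyond this (better predictors, adaptive `k`, stereo decorrelation, framing) is refinement, which is probably why everybody ends up writing his own.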

Let’s iterate through the list of better-known lossless audio codecs.

  1. ALAC (by Apple) — nothing remarkable, they just needed to fit something like FLAC into MOV so their players can handle it
  2. Bonk— one of the first lossless/lossy codecs, nobody cares about it anymore. Some FFmpeg developers had intent to enhance it but nothing substantial has been done. You can still find that “effort” as Sonic codec in libavcodec.
  3. DTS-HD MA — it may employ both FIR and IIR prediction and uses Rice codes but they totally screwed bitstream format. Not to mention there’s no openly available documentation for it.
  4. FLAC — the codec itself is good: it’s extremely fast and features good compression ratios. The only bad thing about it is that it’s too hard to seek properly in it since there’s no proper frame header and you can just hope that that combination of bits and CRC are not false positive.
  5. G.711.0 — have you ever heard about it? That’s its problem: nobody cares and nobody even tries to use it.
  6. MLP/Dolby TrueHD — it seems to be rather simple and it exists solely because there was no standardised lossless audio codec for DVD.
  7. Monkey’s Audio — well, the only good thing about it is that it does not seem to be actively developed anymore.
  8. MPEG-4 ALS — the same problem: it may be standardised but nobody cares about it.
  9. MPEG-4 SLS — even worse since you need bitexact AAC decoder to make it work.
  10. OggSquish — luckily, it’s buried for good but it also spawned one of the worst container formats possible which still lives. And looking at original source of it one should not wonder why.
  11. RealAudio Lossless Format — I always say it was named after its main developer Ralph Wiggum. This codec is very special: they had to modify the RM container format specially for it. A quick look inside showed that they use more than 800 (yes, more than eight hundred) Huffman tables, most of them with several hundreds of codes (about 400 on average). That reminds me of RealVideo 4 with its above-the-average number of tables for context-dependent coding.
  12. Shorten — one of the first lossless audio codecs. Hardly anyone remembers it nowadays.
  13. TAK — it was originally called YALAC (yet another lossless audio codec) for a reason. Since it’s closed-source and fortunately not widespread (though some idiots use it for CD rip releases), it just annoys me from time to time but I don’t think someone will work on adding support for it in FFmpeg.
  14. TrueAudio (TTA) — I can’t say anything about it except that it seems to be quite widespread and it works. Looks like they’re still alive and work on TTA2 but who cares?
  15. WavPack — that’s rather good codec with sane bitstream format too. Looks like its author invested some time in its design. Also he sent patches to implement some missing features in our decoder (thank you for that!).
  16. WMA Lossless — from what I know, it uses an IIR filter with the least mean squares method for finding its coefficients. It has two peculiarities: that filter is also used for inter-channel decorrelation, and the bitstream format follows the WMA9 format, i.e. it has something like interframes and frame data starting at an arbitrary point (hello, MP3!).
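
As for the FLAC seeking complaint in the list above, the usual approach can be sketched like this. A FLAC frame starts with a 14-bit sync code and its header ends with a CRC-8 (polynomial 0x07, initial value 0). The sketch only collects candidate offsets and provides the CRC; the function names are made up, and parsing the variable-length header to know which bytes the CRC covers is left out, so this is not a working seeker.

```python
def crc8(data):
    """CRC-8 with polynomial x^8 + x^2 + x + 1 (0x07), initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def find_sync_candidates(buf):
    """Return offsets where the 14-bit sync code 11111111 111110 appears
    (with the following reserved bit being zero).

    Any of those may be a false positive: the same bits can occur inside
    compressed audio data, so a seeker still has to parse the header that
    follows and compare its CRC-8 before trusting the position.
    """
    return [i for i in range(len(buf) - 1)
            if buf[i] == 0xFF and (buf[i + 1] & 0xFE) == 0xF8]


candidates = find_sync_candidates(b"\x00\xff\xf8\x12\xff\xf9\xff\x00")
```

Even a matching CRC-8 still leaves about one chance in 256 that a random sync pattern passes, which is exactly the “you can just hope” part.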

P.S. I still hope this post won’t encourage anybody to write yet another useless lossless audio decoder.

How to Design A Perfectly Awful Codec

Saturday, November 13th, 2010

A quick glance at some codec disassembly inspired me to write this post.

So today I’ll talk about how to design a perfectly awful codec (from an FFmpeg decoder implementer’s point of view). Since audio and video codecs usually have some specific methods and approaches to design, it will be presented in two parts.

Video Codec Design (and why we don’t have a decoder for this codec in FFmpeg)

  • Don’t care about portability. The “best” example is Lagarith, a lossless video codec that uses a floating-point variable for the arithmetic coder state. Thus, decoding it on anything but x86 requires an 8087 emulator.
  • Tie it to specific API or OS. The codec mentioned at the beginning provides the best example: it stores actually a sequence of GDI commands for frame data. While storing, say, VNC protocol record may provide good lossless compression, it should be self-sufficient (i.e. it should not require external data). M$ Camcorder Video however has (and uses!) such wonderful commands as “draw text with provided font parameters (including font name)”. Thanks, I’m not going to work on decoder for that, ask those guys instead.
  • Use lots of data. It really pisses off a decoder developer when you have to deal with lots of tables, especially with non-obvious structure. Special thanks to RealVideo 3 and 4 which stored variable-length code data in three ways and about a hundred codebooks.
  • Use your own format. That one annoys users as well. Isn’t it nice when your video is stored in videofile.wtf that can be played only with the provided player (and who knows if it can be converted at all)? Sometimes this has its reasons (for game formats, for example), though this makes the life of a decoder developer a bit harder.

Audio Codec Design (and why nobody cares about this codec)

Let’s repeat last two items:

  • Use lots of data. Yes, there are codecs that use lots of tables during decoding. The best supporters of this policy are DTS (they even decided to skip tables with more than ten thousand elements in ETSI specification, extensions require few more tables) and TwinVQ/VQF that has even more tables.
  • Use your own format. Audio codec authors like to invent new formats that can be used only with their codecs. There is one example where such a container format was extended to store other codecs as well. That’s the infamous Ogg. If you think it’s nice then try implementing a demuxer for it from scratch.

But wait, there are more tricks!

  • Containers are overrated. The best example is Musepack SV7 and earlier. That codec is known to store frames continuously and when I say “continuously”, I mean it — if one frame ends inside byte, new frame starts from the next bit. And the only way to know frame size is to decode it. And if your file is corrupted in the middle, the rest of it would be undecodable. A mild version of this is MPEG audio layer-III which stores audio data disregarding actual frame boundaries.
  • Really tie codec to container. That would be Musepack SV8 now. This time they’ve designed an almost sane container with only one small catch: the last frame actually encodes fewer samples and the only way to know that would be to make the demuxer somehow signal to the decoder the number of samples to decode for each frame. If you don’t do that, you may unexpectedly get some nasty decoding errors.
  • Change bitstream format often. If you throw out backward compatibility you may end with many decoders needed for each case. An example is CELT — it’s still experimental and changes bitstream format often, thus storing files in that format would be just silly since next version of decoder won’t be able to read them.
  • Hack extensions into bitstream. Some codecs are known to contain extension data inside frame data for “backwards compatibility” so decoders usually have hard time finding it and verifying it’s really expected extension data instead of some garbage. Well-known examples are MP3Pro and DTS (which took it to extreme — there are extensions for both frequency and additional channels that can be present simultaneously; luckily, DTS-HD has it more structured inside an extension frame data).
  • Make it unsuitable for general uses. For example, make codec take unbounded or potentially too large amounts of memory (Ogg Vorbis does that) or
  • Make codec like a synonym for reference implementation. It’s good when you just make only one implementation and just change it in many subtle ways so later you need to reverse engineer the source to get specification. That was the case with binary M$ Office formats and it seems to be the case with Speex (at least I heard so).

And finally, The Lossless Audio Codec to serve as an example to them all. As Måns put it, “wherever you talk about bad design of codecs, there’s always Monkey’s Audio”. Let’s see its advantages:

  • container — it has custom container of course. And there’s one minor detail: it packs data into 32-bit little-endian words and frame may start at any byte of that word. This makes it somehow combine both approaches to containers.
  • Bitstream format changes — check. It is known to have a lot of small tweaks making bitstream format incompatible. Some of them are actually container-related though.
  • Unusable for general uses — well, it’s famous for requiring more CPU power to decode than most low-resolution (up to SD) H.264 video streams.
  • One codec, one implementation — for a long time it was so until some Rockbox developer REd the source code and wrote his own decoder (FFmpeg decoder is derived from it). Also for quite a long time it was supported only on Windows. And it doesn’t support older versions — nobody bothered to extend support for them.

Why I love Sweden

Saturday, September 18th, 2010

I was lucky to have a short visit to my homeland (look at this blog title if you didn’t guess it) and since some people ask why I love it, I decided to write this blog post.

Disclaimer: this is my highly subjective opinion on why Sweden is the best country (for me).

There are several points which I present below.