All codec roads lead to FFmpeg

February 6th, 2010

This is written mainly as a response to some flamewars.

All codecs may be divided into two categories: mature codecs and developing codecs. In the first case we have a frozen bitstream format and not many enhancements to the codebase supporting that codec. In the second case we have a codec whose bitstream format and (what is quite important) encoder features may still change.

FFmpeg itself went that way, from a highly experimental H.26x encoder and decoder to a rather stable set of almost all the decoders available around and several encoders. Since it has coding rules, conventions and an existing framework, it is a very convenient place to implement decoders: you can reuse a lot of code optimised for many platforms (so you don’t have to care about DCT speed, for example) and users don’t have to worry about adding a new decoding interface and new lines in a configure script, because it’s all handled inside libavcodec. And “NIH syndrome” also gives a benefit here: you don’t have to worry about additional libraries (and original codec devs will have their codec specs tested as well).
You know the other advantages of this approach too.

At the same time those features make FFmpeg a bad place for still-evolving encoders, for they are not likely to fit into the existing framework so easily. This tension is best seen in our interaction with a certain encoder. Its developers constantly modify this encoder, so existing FFmpeg options and presets are not good for them and it’s hard to tell how well it will work. Now let’s see what happens if x264 code is merged into FFmpeg. It will put a rather harsh constraint on x264 developers because it’s hard to tell what change breaks other codecs (changes behaviour, whatever) or vice versa. The same applies to codec-specific features (like a muxer using some encoder information, think H.264+MPEG-TS).

On the other hand, it is much easier to incorporate into FFmpeg an encoder not changing so much — some compromises should be made on common interface, some parts replaced with standard FFmpeg routines and voila!

I think that’s the reason we have got a lot of decoders and not so many lossy encoders (especially not so many lossy encoders with good quality) in the last N years. And it’s the reason why encoders should originate as standalone projects and be merged when they are stable. I’d also like to note that FFmpeg has long-standing issues with providing a better framework for codecs not based on H.261 and its descendants (where is codec-independent rate control and motion estimation?); maybe this affected Snow development as well. Anyway, let’s live and see how all these things will be resolved.

Looking for a job in a civilised country

January 25th, 2010

So, I’ve dropped out of my university because I see no use in continuing my postgraduate studies. Now I’m a free man and should look for a means of living.

My main issue is that I live in a third-world country with all the consequences that brings: no suitable work here for me (I’m picky and don’t want to learn PHP and “code” websites or do the same in Java) and no mobility (I don’t see an easy way to move to a civilised country; if there was any, who would stay here?). Funny fact: worker salaries here seem to be lower than in China but prices for almost everything except food seem to be European.

Because of that my chances of getting employed by a large company abroad are rather slim (or non-existent), so I hope that a smaller company can invite me to work abroad. I’d gladly provide my skills and work on almost anything.

Here’s my list of countries I’d like to live and work in:

  • Tier 0 — Sweden
  • Tier 1 — the rest of Scandinavia
  • Tier 2 — any Western European country in the Schengen area except warm ones (I feel bad when the temperature tops 25ºC but cold weather is fine)
  • Tier 3 — Canada (maybe the only developed country that welcomes Ukrainians)

If somebody can help me with fulfilling my dream I’d be very grateful. Even useful and not too general advice counts, but not the ones that require lying! Thanks.

Short CV:

  • got a bachelor’s degree in CS and a master’s degree in something; the diploma says I’m “an engineer, system analyst”
  • more than 10 years of C experience; varying experience with assemblers for different platforms (x86, PowerPC, ARM, MIPS), mostly SIMD for non-x86. I know some other languages too: C++, Pascal, Java, some scripting languages (shells, Perl, Python). I’ve tried functional languages too (Lisp, Prolog, Erlang) and I’m pretty sure I can use them too.
  • more than 5 years of FFmpeg development, which I started with reverse-engineering codecs too
  • 3-4 years of experience in enterprise development (client-server systems, RDBMS, whatever)

Some notes on old WMV3

January 1st, 2010

Vad kan man göra på nyårsafton om han dryck inte alkohol och han är FFmpeg developer också? (Translation: what can a man do on New Year’s Eve if he does not drink alcohol and is an FFmpeg developer too?)

Naturally, I spent it hacking at some codec. One of the most annoying issues with our decoder (at least for some people like me) is that it does not support some features, such as interlaced VC-1, and it decodes only I-frames in the old version of WMV3. So I’ve tried to fix the latter.

There is a flag called RES_RTM which tells whether it’s the final bitstream format (a.k.a. “release to manufacturing”, hence the name) or not. I’ve tried tracing its effect through the binary decoder and it turned out that it only alters transform pattern decoding.

Here’s a brief outline of the transform concept: WMV2, WMV3 and VC-1 may partition P or B blocks into 4×8, 8×4 or 4×4 subblocks; each subblock, obviously, may be coded or not, so the decoder also needs to extract the subblock coding pattern unless the transform type specifies it (like “8×4 transform, left half coded”).
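A minimal Python sketch of that rule, under loud assumptions: the `Transform` names, the `IMPLIED` table and the `read_bits` callback are all hypothetical, and the pattern is read as raw bits purely for illustration (the real bitstream syntax is not described here).

```python
from enum import Enum, auto


class Transform(Enum):
    # Hypothetical names; FIRST/SECOND stand for the two parts of a split block.
    T8X4 = auto()          # two subblocks, pattern has to be read
    T8X4_FIRST = auto()    # transform type itself says: only first part coded
    T8X4_SECOND = auto()
    T4X8 = auto()
    T4X8_FIRST = auto()
    T4X8_SECOND = auto()
    T4X4 = auto()          # four subblocks, pattern has to be read


# Transform types that already carry the coding pattern.
IMPLIED = {
    Transform.T8X4_FIRST: (True, False),
    Transform.T8X4_SECOND: (False, True),
    Transform.T4X8_FIRST: (True, False),
    Transform.T4X8_SECOND: (False, True),
}

# Number of subblocks whose coded/not-coded flags must come from the stream.
EXPLICIT_COUNT = {Transform.T8X4: 2, Transform.T4X8: 2, Transform.T4X4: 4}


def subblock_pattern(ttype, read_bits):
    """Return one coded/not-coded flag per subblock.

    read_bits(n) is an assumed callback returning an n-bit integer; it is
    only called when the transform type does not imply the pattern.
    """
    if ttype in IMPLIED:
        return IMPLIED[ttype]
    count = EXPLICIT_COUNT[ttype]
    bits = read_bits(count)
    # Most significant bit describes the first subblock.
    return tuple(bool((bits >> (count - 1 - i)) & 1) for i in range(count))
```

For example, `subblock_pattern(Transform.T4X4, lambda n: 0b1010)` yields a flag per 4×4 subblock, while `Transform.T8X4_FIRST` never touches the reader at all.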

It turns out that the older version uses a different condition for reading the subblock coding pattern (for 8×4 and 4×8 transforms only), without checking whether the transform is specified only for the current block or for the whole macroblock (i.e. a group of 6 blocks). Quick hacking has not made FFmpeg decode those old files correctly though, so there is still some work left to figure out the correct condition.

2009 — a Year of Hopes

December 31st, 2009

Well, I think it’s time to look at FFmpeg achievements for this year.
The main achievement is without any doubt breeding Sus Scrofa Avionica: the FFmpeg 0.5 release. It took only 5 or so years.

What has been added (mostly from Changelog):

  1. FunCom ISS demuxer and decoder
  2. various Electronic Arts formats support
  3. Gopher protocol
  4. MXF D-10 muxer
  5. Ogg muxer improvements
  6. VQF decoding support
  7. PCX encoder
  8. Cook multichannel support
  9. DPX decoder
  10. ALS decoder
  11. WMAPro decoder
  12. Atrac 1 decoder
  13. many other smaller additions and enhancements

What I did:

  1. our WavPack decoder now supports hybrid coding as well
  2. PB-frames support for H.263 decoder, so Intel I263 files are played more or less fine
  3. some RTMP support
  4. Aura 1 and 2 decoders
  5. fixed long-standing bug in Interplay Video MVE — now it detects 16-bit variant of video (and decodes it)
  6. some improvements on SwScaler — 48-bit RGB support and rewriting our YUV to RGB conversion code, so SwScaler could be used under LGPL

But mostly this year gave us hopes for a lot of even more important things:

  1. there is hope that our AAC encoder will work
  2. there is hope that our AAC decoder will support SBR (and we can remove libfaad2 support)
  3. there is hope that RTMP input will work with all commercial servers as it is supposed to
  4. Bink support is at least 90% complete, it’s mostly cleanup and bugfixing left to do
  5. Indeo 4 and Indeo 5 decoders. Boy, how I miss them!
  6. Apple codecs (ProRes, Intermediate codec).
  7. Sipro decoder will be hopefully committed
  8. hopefully, Windows Media codec family support will be almost full too, the only things left to RE are their speech codec, lossless audio codec, screen capture codecs and some WMV3 leftovers
  9. There is hope that some other unfinished work will be completed. For example, I have an almost working LucasArts SMUSH video player based on some older work by somebody from ScummVM, and an MS Video 1 encoder. Other people for sure have some unfinished stuff too.
  10. I have a faint hope that Discworld 3 FMV will be playable with something opensourced. And then by FFmpeg.
  11. Finally, I hope Mike will finish his Xan WC4 decoder.

Anthems

December 30th, 2009

Once I wondered about different anthems. Looks like one of the things Wikipedia misses on them is classification by content.

Personally I can split them into these categories:

  1. praise of the land — many anthems tell about features of the country like mountains, rivers, lakes, green meadows, whatever.
  2. praising love to the land — a lot of anthems say something like “we love this country”
  3. praising freedom — some anthems are mostly about defending freedom or how the land is good after getting freedom
  4. praising some symbol — mostly sovereigns or flags or people
  5. religious prayers or oaths to save or protect the country

One subgroup is anthems inspired or influenced by Polish anthem. They say virtually “That is not dead which can eternal lie.
And with strange aeons even death may die.”
ahem, sorry “our land is not dead while we live”. Ukraine and Israel have such anthems.

So, here are interesting ones:

  • Polish anthem — known for its dance music
  • anthem of Andorra — not so many anthems are sung from the first person view (i.e. like the country itself tells its story)
  • anthem of Moldova praises its language

My favourite is unofficial anthem of Sweden (since there is no officially approved anthem there). The ending of verse two is rather dear to me:


Jag vet att Du är och Du blir vad Du var.
Ja, jag vill leva jag vill dö i Norden.

Translation:


I know that you are and you will be as you were,
Yes, I want to live I want to die in the North

Indeed, I live in a country which sucks greatly, sucked and I know it will suck; I also want to live in some civilised country in the North (especially Sweden). Någon, ta mig till Norden, är du snäll. (Translation: somebody, take me to the North, please.)

A Short MSS1/2 Description

December 27th, 2009

Some of you may have heard of such a thing as Mi***soft screen capture codecs (aka WM Screen decoder). Here I’ll try to present some known information about them.

First of all, this codec employs an old-school arithmetic coder (tracking high and low values instead of low and range, with normalisation by one bit), and it’s used for decoding everything (1-bit numbers, 8-bit numbers and symbols with variable probabilities) without any additional context-dependency stuff.
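For readers who have not met this flavour of coder, here is a small Python sketch of a textbook high/low arithmetic coder with one-bit normalisation. It is not this codec’s actual bitstream or probability model: the 16-bit register width, the fixed cumulative-frequency table `cum` and all function names are assumptions made for illustration, and the encoder is included only so that the decoder can be checked by a round trip.

```python
from bisect import bisect_right

CODE_BITS = 16                      # assumed register width
TOP = (1 << CODE_BITS) - 1
QUARTER = 1 << (CODE_BITS - 2)
HALF = 2 * QUARTER
THREE_QUARTERS = 3 * QUARTER


def narrow(low, high, cum_lo, cum_hi, total):
    """Shrink [low, high] to the slice that belongs to one symbol."""
    span = high - low + 1
    return low + span * cum_lo // total, low + span * cum_hi // total - 1


def encode(symbols, cum):
    """cum is a cumulative frequency table, e.g. [0, 5, 7, 8] for 3 symbols."""
    low, high, pending, bits = 0, TOP, 0, []

    def emit(bit):
        nonlocal pending
        bits.append(bit)
        bits.extend([1 - bit] * pending)   # flush deferred opposite bits
        pending = 0

    for s in symbols:
        low, high = narrow(low, high, cum[s], cum[s + 1], cum[-1])
        while True:                        # normalise one bit at a time
            if high < HALF:
                emit(0)
            elif low >= HALF:
                emit(1)
                low, high = low - HALF, high - HALF
            elif low >= QUARTER and high < THREE_QUARTERS:
                pending += 1               # interval straddles the middle
                low, high = low - QUARTER, high - QUARTER
            else:
                break
            low, high = 2 * low, 2 * high + 1
    pending += 1                           # final bits pin down the interval
    emit(0 if low < QUARTER else 1)
    return bits


def decode(bits, count, cum):
    stream = iter(bits)
    low, high, value = 0, TOP, 0
    for _ in range(CODE_BITS):
        value = 2 * value + next(stream, 0)   # bits past the end read as 0
    out = []
    for _ in range(count):
        span = high - low + 1
        target = ((value - low + 1) * cum[-1] - 1) // span
        s = bisect_right(cum, target) - 1     # symbol whose slice holds target
        out.append(s)
        low, high = narrow(low, high, cum[s], cum[s + 1], cum[-1])
        while True:                           # mirror the encoder's shifts
            if high < HALF:
                pass
            elif low >= HALF:
                low, high, value = low - HALF, high - HALF, value - HALF
            elif low >= QUARTER and high < THREE_QUARTERS:
                low, high, value = low - QUARTER, high - QUARTER, value - QUARTER
            else:
                break
            low, high = 2 * low, 2 * high + 1
            value = 2 * value + next(stream, 0)
    return out
```

The point to notice is that both ends of the interval are kept explicitly and the registers are shifted by a single bit per loop iteration, in contrast to low/range coders that renormalise a byte at a time.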

This codec predicts a pixel value from its 4 neighbours (up left, up, up right and left). Intraframes and interframes seem to be coded in the same way, but intraframes may have the data of the first line(s) coded explicitly as 8-bit values (or was that the palette?).

Each frame is recursively partitioned; the final partitions may be filled with some value, restored from predicted values or copied from somewhere else.

Partitioning is done this way: decode the mode; if its value is 2, do something on the decoded rectangle; otherwise decode the partitioning value and use it to split the rectangle horizontally (when the mode is zero) or vertically (mode=1), then perform the same operation on the two parts.
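The recursion can be sketched in Python as follows. This is only an illustration under assumptions: `symbols` is a hypothetical iterator standing in for values already produced by the arithmetic decoder, a “horizontal” split is taken to mean a horizontal cut giving top and bottom parts, the partitioning value is treated as the size of the first part, and “do something” is reduced to recording the leaf rectangle.

```python
def partition(symbols, x, y, w, h, leaves):
    """Recursively split the rectangle (x, y, w, h) and collect the leaves.

    symbols: iterator of already decoded integers (assumed stand-in for the
             arithmetic decoder): a mode, then a pivot when the mode is 0 or 1.
    leaves:  list receiving the final (x, y, w, h) rectangles.
    """
    mode = next(symbols)
    if mode == 2:
        # Leaf: a real decoder would fill, predict or copy pixels here.
        leaves.append((x, y, w, h))
        return
    pivot = next(symbols)
    if mode == 0:
        # Horizontal cut: top part is `pivot` rows high.
        if not 0 < pivot < h:
            raise ValueError("pivot outside rectangle height")
        partition(symbols, x, y, w, pivot, leaves)
        partition(symbols, x, y + pivot, w, h - pivot, leaves)
    elif mode == 1:
        # Vertical cut: left part is `pivot` columns wide.
        if not 0 < pivot < w:
            raise ValueError("pivot outside rectangle width")
        partition(symbols, x, y, pivot, h, leaves)
        partition(symbols, x + pivot, y, w - pivot, h, leaves)
    else:
        raise ValueError("unknown partition mode")


# Usage: an 8x8 area cut into a top strip and two bottom parts.
leaves = []
partition(iter([0, 3, 2, 1, 5, 2, 2]), 0, 0, 8, 8, leaves)
```

The pivot checks are there because a malformed stream could otherwise produce empty rectangles and recurse forever.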

Do not expect the decoder to be ready any time soon though. At least not unless someone else writes it.

Pending work for FFmpeg

December 19th, 2009

Here are some pictures decoded with game decoders I’ve more or less finished in my free time:

While the logotype in the middle should be recognisable to almost everybody (it’s from a video file embedded in another player/converter for that format), the others are not so famous.
Yes, colour planes are swapped but that’s not critical.

The left picture is taken from the Wing Commander IV trailer packed with the Xan codec. The decoder has a very long history: it was 90% complete even before I joined the FFmpeg project. The only caveat was that it outputs YUV format while Mike thought it was 16-bit RGB. Also nobody was interested in completing it (including Mike and me). Well, it’s almost there.

The right picture is from the Descent III intro encoded with the 16-bit version of Interplay Video. I’ve looked at it once and almost got it right. The main thing I missed is that it stores motion vector data at a certain offset, not along with other data as it did in the 8-bit version. Now it plays fine though.

Another funny thing I remember is that there were complaints about detection of the 16-bit variant. And what do you know? That information was available for ages on the container description page. Sometimes it’s also useful to read the Multimedia Wiki, not only write to it.


What next? I don’t know, there are so many things to do: finish the Flash Video 2 decoder, integrate the Auravision 1/2 decoder before it rots, have another stab at some formats like Apple Intermediate codec or some codecs from the Windows Media family.

At least I know that FFmpeg may be a bit closer to one of its unofficial ultimate goals: converting everything.

Gdium optimizations

November 24th, 2009

Since I’m not going to work on this any time soon (I have more stuff to do), I’m publishing what I did. Grab the tgzipped sources here. Most of it does not give any significant speedup because of the internal Loongson structure, so it’s just a proof of concept.

A joy of underpowered hardware

October 21st, 2009

I prefer to develop on underpowered hardware since it makes you want to squeeze all you can from it. Looks like the Gdium netbook is an ideal candidate when it comes to being underpowered (BeagleBoard is too underpowered in that respect).

What really sucks in Gdium (to my taste):

  • video card performance: in MPlayer video output time tends to be more than decoding time (and the decoders mostly don’t have SIMD optimisations for Loongson). Watching anything larger than 512×384 MPEG-4 video is not comfortable. Floating-point audio codecs also take a lot of CPU.
  • there is an audible noise in headphones while playing audio; since the video card, audio chip and some other things are integrated into a single SM501 chip, I can add that it seems to be the suckiest part of the netbook.

Some things are annoying too, like having a 16-bit display (the chip supports 24-bit output, and this does affect the picture), a battery charge limit of 97-98% (so it’s always charging and never completely charged, probably some glitch on my sample), and fan and temperature issues (specs say that the CPU dissipates up to 4 watts, so where does all that heat come from?). Having an internal drive instead of a USB key would probably increase performance greatly too.

I still hope for something like a notebook containing a multi-core MIPS or ARM with (preferably) 1GB of RAM.

A Bit of New Hardware

September 28th, 2009

I’ve finally got a SheevaPlug, which will be my new server instead of the Artigo 1000, which seems to have broken internal power management. It will also help with my plans to decrease the x86 share of my boxes. The only uses I find for the x86 netbook now are reverse-engineering and running an occasional game; everything else I do on other boxes as well.

Looks like FedEx, at least here, is going downhill. While two weeks of delay (aka “customs clearance”) is pretty usual for me, from this year one has to go to their office to sign some papers and pay a customs fee before they finish customs clearance and deliver the package to your town. I had to go there a second time to pick up the package (and before that packages were delivered straight to my place except for one case when it went back to the USA). Not that a 2.5km walk can harm.

Another thing worth mentioning is that my Gdium now probably has the fastest MPlayer: I’ve ported several lavc MMX-accelerated functions to it, so now H.264, RV3/4 and H.26[13]-based formats decode faster (the latter by a couple of tens of percent, the others by 5-10%), not to mention Monkey’s Audio, which can now be listened to in realtime even for files packed at the insane level. Maybe in the distant future they will hit SVN (if I clean them up and Måns finds time for review).