Archive for June, 2013

A Quest Continues

Friday, June 28th, 2013

Well, after some distraction as writing semi-working On2 AVC decoder (it turned out that On2 has introduced some special modes there that differ only on signal reconstruction stage, too lazy to RE them) and recovering after heat wave I’ve returned to the VoxWare ElenrilSound decoder.

I hate parametric codecs — no matter how you screw calculations you’ll still get some output but it won’t be useful for debugging. At least I can use MPlayer2 + binary codec loader + gdb combination to extract runtime information from the reference decoder.

Now I’m trying to make at least one mode work properly, 16kHz@16kbps mono (aka VOXk) for now. Stereo reconstruction might be trickier so I’ll leave it for later but at least most modes differ only by the tables they use. So (in theory) I’ll need to make at least this mode work, add tables for other modes, fix stereo decoding, look at 8kHz@6kbps mode, curse and forget about it.

Good news — bit allocation works properly and bits are read exactly as in the reference decoder. Bad news — reconstructed output is not even close to the expected one, so the work continues…

How I imagine a perfect computer (for me)

Saturday, June 8th, 2013

Of course this interests nobody but I wanted to rant about it for a long time.

General principles:

  1. Compact size — I like to be able to fit all of my computers on the desk, any size comparable with power supply unit size would do. Laptops are fine too.
  2. Silent — no damned fans.
  3. An ability to use normal storage, not 16MB SSB soldered onboard.
  4. No x86 CPU.
  5. If it’s a laptop it should be able to work for 10 hours with battery.

Display:

  1. 4:3 aspect ratio. If displays nowadays are made for movie-watchers then it’s a sad world. Too much of vertical space is eaten by various toolbars, menu bars and such.
  2. sane resolution. Again, 1920×1080 may be ideal for movie-watchers but I prefer it to be either VGA-based (i.e. multiple of 640×480 or 800×600) or power of two based. And whoever thought about 1366×768 should burn in hell!

Performance — if Libav compiles in ten minutes on dual core system then it’s fast enough for me.

ARM-based laptops are almost good for that, especially performance wise. There’s just one big “but” — they are almost all are for Android or chromebooks. And Baidu has never intended those systems for any real usage. Playing games — fine, browsing — passable (though Firefox 3 on my old PowerPC MacMini with 512 MB RAM gives much better experience than Chrome on tablet with 1GB RAM), editing texts (code) — absolute fail. I can live without a numpad on keyboard (it’s a legacy for accountants and their calculators after all) but not having even “delete” key (there’s only backspace) is pathetic.

So I live with a faint hope that there will be a computer good enough for me.

Some Information about VoxWare MetaSound

Wednesday, June 5th, 2013

So I’ve looked at the beast again. It seems to be close enough to the original TwinVQ (as in .VQF, not something that got into MPEG-4 Part 3 Subpart 4 Variation 3), so I’ll just try to document spotted differences.

Coding modes. Original TwinVQ had 9 modes, VoxWare has twice as much (and so twice as much codebooks!). One of them is very special (8kHz at 6kbps), with a cutoff of “high” frequencies. Also mode explicitly signals the number of channels so some modes are stereo-only and some are mono-only.

Bitstream format differences. Bitstream is packed LSB, the first byte is not skipped by the decoder. There’s an additional 2-bit variable right after window type present in many modes (but not 8kHz@6kbps or when short windows are used), my guess is that it has something to do with intensity stereo. Some parts order seems to be shuffled (i.e. original TwinVQ used p_coef, g_coef, shape order, MetaSound uses p_coef, shape, g_coef order).

Reconstruction. I’m not familiar with TwinVQ much myself but it looks like there are some small differences there as well. For instance, pgain base is 25000 for mono and 20000 for stereo and in bark decoding scales 0.5, 0.4, 0.35 are used instead of 0.4, 0.35, 0.28 (not really sure about that bit).

Any help with the decoder is welcome — new decoder will reuse most of the current TwinVQ decoder after all and new tables (it should take the title of decoder with the biggest tables from DCA decoder).

A New Month, Some New Goals

Saturday, June 1st, 2013

As suggested by Anton, it’s the month of overengineered codecs.

The goals are the following (warning: they are subject to change without any notice)

  • work on REing VoxWare MetaSound (the thing aforementioned Anton should have done long time ago — it is only slightly different from stock TwinVQ decoder after all);
  • make proper ClearVideo decoder, currently it supports I-frames only in AVI and RM (samples in QuickTime are welcome BTW);
  • work on REing Discworld III video format;
  • On2 AVC decoder;
  • make Mike M. reverse engineer On2 VP4;
  • add raw mode for IMC/IAC;
  • work on Indeo 4 B-frames support (yeah, very likely);
  • push G2M4 (aka Go2WatchBoringSlideshows, do not confuse it with Go2BoringEnterpriseEvent codec) decoder.