On Some Smaller Railway Details

September 12th, 2013

When I visited VDD13 (they’ve finally made it right — with Trocadero and surströmming, hopefully they’ll keep the level in the future) I could make some additional observations on the rail that finally led to this post.

First, I’d like to talk about catenary constructions. They usually come in two variations — a single post with a special support for the wires or two posts with a horizontal construction between them that supports the wires.
It might be hard to believe but they can be æsthetically appealing too.

The top of my list is Sweden (and the Netherlands, since they seem to employ the same construction). The poles are made from lattice and thus look nice, and the horizontal supporting constructions are always made as trapezoids.

Runner-up is Switzerland — they also have nice lattice constructions but they seem a bit compressed to me.

Germany has the same poles but for wider ranges they usually have only a wire between poles from which the wire-supporting constructions hang.

Most of the other countries have simple round masts or H-shaped beams that are not interesting, though I must admit French ones have nice wire support constructions reminiscent of a violin bow.

And on the very bottom of the list is Denmark with its large ugly ?-shaped masts in the colour of rust.

And now to the second thing — toilets. I usually try to avoid them but sometimes I feel I have to visit one onboard. So here’s a comparison of that essential thing.

Ukraine — toilets on so-called “InterCity+” should not be that bad, toilets in older carriages are better not visited at all (and since they dump contents onto the tracks directly, toilet rooms are locked long before stations and after them).

Germany — on InterCity trains toilets are decent (or at least tolerable), on ICE they are too small even in the first class. Even on regional trains and trams they seem to be bigger.

France — on TGV first class they are even smaller than on ICE, in the second class even a person smaller than me has problems fitting inside.

Sweden — those people really care. The most spacious toilet rooms on trains I can remember (especially on Reginatåg).

Switzerland — the toilet on Rhätische Bahn regional trains seemed quite good even if those are narrow-gauge trains.

Moral of the story — Swedish trains and railways are the best (and if you have doubts you’re reading the wrong blog).

Voxware Codecs and Tags

August 10th, 2013

If you look at the registry of WAV formats you can see this:


0x0069 WAVE_FORMAT_VOXWARE_BYTE_ALIGNED Voxware, Inc.
0x0070 WAVE_FORMAT_VOXWARE_AC8 Voxware, Inc.
0x0071 WAVE_FORMAT_VOXWARE_AC10 Voxware, Inc.
0x0072 WAVE_FORMAT_VOXWARE_AC16 Voxware, Inc.
0x0073 WAVE_FORMAT_VOXWARE_AC20 Voxware, Inc.
0x0074 WAVE_FORMAT_VOXWARE_RT24 Voxware, Inc.
0x0075 WAVE_FORMAT_VOXWARE_RT29 Voxware, Inc.
0x0076 WAVE_FORMAT_VOXWARE_RT29HW Voxware, Inc.
0x0077 WAVE_FORMAT_VOXWARE_VR12 Voxware, Inc.
0x0078 WAVE_FORMAT_VOXWARE_VR18 Voxware, Inc.
0x0079 WAVE_FORMAT_VOXWARE_TQ40 Voxware, Inc.
0x007A WAVE_FORMAT_VOXWARE_SC3 Voxware, Inc.
0x007B WAVE_FORMAT_VOXWARE_SC3 Voxware, Inc.
0x0081 WAVE_FORMAT_VOXWARE_TQ60 Voxware, Inc.

In reality there’s one codec with several variations (MetaSound) and a family of low-bitrate MetaVoice codecs. And it doesn’t really matter which ID you use — codec extradata contains the real tag used to distinguish one codec from another. That’s why the 0x0075 format can be reserved for the Voxware RT29 speech codec but used by MetaSound instead.

Here’s the list of internal tags:

  • VOXa — MetaVoice RT24, 8 kHz, mono, 2.4kbps
  • VOXb — MetaVoice VR12, 8 kHz, mono, 1.2kbps (variable bitrate)
  • VOXc — MetaVoice VR15, 8 kHz, mono, 2.4kbps (variable bitrate)
  • VOXg — MetaVoice RT29HQ, 8 kHz, mono, 2.98kbps (called high-quality for some reason)
  • VOXh — MetaVoice RT28, 8 kHz, mono, 2.8kbps
  • VOXi — MetaSound AC08, 8 kHz, mono, 8kbps
  • VOXj — MetaSound AC10, 11 kHz, mono, 10kbps
  • VOXk — MetaSound AC16, 16 kHz, mono, 16kbps
  • VOXL — MetaSound AC24, 22 kHz, mono, 24kbps
  • VOXq-VOXz — MetaSound mono and stereo, various formats
  • VX01 — MetaVoice SC3, 8 kHz, mono, 3.2kbps (embedded)
  • VX02 — MetaVoice SC6, 8 kHz, mono, 6.4kbps (embedded)
  • VX03 — MetaSound, 8 kHz, mono, 6kbps
  • VX04 — MetaSound, 8 kHz, stereo, 12kbps
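The list above maps naturally onto a lookup table. Here is a minimal sketch in Python, assuming the four-byte tag has already been pulled out of the extradata (where exactly it sits there is not covered here). The names `VoxMode`, `VOX_TAGS`, `lookup_vox_tag` and `codec_family` are made up for illustration, and the sample rates are the nominal ones from the list.

```python
from typing import NamedTuple, Optional


class VoxMode(NamedTuple):
    family: str            # "MetaVoice" or "MetaSound"
    name: Optional[str]    # mode name from the list, None where no name is given
    sample_rate_khz: int   # nominal rate as written in the list
    channels: int
    bitrate_kbps: float


# Entries copied from the list above; nothing here is taken from a real header file.
VOX_TAGS = {
    b"VOXa": VoxMode("MetaVoice", "RT24", 8, 1, 2.4),
    b"VOXb": VoxMode("MetaVoice", "VR12", 8, 1, 1.2),
    b"VOXc": VoxMode("MetaVoice", "VR15", 8, 1, 2.4),
    b"VOXg": VoxMode("MetaVoice", "RT29HQ", 8, 1, 2.98),
    b"VOXh": VoxMode("MetaVoice", "RT28", 8, 1, 2.8),
    b"VOXi": VoxMode("MetaSound", "AC08", 8, 1, 8),
    b"VOXj": VoxMode("MetaSound", "AC10", 11, 1, 10),
    b"VOXk": VoxMode("MetaSound", "AC16", 16, 1, 16),
    b"VOXL": VoxMode("MetaSound", "AC24", 22, 1, 24),
    b"VX01": VoxMode("MetaVoice", "SC3", 8, 1, 3.2),
    b"VX02": VoxMode("MetaVoice", "SC6", 8, 1, 6.4),
    b"VX03": VoxMode("MetaSound", None, 8, 1, 6),
    b"VX04": VoxMode("MetaSound", None, 8, 2, 12),
}


def lookup_vox_tag(tag: bytes) -> Optional[VoxMode]:
    """Return the known parameters for a tag, or None if the list has no details."""
    return VOX_TAGS.get(tag)


def codec_family(tag: bytes) -> Optional[str]:
    """Return the codec family for a tag, including the VOXq-VOXz range."""
    mode = VOX_TAGS.get(tag)
    if mode is not None:
        return mode.family
    # VOXq..VOXz are only described as "MetaSound mono and stereo, various formats".
    if len(tag) == 4 and tag[:3] == b"VOX" and ord("q") <= tag[3] <= ord("z"):
        return "MetaSound"
    return None
```

The point of keying on the tag rather than on the WAV format ID is exactly the mismatch described above: the ID in the registry says one thing, the tag says what the stream really is.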

So, maybe RT29 does not exist and it should be RT28 instead; obviously RT29HW is a typo for RT29HQ and the second SC3 should be SC6 in the registry (and unfortunately there’s no information about TQ40/TQ60). But who is going to correct WAVE formats list because of facts?

P.S. It would be nice to receive samples for all MetaSound modes (encoder is still available and should work on older Windows systems).

A Quest Continues

June 28th, 2013

Well, after some distractions such as writing a semi-working On2 AVC decoder (it turned out that On2 has introduced some special modes there that differ only on signal reconstruction stage, too lazy to RE them) and recovering from a heat wave I’ve returned to the VoxWare ElenrilSound decoder.

I hate parametric codecs — no matter how you screw calculations you’ll still get some output but it won’t be useful for debugging. At least I can use MPlayer2 + binary codec loader + gdb combination to extract runtime information from the reference decoder.

Now I’m trying to make at least one mode work properly, 16kHz@16kbps mono (aka VOXk) for now. Stereo reconstruction might be trickier so I’ll leave it for later but at least most modes differ only by the tables they use. So (in theory) I’ll need to make at least this mode work, add tables for other modes, fix stereo decoding, look at 8kHz@6kbps mode, curse and forget about it.

Good news — bit allocation works properly and bits are read exactly as in the reference decoder. Bad news — reconstructed output is not even close to the expected one, so the work continues…

How I imagine a perfect computer (for me)

June 8th, 2013

Of course this interests nobody but I wanted to rant about it for a long time.

General principles:

  1. Compact size — I like to be able to fit all of my computers on the desk, any size comparable with power supply unit size would do. Laptops are fine too.
  2. Silent — no damned fans.
  3. An ability to use normal storage, not 16MB SSD soldered onboard.
  4. No x86 CPU.
  5. If it’s a laptop it should be able to work for 10 hours with battery.

Display:

  1. 4:3 aspect ratio. If displays nowadays are made for movie-watchers then it’s a sad world. Too much vertical space is eaten by various toolbars, menu bars and such.
  2. sane resolution. Again, 1920×1080 may be ideal for movie-watchers but I prefer it to be either VGA-based (i.e. multiple of 640×480 or 800×600) or power of two based. And whoever thought about 1366×768 should burn in hell!

Performance — if Libav compiles in ten minutes on dual core system then it’s fast enough for me.

ARM-based laptops are almost good for that, especially performance wise. There’s just one big “but” — they are almost all for Android or Chromebooks. And Baidu has never intended those systems for any real usage. Playing games — fine, browsing — passable (though Firefox 3 on my old PowerPC MacMini with 512 MB RAM gives a much better experience than Chrome on a tablet with 1GB RAM), editing texts (code) — absolute fail. I can live without a numpad on the keyboard (it’s a legacy for accountants and their calculators after all) but not having even a “delete” key (there’s only backspace) is pathetic.

So I live with a faint hope that there will be a computer good enough for me.

Some Information about VoxWare MetaSound

June 5th, 2013

So I’ve looked at the beast again. It seems to be close enough to the original TwinVQ (as in .VQF, not something that got into MPEG-4 Part 3 Subpart 4 Variation 3), so I’ll just try to document spotted differences.

Coding modes. Original TwinVQ had 9 modes, VoxWare has twice as many (and so twice as many codebooks!). One of them is very special (8kHz at 6kbps), with a cutoff of “high” frequencies. Also the mode explicitly signals the number of channels so some modes are stereo-only and some are mono-only.

Bitstream format differences. Bitstream is packed LSB, the first byte is not skipped by the decoder. There’s an additional 2-bit variable right after window type present in many modes (but not 8kHz@6kbps or when short windows are used), my guess is that it has something to do with intensity stereo. The order of some parts seems to be shuffled (i.e. original TwinVQ used p_coef, g_coef, shape order, MetaSound uses p_coef, shape, g_coef order).
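To make “packed LSB” concrete, here is a minimal Python sketch of a bit reader that takes bits starting from the least significant bit of each byte. It is only an illustration of that reading order, not code from any actual decoder; the class name `LSBBitReader` is hypothetical, and the field widths in the test values are arbitrary.

```python
class LSBBitReader:
    """Reads bits from a byte buffer, least significant bit of each byte first."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0  # current position in bits from the start of the buffer

    def read(self, nbits: int) -> int:
        """Read nbits bits; the first bit read becomes bit 0 of the result."""
        if self.pos + nbits > len(self.data) * 8:
            raise EOFError("not enough bits left in the buffer")
        value = 0
        for i in range(nbits):
            byte = self.data[self.pos >> 3]       # byte holding the current bit
            bit = (byte >> (self.pos & 7)) & 1    # bit index counts up from the LSB
            value |= bit << i
            self.pos += 1
        return value
```

Since the first byte is not skipped, such a reader would start at bit 0 of byte 0. A real decoder would read whole words at once instead of going bit by bit, but the ordering is the same.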

Reconstruction. I’m not familiar with TwinVQ much myself but it looks like there are some small differences there as well. For instance, pgain base is 25000 for mono and 20000 for stereo and in bark decoding scales 0.5, 0.4, 0.35 are used instead of 0.4, 0.35, 0.28 (not really sure about that bit).

Any help with the decoder is welcome — the new decoder will reuse most of the current TwinVQ decoder after all, plus new tables (it should take the title of decoder with the biggest tables from the DCA decoder).

A New Month, Some New Goals

June 1st, 2013

As suggested by Anton, it’s the month of overengineered codecs.

The goals are the following (warning: they are subject to change without any notice)

  • work on REing VoxWare MetaSound (the thing aforementioned Anton should have done a long time ago — it is only slightly different from stock TwinVQ decoder after all);
  • make proper ClearVideo decoder, currently it supports I-frames only in AVI and RM (samples in QuickTime are welcome BTW);
  • work on REing Discworld III video format;
  • On2 AVC decoder;
  • make Mike M. reverse engineer On2 VP4;
  • add raw mode for IMC/IAC;
  • work on Indeo 4 B-frames support (yeah, very likely);
  • push G2M4 (aka Go2WatchBoringSlideshows, do not confuse it with Go2BoringEnterpriseEvent codec) decoder.

Sheer Madness

May 22nd, 2013

(luckily there’s not much left of this month of intermediate codecs)

So I’ve looked at another intermediate codec, post title hints at both its name and design. Coding scheme is rather simple: you code lines either in raw form or with prediction (from the left neighbour for the top line or (3 * L + 3 * T - 2 * TL) >> 2 for other lines); prediction error is coded with fixed Huffman codes.
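The predictor is easy to show in a few lines of Python. This is a sketch under assumptions: the Huffman decoding is left out (the errors are taken as already decoded integers), the first sample of each line is passed in as a known value because its handling is not described here, and no clipping or wrapping to the sample range is applied. All function names are hypothetical.

```python
from typing import List


def predict(left: int, top: int, top_left: int) -> int:
    """Predictor for lines other than the top one: (3 * L + 3 * T - 2 * TL) >> 2."""
    # Python's >> on negative numbers rounds towards minus infinity.
    return (3 * left + 3 * top - 2 * top_left) >> 2


def reconstruct_top_line(first: int, errors: List[int]) -> List[int]:
    """Top line: every sample is predicted from its left neighbour."""
    line = [first]
    for err in errors:
        line.append(line[-1] + err)
    return line


def reconstruct_line(first: int, errors: List[int], above: List[int]) -> List[int]:
    """Other lines: predict from left, top and top-left, then add the error."""
    line = [first]
    for x, err in enumerate(errors, start=1):
        line.append(predict(line[x - 1], above[x], above[x - 1]) + err)
    return line
```

The weights sum to four (3 + 3 - 2), so on a flat area the prediction is exact, e.g. `predict(10, 10, 10)` gives 10.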

Simple, right?

Here’s the catch: there is an insane number of formats it supports, both for storage and output and there’s an insane number of decoding functions for decoding format X into format Y.

So quite probably no decoder — not interesting and too tedious.

ProRes alpha support is almost there

May 17th, 2013

I’ve finally brought myself into looking at alpha plane decoding support for ProRes. It was a bit peculiar but rather easy to reverse engineer. Now I only need to update my ConsumerRes decoder to support it.

And that’s probably enough for the month of intermediate codecs.

A Well-designed Intermediate Codec

May 12th, 2013

The adjective is referring to the hype that the company that made this codec is run by designers (unlike some other companies where even design is made by developers or — even worse — marketers). And let’s call it AWIC or iNtermediate codec for short. Let’s not mention its name at all.

It is a rather old codec and it codes 8-bit YUV420 in 16×16 macroblocks with DCT, quantisation and static codes. Frame is divided into slices in such a way that there are no more than 32 slices on one line (and slice height is one macroblock). The main peculiarity is having scalable mode — every macroblock is partitioned into an 8×8 sub-macroblock (i.e. 8×8 luma block and two 4×4 chroma blocks) with the following data for the rest of the block, and this is exploited for decoding frames in half-width, half-height or half-width half-height modes.

Maybe I’ll write a decoder for it after all.

In ten years every codec becomes Op^H^HJPEG

May 11th, 2013

So, RAD has announced Bink 2. While there are no known samples or encoder, the decoder is present in RAD game tools already. For some random reason (what do I have to do with Bink anyway?) I decided to look at it.

Format is probably the same except that preferred extension is .bk2 and it starts with 'KB2f' instead of 'BIKf' or 'BIKi'.
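Telling the two apart by signature could look like the Python sketch below. It checks only the three signatures named above; whether other revision letters exist is not something this sketch claims to know, and the function name `bink_generation` is made up.

```python
from typing import Optional


def bink_generation(header: bytes) -> Optional[int]:
    """Return 2 for a 'KB2f' file, 1 for 'BIKf' or 'BIKi', None for anything else."""
    magic = header[:4]
    if magic == b"KB2f":
        return 2
    if magic in (b"BIKf", b"BIKi"):
        return 1
    return None  # unknown signature or a header shorter than four bytes
```

For example, `bink_generation(open(path, "rb").read(4))` would be enough to pick a decoder, with `path` being whatever file you have at hand.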

The main features they advertise are speed and dual-core decoding support. Most parts of the code are SIMDified indeed and as for dual-core decoding support it seems to be achieved by splitting the frame into top and bottom halves (not that I’ve looked at it closely but strings in the player suggest that).

Now about the format itself. Bink2 operates in YUV 4:2:0 format with optional alpha and employs 8×8 DCT with 16×16 macroblocks. There are not many interesting details in the coding itself: DCs are coded separately before ACs, three quantisation matrices — two for luma/alpha (for intra and inter blocks) and one for chroma, static codes are used for coding them (compare that to the way it was done in Bink Classic), motion compensation is halfpel for luma and quarterpel for chroma now with bicubic interpolation. There are four modes for coding block: intra block, skip block, motion-only block and motion compensation with residue coded.

There seems to be some postprocessing they rightfully call “Blur” but I’m not that sure about it.

What can I say about the codec overall? It’s boring. While Bink 1 is not that fast it was much more fun to RE: coding values in bundles — I’ve rarely seen that (Duck TrueMotion 2 comes to mind and that’s all), various coding techniques — vector quantisation and DCT (as I’ve mentioned above, coding DCT coefficients was rather unique too) and some other tricks (unusual scans, specially coded block difference, double-scaling blocks, etc. etc.).

Overall, Bink2 will probably be what it’s promised to be (fast, portable codec for games) but it won’t have the real spirit of Smacker and Bink design. Or maybe it’s just me getting older.

P.S. I wonder if they will start providing a logo in Bink2 files embedded in the player like they do with Smacker and Bink players.

P.P.S. This post title is inspired by a certain German saying about cars in case it wasn’t obvious.