BTC is bullshit

Don’t get me wrong, the idea behind it is sound, but it got overhyped and misapplied. Of course I’m talking about one of the early attempts at lossy image compression called block truncation coding, what did you think I meant?

For those who forgot about it (and are too lazy to read its description): the method replaces the values in a block with two values, one below and one above the block mean, computed from that mean, the standard deviation and the number of values that were above the mean. This algorithm is often quoted as the one on which many video codecs are built (especially the ones used in various games, but some standard ones like Micro$oft Video 1 and A**le Graphics (SMC) and Video (RPZA) as well). And that part is bullshit which many used to believe, mostly because nobody who heard it took any time to evaluate the claim (I was no exception).
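
To make that description concrete, here is a rough sketch of what a classic BTC encoder does with a single 4×4 greyscale block (the function name, the at-or-above-mean threshold and the rounding/clamping details are my own illustrative choices, the two levels are derived as described above):

    #include <math.h>
    #include <stdint.h>

    /* a rough sketch of classic BTC for a single 4x4 greyscale block:
     * returns the 16-bit pixel mask, *lo and *hi receive the two
     * reconstruction levels derived from block mean and standard deviation */
    static uint16_t btc_encode_block(const uint8_t block[16], uint8_t *lo, uint8_t *hi)
    {
        int sum = 0, sqsum = 0;
        for (int i = 0; i < 16; i++) {
            sum   += block[i];
            sqsum += block[i] * block[i];
        }
        double mean = sum / 16.0;
        double sdev = sqrt(sqsum / 16.0 - mean * mean);

        uint16_t mask = 0;
        int q = 0; /* number of pixels at or above the mean */
        for (int i = 0; i < 16; i++) {
            if (block[i] >= mean) {
                mask |= 1 << i;
                q++;
            }
        }
        if (q == 0 || q == 16) { /* flat block, both levels are simply the mean */
            *lo = *hi = (uint8_t)(mean + 0.5);
            return mask;
        }
        /* the two levels are chosen so that block mean and variance are preserved */
        double a = mean - sdev * sqrt((double)q / (16 - q));
        double b = mean + sdev * sqrt((double)(16 - q) / q);
        *lo = (uint8_t)(a < 0.0   ? 0.0   : a + 0.5);
        *hi = (uint8_t)(b > 255.0 ? 255.0 : b + 0.5);
        return mask;
    }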

Actually I should’ve started doubting it much earlier, as I had tried to apply it to colour quantisation (like in a Video 1 encoder) and failed. The method is applicable to scalar values only (and pixels in our case are vectors of three components; you can map them to greyscale, but how would you calculate two distinct colours to segment the block into?), and its results are worse than using the Linde-Buzo-Gray method for vector quantisation (which was presented in a paper the following year). Wikipedia has an article describing a proper image compression algorithm proposed in 1986 called color cell compression that definitely looks like the perfect candidate for being the base of all the following codecs: it compresses an image by splitting it into 4×4 tiles, grouping pixels in each tile using the average luminance as the discriminator, and calculating the two colours to paint the tile with as the averages of those two groups. That’s how vector quantisation works, and unlike BTC it does not require calculating square roots and can be implemented trivially using integer maths only. So it’s practical, gives better results (in terms of MSE on greyscale images when compared to BTC) and works on actual pixel data.
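
For comparison, a colour cell compression tile encoder fits into a few lines of integer-only code (the RGB struct, the function name and the particular luminance weights are illustrative choices of mine, not something fixed by the original paper):

    #include <stdint.h>

    typedef struct { uint8_t r, g, b; } RGB;

    /* a rough sketch of CCC for a single 4x4 tile: split pixels by average
     * luminance and paint each group with its average colour,
     * no square roots and no floating point needed */
    static uint16_t ccc_encode_tile(const RGB tile[16], RGB *lo, RGB *hi)
    {
        int luma[16], avg = 0;
        for (int i = 0; i < 16; i++) {
            /* integer BT.601-like luminance approximation */
            luma[i] = (77 * tile[i].r + 150 * tile[i].g + 29 * tile[i].b) >> 8;
            avg    += luma[i];
        }
        avg /= 16;

        uint16_t mask = 0;
        int sum_lo[3] = { 0 }, sum_hi[3] = { 0 }, n_hi = 0;
        for (int i = 0; i < 16; i++) {
            if (luma[i] > avg) {
                mask |= 1 << i;
                sum_hi[0] += tile[i].r; sum_hi[1] += tile[i].g; sum_hi[2] += tile[i].b;
                n_hi++;
            } else {
                sum_lo[0] += tile[i].r; sum_lo[1] += tile[i].g; sum_lo[2] += tile[i].b;
            }
        }
        int n_lo = 16 - n_hi;
        /* the minimum luminance can never exceed the rounded-down average, so n_lo > 0 */
        lo->r = sum_lo[0] / n_lo; lo->g = sum_lo[1] / n_lo; lo->b = sum_lo[2] / n_lo;
        if (n_hi) {
            hi->r = sum_hi[0] / n_hi; hi->g = sum_hi[1] / n_hi; hi->b = sum_hi[2] / n_hi;
        } else {
            *hi = *lo; /* uniform tile */
        }
        return mask;
    }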

While BTC was innovative for its time and probably an important stepping stone for further methods, its relevance to modern compression schemes is minimal (unlike colour cell compression), and calling it the base of the codecs built on two-colour vector quantisation is as stupid as calling Cyrano de Bergerac the father of space flight because he mentioned travelling to the Moon on gunpowder rockets in a novel of his.

6 Responses to “BTC is bullshit”

  1. […] of the encoder I used Block Truncation Coding (probably the only possible application of it even if a bit unwise), now I’m simply calculating averages of the values above/below block mean […]

  2. Fabian Giesen says:

    BTC doesn’t really require square roots, so it’s not _that_ bad. The square root terms are sqrt(q / (m - q)) and sqrt((m - q) / q) where m is a constant (block size) and q is the number of pixels above the threshold (popcount of mask in the decoder), so 0 <= q <= m. You can prep a table of the m+1 possible values with a few bits of fixed-point precision and you’re good to go.
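
    For illustration, such a table might be built roughly like this (block size, fixed-point precision and names are picked arbitrarily for the example):

        #include <math.h>
        #include <stdint.h>

        #define BLOCK_SIZE 16 /* m: number of pixels per block */
        #define FIX_BITS    8 /* fixed-point fractional bits */

        /* precomputed sqrt(q / (m - q)) and sqrt((m - q) / q) for every possible q,
         * so the encoder never calls sqrt() at run time */
        static uint32_t sqrt_lo_tab[BLOCK_SIZE + 1];
        static uint32_t sqrt_hi_tab[BLOCK_SIZE + 1];

        static void init_btc_tables(void)
        {
            for (int q = 1; q < BLOCK_SIZE; q++) {
                sqrt_lo_tab[q] = (uint32_t)(sqrt((double)q / (BLOCK_SIZE - q)) * (1 << FIX_BITS) + 0.5);
                sqrt_hi_tab[q] = (uint32_t)(sqrt((double)(BLOCK_SIZE - q) / q) * (1 << FIX_BITS) + 0.5);
            }
            /* q == 0 or q == m means a flat block where both levels simply equal the mean */
            sqrt_lo_tab[0] = sqrt_hi_tab[0] = 0;
            sqrt_lo_tab[BLOCK_SIZE] = sqrt_hi_tab[BLOCK_SIZE] = 0;
        }

    The two levels are then mean - ((sdev * sqrt_lo_tab[q]) >> FIX_BITS) and mean + ((sdev * sqrt_hi_tab[q]) >> FIX_BITS), all in integer maths.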

    I agree that you wouldn't use straight BTC for anything these days, but there is an obvious descendant that is relevant today, which is GPU texture compression schemes. S3TC/DXTC/BC1-7 and ASTC are obvious candidates (ETC is related, but a different branch of the tree). These _do_ bake values into a single scalar in the end by picking a line segment through RGB(A) space per block (or subset of pixels in a block in BC7/ASTC). They generally use 2-4 bits per pixel for the scalar quantizer along that line segment, not 1, but I would argue that's the natural generalization of BTC to multiple dimensions (and higher bit-depth quantizers), and it's definitely still very much practically relevant.
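
    As a toy illustration of that “scalar along a line segment” idea (endpoint selection, which is the actually hard part, is skipped here, and the real BC1 index ordering is ignored; names and types are mine):

        #include <stdint.h>

        typedef struct { uint8_t r, g, b; } RGB;

        /* project each pixel onto the segment between the two endpoint colours
         * and quantize its position to one of four equidistant points (2 bits) */
        static void line_segment_indices(const RGB px[16], RGB c0, RGB c1, uint8_t idx[16])
        {
            int dr = c1.r - c0.r, dg = c1.g - c0.g, db = c1.b - c0.b;
            int len2 = dr * dr + dg * dg + db * db;
            for (int i = 0; i < 16; i++) {
                int pr = px[i].r - c0.r, pg = px[i].g - c0.g, pb = px[i].b - c0.b;
                int dot = pr * dr + pg * dg + pb * db;
                /* position along the segment scaled to 0..3, rounded and clamped */
                int t = len2 ? (dot * 3 + len2 / 2) / len2 : 0;
                if (t < 0) t = 0;
                if (t > 3) t = 3;
                idx[i] = (uint8_t)t;
            }
        }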

  3. Fabian Giesen says:

    Small addendum on the sqrt note: that’s for the encoder, if you stick with the original “basic” formulation. For a decoder you need none of that, really. The original format sends the average and standard deviation with 8 bits each. (The sqrt to get sdev from variance has, again, a narrow enough value range and precision requirements that it’s not hard to table-drive in the encoder.) There’s no reason to do this. You might as well just send the two values a and b in 8 bits directly. (Which gives you a sort of proto-BC4 with 1-bit indices instead of 3-bit ones.)
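
    Indeed, the whole decoder for such a “send a and b directly” variant is a couple of lines (a sketch, not the original format):

        #include <stdint.h>

        /* one bit per pixel selects between the two transmitted levels,
         * nothing else is needed on the decoder side */
        static void btc_decode_block(uint8_t lo, uint8_t hi, uint16_t mask, uint8_t out[16])
        {
            for (int i = 0; i < 16; i++)
                out[i] = ((mask >> i) & 1) ? hi : lo;
        }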

  4. Kostya says:

    From the decoder’s point of view it indeed does not matter how you derived those two values, and the encoder can do whatever it likes, but the plain vector quantisation approach sounds more sensible and straightforward.

    As for texture compression, I wonder where that approach came from (the RPZA codec is probably the first well-known application of the method, but I’m pretty sure there was some prior research). I can agree that you may trace it to BTC, but maybe there was some in-between proposal that laid the foundation (like “okay, let’s take this Color Cell Compression method but use more values instead of just two; hmm, now what if we use a custom axis instead of luminance…”; not necessarily exactly like that, but it’s a possible scenario).

  5. Fabian Giesen says:

    The original S3TC patent (the now-expired, and thus safe to reference, US5956431) cites both BTC and CCC as references for algorithms in the same general class, but if there is any other prior example of the “interpolated colors equidistantly spaced along a line segment” idea, I can’t find it, not in a quick search anyhow. S3TC is certainly what seems to have popularized the idea.

  6. Kostya says:

    The patent is from 1997 – by which time the A**le RPZA codec employing the same approach had already been supported by the open-source XAnim. And the codec itself is from 1991, if you believe Wickedpedia…
