Lossless audio codecs were more advanced than I thought

As I mentioned in a previous post on lossless audio codecs, I wanted to look at some of those that have still not been reverse engineered, for documentation's sake. I did exactly that, so the entries on LA, OptimFROG and RK Audio are no longer stubs and now contain some information on how the codecs work.

If you look at the structure of LA you will see a lot of filters of various sizes and structures, plus an adaptive weight used to select certain parameters. Other lossless audio codecs with high compression and slow decoding, like OptimFROG or Monkey's Audio, show the same picture: several filters of different kinds and sizes layered over each other, plus adaptive weights that are also used in residual coding. Of course that reminded me of AV2 and, more specifically, of neural networks. And what do you know, Monkey's Audio actually calls its longer filters neural networks (hence the name NNFilter.h in the official SDK, and you can spot it in the version history as well, which leaves no doubt that neural networks are exactly what it is named after).
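To make the "layered adaptive filters" idea a bit more concrete, here is a minimal sketch of two cascaded integer predictors whose weights are adapted with a sign-LMS rule. This is only an illustration of the general technique and not the actual filter from LA, OptimFROG or Monkey's Audio: the class name AdaptiveFilter, the filter orders, the shift and the step size are all made up for the example.

```python
def sign(x):
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


class AdaptiveFilter:
    """Integer sign-LMS predictor (hypothetical, not taken from any real codec)."""

    def __init__(self, order, shift=9, step=1):
        self.shift = shift            # fixed-point precision of the weights (assumed)
        self.step = step              # adaptation step size (assumed)
        self.weights = [0] * order
        self.history = [0] * order    # last `order` inputs, oldest first

    def predict(self):
        acc = sum(w * h for w, h in zip(self.weights, self.history))
        return acc >> self.shift

    def update(self, sample, residual):
        # Nudge every weight in the direction that would have reduced the error.
        s = sign(residual)
        if s:
            self.weights = [w + self.step * s * sign(h)
                            for w, h in zip(self.weights, self.history)]
        self.history = self.history[1:] + [sample]


def encode(samples, orders=(32, 8)):
    """Run samples through a cascade of filters; the last residual is what gets coded."""
    stages = [AdaptiveFilter(o) for o in orders]
    out = []
    for x in samples:
        for f in stages:
            r = x - f.predict()
            f.update(x, r)
            x = r                     # the residual feeds the next stage
        out.append(x)
    return out


def decode(residuals, orders=(32, 8)):
    """Undo the cascade in reverse order, mirroring the encoder's state updates."""
    stages = [AdaptiveFilter(o) for o in orders]
    out = []
    for r in residuals:
        for f in reversed(stages):
            x = r + f.predict()
            f.update(x, r)
            r = x                     # the reconstructed input is the previous stage's residual
        out.append(r)
    return out


samples = [((i * 37) % 201) - 100 for i in range(500)]  # toy integer "audio"
assert decode(encode(samples)) == samples               # lossless round trip
```

The point of the sketch is that the decoder never needs the weights transmitted: it adapts them from the already decoded samples in exactly the same way as the encoder, which is why such codecs decode about as slowly as they encode.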

Which leads me to the only possible conclusion: lossless audio codecs were using neural networks for compression before it became mainstream, and it gave them the best compression ratios in their class.

And if we apply all this knowledge to video coding, then maybe in AV4 we’ll finally see some kind of convolution filters processing whole tiles and then the smaller blocks to remove spatial redundancy, maybe with some compaction layers like many neural network designs have (or transforms for the largest possible block size in H.265/AV1/AVS2) and expansion layers (well, what do you think motion interpolation actually does?), plus RNNs to code the residues left after all the prediction.

This entry was posted on Wednesday, September 23rd, 2020 at 3:04 pm and is filed under Lossless Audio, Useless Rants.

3 Responses to “Lossless audio codecs were more advanced than I thought”

Multimedia Mike says:

September 23, 2020 at 4:39 pm

This is tangential, but what approach and tools are you using for binary reverse engineering these days?

Paul says:

September 23, 2020 at 6:03 pm

Every serious, die-hard RE engineer uses radare2, not IDA, not Ghidra or something else.

Kostya says:

September 24, 2020 at 12:34 am

And Ilfak from Hex-Rays claims that experts and professionals use IDA and nothing else.

Good thing I’m not a serious engineer so I use Ghidra. And GDB for debugging, even for binary decoders loaded in MPlayer.