General psychoacoustic <-> coding interaction principles

Thursday, March 5th, 2009

OK, let’s suppose we have some abstract subband coder. What does it do? It performs some transform on an input block of data (like MDCT or a QMF filterbank), then the obtained frequency coefficients are grouped, quantized and coded.
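To make that pipeline concrete, here is a minimal sketch of the transform, group and quantize steps. It is a toy under stated assumptions, not any real codec: the function names, the fixed band size and the single uniform quantizer step are all made up for illustration, there is no windowing, and the final entropy-coding step is left out.

```python
import math

def mdct(block):
    """Naive O(N^2) MDCT: 2N input samples -> N coefficients (no window)."""
    n_half = len(block) // 2
    return [
        sum(x * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for n, x in enumerate(block))
        for k in range(n_half)
    ]

def group_into_bands(coeffs, band_size):
    """Split coefficients into equal-width bands (real coders use uneven widths)."""
    return [coeffs[i:i + band_size] for i in range(0, len(coeffs), band_size)]

def quantize_band(band, step):
    """Uniform scalar quantizer: a larger step means coarser values and fewer bits."""
    return [round(c / step) for c in band]

def encode_block(block, band_size=4, step=0.5):
    """Transform -> group -> quantize; the result would then go to an entropy coder."""
    bands = group_into_bands(mdct(block), band_size)
    return [quantize_band(band, step) for band in bands]
```

Calling `encode_block` on 16 samples yields 8 coefficients split into 2 bands of 4 integers each. In a real coder the step would differ per band, which is exactly where the psychoacoustic model comes in.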

There could be many approaches but usually there are two general principles employed:

  • Some frequencies matter more than others.
  • Energy carried by subbands matters too.

The psychoacoustic model gives us a list of subband weights that express their importance. Now what can the encoder do with them? It still has to quantize the input data and code it, and there are three approaches:

  1. Perform optimal coding using psychoacoustic data (good but slow)
  2. Do some heuristics to get some quick and dirty approximation (most popular approach)
  3. Ignore psychoacoustics completely (seems to be popular too)

Optimal coding may be done by employing the Viterbi algorithm in one form or another. Heuristics usually work this way: start with some initial prediction for the quantizer, then refine it a bit until the result is close enough to the desired one.
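The heuristic route can be sketched as a small search loop. Everything here is an assumption made for illustration: the "desired result" is taken to be a bit budget, `estimate_bits` is a made-up cost model (roughly an Exp-Golomb-style code length), and the initial guess and search bounds are arbitrary. Real encoders use their own cost tables and usually fold the psychoacoustic weights into the decision.

```python
import math

def quantize(coeffs, step):
    """Uniform scalar quantizer."""
    return [round(c / step) for c in coeffs]

def estimate_bits(quantized):
    """Hypothetical cost model: about 2*bit_length + 1 bits per value."""
    return sum(2 * abs(q).bit_length() + 1 for q in quantized)

def find_step(coeffs, target_bits, tolerance=4, max_iters=32):
    """Guess a quantizer step, then refine it until the cost is near the budget."""
    peak = max((abs(c) for c in coeffs), default=0.0)
    if peak == 0.0:
        return 1.0                       # silence: any step works
    lo, hi = peak / 65536.0, peak * 4.0  # at hi every value quantizes to zero
    step = peak / 16.0                   # initial prediction (arbitrary guess)
    best = hi                            # cheapest fallback if nothing else fits
    for _ in range(max_iters):
        bits = estimate_bits(quantize(coeffs, step))
        if bits > target_bits:
            lo = step                    # over budget: need a coarser step
        else:
            best = step                  # fits: remember it
            if target_bits - bits <= tolerance:
                break                    # close enough to the budget
            hi = step                    # far under budget: try a finer step
        step = math.sqrt(lo * hi)        # bisect in the log domain
    return best
```

The returned step always fits the budget as long as the budget allows at least one bit per coefficient under this cost model. A better initial prediction (for example one derived from the band energy) simply means fewer refinement iterations.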

More on AAC-specific coding later.