Archive for the ‘Audio’ Category

AAC: weekly report

Tuesday, July 1st, 2008

I’m working on a psychoacoustic model based on the recommendations in 3GPP TS 26.403. The implementation is very rough, but at least it can produce files at the desired bitrate (not exactly that bitrate, but within ~2 kbps of it).

Now the tasks are to eliminate noise from the encoded material and to add block switching. Maybe window switching as well.
Oh, and commit all of that to FFmpeg SVN.

AAC: going to psychoacoustics

Tuesday, June 24th, 2008

Looks like Gabriel Bouvigne of mp3-tech.org (how much information I got from there!) and LAME fame has taken an interest in the AAC encoder. For now I’m following his advice and trying to implement a psychoacoustic model after the 3GPP TS 26.403 document. It should be simple yet effective enough.

In other news, the AAC decoder is mutating into something fit for inclusion in FFmpeg SVN. I hope that will happen soon. Keep going, Robert, and keep reviewing, Michael!

Update on AAC progress

Monday, June 16th, 2008

If you are interested in what is happening with my encoder, here’s a short report.

Simple encoding works. That means you can encode files with it now, play them back, and recognize the sound. I’ve also separated the psychoacoustic model from the encoder itself, so the encoder calls the model to ask what windowing to use and what scaling and coefficients to encode.
Can I say this concludes the task for this Summer of Code? Technically yes, but there are a few points I ought to finish.
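The split between encoder and model described above could look roughly like the sketch below. It is only an illustration of the idea, not the actual FFmpeg code: `PsyModel`, `BandDecision`, `PassThroughModel` and `encode_frame` are hypothetical names, and the trivial model simply stands in for a real psychoacoustic one.

```python
from dataclasses import dataclass
from typing import List, Protocol

LONG_WINDOW = "long"        # one long block per frame
EIGHT_SHORT = "8-short"     # eight short blocks per frame


@dataclass
class BandDecision:
    scalefactor: int        # quantizer step index chosen for the band (hypothetical field)
    coefs: List[float]      # coefficients the encoder should go on to quantize


class PsyModel(Protocol):
    """What the encoder asks of any model (hypothetical interface)."""

    def choose_window(self, samples: List[float]) -> str: ...

    def analyze(self, bands: List[List[float]]) -> List[BandDecision]: ...


class PassThroughModel:
    """Trivial stand-in: always long windows, one fixed scalefactor, no masking."""

    def __init__(self, scalefactor: int = 0):
        self.scalefactor = scalefactor

    def choose_window(self, samples):
        return LONG_WINDOW

    def analyze(self, bands):
        return [BandDecision(self.scalefactor, list(band)) for band in bands]


def encode_frame(model: PsyModel, samples, bands):
    """The encoder only asks questions; every decision lives in the model."""
    window = model.choose_window(samples)
    decisions = model.analyze(bands)
    return window, decisions
```

With this shape, a better model can be swapped in later without touching the bitstream writer.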

Encoder side:

  • MDCT for cases other than the simple 1024-point window (the eight-short-window sequence and the two transition windows)
  • correct bitstream writing for 8SS case
  • probably multichannel encoding (it’s useless until we have a defined multichannel audio API, though)
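To make the MDCT item concrete, here is a slow, direct-form sketch of a windowed MDCT and its inverse with overlap-add. It is a textbook formulation under my own assumptions (sine window, O(N²) sums, helper names invented for illustration), not the encoder's code. The same functions cover a long block and a short block just by changing `n`; the transition windows are not covered.

```python
import math


def sine_window(length):
    """Sine window; satisfies w[i]^2 + w[i + length/2]^2 = 1."""
    return [math.sin(math.pi * (i + 0.5) / length) for i in range(length)]


def mdct(block, window):
    """Windowed MDCT: 2n input samples -> n coefficients (direct O(n^2) form)."""
    n = len(block) // 2
    return [
        sum(block[i] * window[i]
            * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
            for i in range(2 * n))
        for k in range(n)
    ]


def imdct(coefs, window):
    """Windowed inverse MDCT: n coefficients -> 2n aliased samples."""
    n = len(coefs)
    return [
        2.0 / n * window[i]
        * sum(coefs[k] * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
              for k in range(n))
        for i in range(2 * n)
    ]


def roundtrip(signal, n):
    """Transform 50%-overlapping blocks and overlap-add the results."""
    window = sine_window(2 * n)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n + 1, n):
        block = signal[start:start + 2 * n]
        for i, value in enumerate(imdct(mdct(block, window), window)):
            out[start + i] += value
    return out
```

Samples covered by two overlapping blocks come back exactly, because the aliasing of neighbouring blocks cancels; the first and last `n` samples of the signal do not, since they are covered only once.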

Psychoacoustic model(s) side:

  • good psychoacoustic model 🙂
  • quantizer which allows rate control
  • something else?

I can also add more models after the work is complete, and probably tune them for my ears and the music I like to listen to. Reading the papers I got on psychoacoustic models should help.

Back to work then.

Some progress in AAC encoder

Monday, June 9th, 2008

OK, now I have a simple and not very correct AAC encoder. Because the quantization step is missing (spectral coefficients should be downscaled by a cube root), the resulting AAC is louder and FAAD complains about the quantisation value being too large. FFmpeg’s future AAC decoder just silently clips it. In any case, it produces sound close to the original.
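For reference, here is a minimal sketch of the power-law quantizer involved. An AAC decoder expands each quantized value q to sign(q)·|q|^(4/3) times a scalefactor gain, that is, q multiplied by its own cube root, so the encoder has to apply the inverse 3/4 power first. The function names are mine, the scalefactor offset used in the real bitstream is left out, and the 0.4054 rounding offset is the value commonly seen in encoders; treat it as an assumption here.

```python
import math

ROUNDING = 0.4054  # rounding offset commonly used by AAC encoders (assumption)


def quantize(coef, scalefactor):
    """Encoder side: compress the magnitude with a 3/4 power law."""
    step = 2.0 ** (scalefactor / 4.0)
    q = int((abs(coef) / step) ** 0.75 + ROUNDING)
    return -q if coef < 0 else q


def dequantize(q, scalefactor):
    """Decoder side: expand with the 4/3 power law and apply the gain."""
    step = 2.0 ** (scalefactor / 4.0)
    return math.copysign(abs(q) ** (4.0 / 3.0), q) * step
```

A coefficient of 1000 quantizes to 178 and decodes back to roughly 1001. Writing 1000 into the stream unquantized would make the decoder reconstruct about 10000, which is why the output gets louder and the values overflow.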

Since no psychoacoustics is employed yet, the bitrate is too high (~400 kbps per channel, with no joint-stereo savings either).

So, the plan is to:

  • Fix and optimize bitstream writing (yes, bitstream packing is far from optimal too)
  • Psychoacoustic model (I hope it will be easier than multichannel audio API in FFmpeg)
  • Bitrate control

Back to work…

Year of AAC in FFmpeg

Saturday, May 31st, 2008

I’ve started working on an AAC encoder for FFmpeg. I’ve bricked (=made a dead-tree brick copy of) a bit of the standard (it’s really big) and have written a bit of code too. Hopefully we will have a fully working AAC encoder by the end of summer. It’s time to get rid of the libfaac and libfaad dependencies!

The phrase chosen as the title was coined by Robert Swain, who is working on bringing the GSoC 2006 AAC decoder to FFmpeg and adding SBR support to it.

Audiovisual debugger

Sunday, November 4th, 2007

I had never thought about FFplay that way, but it struck me today that its visual waveform display is one of the best ways to debug audio.
Why?


[Screenshot: FFplay waveform display. One of C.P.E. Bach’s Württemberg sonatas (a small excerpt, really)]

Because it gives you these advantages:

  1. Noise hurts your eyes less than ears
  2. Some inaudible artifacts (like DC bias) are easy to spot
  3. Clipping and volume changes are easy to spot too
  4. Stereo differences are easy to find
  5. It may give you some aesthetic pleasure 😉
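The things the eye picks out in the waveform can also be checked numerically. The sketch below assumes 16-bit signed PCM already loaded into Python lists; `channel_stats` and `stereo_difference` are hypothetical helpers for illustration and have nothing to do with FFplay itself.

```python
def channel_stats(samples, full_scale=32767):
    """Summarize one channel: DC bias, peak level and clipped sample count."""
    count = len(samples)
    dc = sum(samples) / count                       # mean far from 0 means DC bias
    peak = max(abs(s) for s in samples)             # overall volume indicator
    clipped = sum(1 for s in samples if abs(s) >= full_scale)
    return {"dc": dc, "peak": peak, "clipped": clipped}


def stereo_difference(left, right):
    """Largest sample-wise difference between the two channels."""
    return max(abs(l - r) for l, r in zip(left, right))
```

A decoder bug that swaps or duplicates channels shows up as an unexpected `stereo_difference`, and a wrong scaling step shows up in `peak` or `clipped`.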

I must also add that most audio players have visualizers, but they lack the simplicity and usability of this clean 640×480 waveform rendering.

Musepack SV8 is almost ready

Wednesday, September 26th, 2007

Judging from this post, there is an eighth stream version of Musepack (SV8) in beta testing (but the spec is not frozen yet).

What distinguishes it from previous versions? Now it is container-aware. Previous versions just store audio frames in a continuous bitstream, with no defined behavior for seeking or demuxing. It was almost as much fun as the Monkey’s Audio container.

Now Musepack has a chance to spread in the wild (i.e. in containers other than .MPC). That mostly depends on whether there will be a standard way to store it in other containers (that’s where Ogg Vorbis failed). Well, good luck.

Monkey Audio

Sunday, June 10th, 2007

Thanks to Peter Lemenkov, who pointed me to this Monkey’s Audio decoder implementation. It has four strong points: GPL, C, small and clean. Oh, and it takes less memory too.

The only drawback is that old APE files are not supported (but nothing can play them on PPC anyway without an x86 emulator), so I’m eager to see APE support in MPlayer, Xine, VLC (or maybe it’s there already?). Preferably via libavcodec 😉

Why we should have another Monkey’s Audio decoder implementation

Sunday, March 25th, 2007

Why should I bother about Monkey’s Audio? Because many pirates, er, good people offer classical music in this format (FLAC is quite rare and I’ve seen WavPack only once).

What I consider wrong in Monkey’s Audio design:

  • No version compatibility – each version alters the decoding process
  • Huge blocks – some megabytes is huge indeed (WavPack – 64k, FLAC – even less), hence inaccurate seeking and big memory requirements
  • “Insane” profile – if it does not decode in real time on my CPU, it is unusable

What I consider wrong in MA implementation:

  1. There is only one implementation (with two known ports)
  2. It is not endian-safe (both generated WAV headers and < 3.92 decoding)
  3. OO in that case means “Object-Obfuscated” (i.e. too many files where you can’t easily find required code)
  4. Custom license

Maybe during GSoC somebody will write an easily understandable, portable decoder in lavc that will allow playback of .APE in FFplay, MPlayer, VLC, Xine, etc. Otherwise I’ll have to do it myself.

Variety of lossless audio codecs

Saturday, September 23rd, 2006

There are currently 16 lossless audio codecs mentioned on the MultiMedia Wiki page (look here for further links):

  • Proprietary (Apple Lossless, Meridian Lossless Packing, Real Lossless, WMA Lossless)
  • Closed source (LA, LPAC, LTAC, OptimFROG, RK Audio)
  • Open source (Bonk, FLAC, MPEG-4 ALS, Monkey’s Audio, Shorten, TrueAudio, WavPack)

FFmpeg currently has decoders for Bonk, FLAC, Shorten, TrueAudio and Apple Lossless. So at least MPEG-4 ALS, Monkey’s Audio and WavPack decoders can still be added.

I will work on the WavPack decoder and on ALS (I hope the standard will appear soon). What about Monkey’s Audio? Yes, it’s popular, but it poses the following implementation difficulties:

  1. It has incredibly large frame sizes (it may be more than one million samples) while competitors stick around 64k or less (hence the compression gain for MA). Current FFmpeg design cannot handle such frames.
  2. Source code is a mess – for almost every action there are at least several if(ver >= …) or if(ver< ...). Format is too unstable for me.
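To put the frame sizes from the first point into numbers, here is a small back-of-the-envelope sketch. The defaults (44.1 kHz, stereo, 16-bit output) and the reading of “64k” as 65536 samples are my assumptions, and `frame_cost` is just an illustrative helper.

```python
def frame_cost(samples_per_frame, sample_rate=44100, channels=2, bytes_per_sample=2):
    """Return (duration in seconds, decoded buffer size in bytes) for one frame."""
    seconds = samples_per_frame / sample_rate
    buffer_bytes = samples_per_frame * channels * bytes_per_sample
    return seconds, buffer_bytes


for name, size in (("one-million-sample frame", 1_000_000), ("64k frame", 65536)):
    seconds, buffer_bytes = frame_cost(size)
    print(f"{name}: {seconds:.1f} s of audio, {buffer_bytes} bytes decoded")
```

Under these assumptions a one-million-sample frame holds over 22 seconds of audio and needs about 4 MB for its decoded output alone, while a 64k frame holds about 1.5 seconds in 256 KB. Since decoding has to start at a frame boundary, the frame duration is also the seeking granularity.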

Well, I still hope it will be implemented some day.