While work on the AAC encoder is slowly progressing (now it’s mostly psychoacoustics left to do, and maybe HE-AAC if somebody convinces me), I’m looking at side tasks to make my life a bit more colourful.
For now those tasks are writing SSE2 optimizations for the Monkey’s Audio decoder (the first piece of SIMD assembly I’ve ever written) and working on the RV40 loop filter.
To give people false hope, it’s more understandable by now. Only one function argument is not obvious. And Dark Shikari, you were wrong – RV40 is 99.5% alike with the H.264 draft (not the 99% you said), as its loop filter is suspiciously similar to the H.264 one.
Again on RV40 loop filter
July 10th, 2008
AAC: Nachrichten pro Woche
July 5th, 2008
Here is this week’s portion of AAC-related news:
- I was working on the psychoacoustic model and fixes for it. Now the encoder should always produce correct files (i.e. decodable without bitstream errors). Sound quality may be low though.
- There was a bug in the MDCT calculation which resulted in a wrong spectrum.
- My test device for AAC has broken 🙁 Where can I find a decent pair of headphones that won’t break that easily? Especially in this country.
And just in case my mentor’s reading this, here are my plans:
- Improve and finish the 3GPP TS 26.403-based psychoacoustic model.
- Implement block switching.
- Add sine windows.
- Sync my encoder with current AAC decoder code (maybe it will be committed by then?)
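For the sine windows item in the plan, a minimal sketch of what such a window looks like. The function name `sine_window` is only illustrative and is not taken from the encoder; the length 2048 assumes the usual AAC long block, where a 2048-sample window yields 1024 MDCT coefficients.

```python
import math

def sine_window(length):
    """Sine window: w[n] = sin(pi * (n + 0.5) / length)."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]

# Long-block window (assumed length: 2048 samples for 1024 coefficients).
window = sine_window(2048)

# The squares of the two overlapping halves sum to one, which is what
# lets the overlap-add after the inverse MDCT cancel the aliasing.
half = len(window) // 2
worst = max(abs(window[n] ** 2 + window[n + half] ** 2 - 1.0) for n in range(half))
```

The same function with a length of 256 would give the short window, again assuming the usual AAC block sizes.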
AAC: weekly report
July 1st, 2008
I’m working on creating a psychoacoustic model from the recommendations presented in 3GPP TS 26.403. The implementation is very rough, but at least it can produce files at roughly the desired bitrate (within about 2 kbps of it).
Now the tasks are to eliminate noise from the encoded material and add block switching. Maybe window switching as well.
Oh, and commit all of that to FFmpeg SVN.
AAC: going to psychoacoustics
June 24th, 2008
Looks like Gabriel Bouvigne of mp3-tech.org (how much information I got from there!) and LAME fame has taken an interest in the AAC encoder. For now I’m following his advice and trying to implement a psychoacoustic model after the 3GPP TS 26.403 document. It should be simple yet effective enough.
In other news: the AAC decoder is mutating to become fit for inclusion into FFmpeg SVN. I hope that will happen soon. Keep going, Robert, and keep reviewing, Michael!
Update on AAC progress
June 16th, 2008
If you are interested in what is happening with my encoder, here’s a brief report.
Simple encoding works. That means you can encode files with it now, play them back, and recognize the sound. I’ve also separated the psychoacoustic model from the encoder itself, so the encoder calls the model to ask what windowing to use and what scaling/coefficients to encode.
Can I say this concludes the task for this Summer of Code? Technically yes, but there are a few points I ought to finish.
Encoder side:
- MDCT for cases other than the simple 1024-point window (the sequence of 8 short windows and the two transition windows)
- correct bitstream writing for the 8SS case
- probably multichannel encoding (it’s useless until we have defined multichannel audio API though)
Psychoacoustic model(s) side:
- good psychoacoustic model 🙂
- a quantizer which allows rate control
- something else?
I can add more models after the work is complete too, and probably tune them for my ears and the music I like to listen to. Reading the papers I got on psychoacoustic models should help.
Back to work then.
Some progress in AAC encoder
June 9th, 2008
OK, now I have a simple and not very correct AAC encoder. Because the quantization step is missing (spectral coefficients should be compressed with the 3/4 power law, i.e. each one is scaled down by the cube root of its quantized value), the resulting AAC becomes louder and FAAD complains about the quantization value being too large. FFmpeg’s future AAC decoder just silently clips it. In any case, it produces sound close to the original.
Since no psychoacoustics is employed for now, the bitrate is too high (~400 kbps per channel, with no joint stereo savings either).
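A minimal sketch of the quantization step in question, assuming the standard AAC power-law quantizer: a scalefactor offset of 100, a rounding constant of 0.4054 and a largest allowed quantized magnitude of 8191. The function and constant names are illustrative only and are not taken from the encoder.

```python
import math

SF_OFFSET = 100    # assumed AAC scalefactor offset
ROUNDING = 0.4054  # assumed rounding constant of the reference quantizer
MAX_QUANT = 8191   # assumed largest quantized magnitude a decoder accepts

def quantize(coef, scalefactor):
    """Quantize one spectral coefficient with the 3/4 power law."""
    step = 2.0 ** (0.25 * (scalefactor - SF_OFFSET))
    magnitude = int((abs(coef) / step) ** 0.75 + ROUNDING)
    return int(math.copysign(magnitude, coef))

def dequantize(value, scalefactor):
    """Inverse operation performed by the decoder (4/3 power law)."""
    step = 2.0 ** (0.25 * (scalefactor - SF_OFFSET))
    return math.copysign(abs(value) ** (4.0 / 3.0), value) * step

# A coefficient of 100000 written without the power law would exceed
# MAX_QUANT; with it the quantized value stays within range.
q = quantize(100000.0, SF_OFFSET)
```

Writing the coefficient itself instead of `quantize(coef, sf)` makes the decoder apply the 4/3 power to an already large number, which would explain both the louder output and the complaint about too large a value.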
So, the plan is to:
- Fix and optimize bitstream writing (yes, bitstream packing is far from optimal too)
- Psychoacoustic model (I hope it will be easier than multichannel audio API in FFmpeg)
- Bitrate control
Back to work…
Still in Memphis…
June 4th, 2008
Although it’s no good to discuss the work of governmental establishments (or to kick a dead horse), I still want to cry about the total ineffectiveness of our customs.
Today I received the stored value card Google sent on May 27th. They also sent a book, on May 15th, and it’s still not here.
Here is the log – see for yourselves: .
And something stops me from believing the customs officers are taking their time reading that book.
Last year a package sent by Mike took a whole month’s rest there. Not to mention the $10 fee for those 12 recorded DVD+Rs.
Surprisingly, the x86 box I recently got was delivered in a week by Express Mail or something similar.
Good system architects optimize system performance by removing the biggest and most probable delay. If delivering a package takes a week and clearing customs takes at least two, then which is the weakest, slowest and ugliest link? Where can I submit a patch to this process?
P.S. For those who don’t know the title origin – look here.
Year of AAC in FFmpeg
May 31st, 2008
I’ve started working on an AAC encoder for FFmpeg. I’ve bricked (=made a dead-tree brick copy of) a bit of the standard (it’s really big) and have written a bit of code too. Hopefully we will have a fully working AAC encoder by the end of summer. It’s time to get rid of the libfaac and libfaad dependencies!
The phrase chosen as the title was coined by Robert Swain, who works on bringing the GSoC 2006 AAC decoder to FFmpeg and adding SBR support to it.
VC-1 test source
May 23rd, 2008
To my great surprise, there are people working with the reference decoder and asking for files that have certain features and are digestible by it. Well, here is the ultimate answer – an RCV muxer for FFmpeg (hey, writing a muxer and patching the FFmpeg build system were two things I hadn’t done yet).
You can get the FFmpeg sources, apply this patch, compile and try
ffmpeg -i file.wmv -vcodec copy out.rcv
Good luck!
BIG FAT WARNING. It is not guaranteed to work with the reference decoder on all files. Try files with smaller dimensions.
Or hack the reference decoder to ignore image dimensions (ffplay plays those files correctly, while the reference decoder complains about something like “image size is too big for this level”).
RV: present state
May 18th, 2008
If you are interested in what’s going on with my RV decoder from GSoC 2007, here are your answers.
What works:
- RV30 decoding mostly works
- RV40 decoding mostly works
- Pictures are quite recognizable
What needs to be resolved:
- RV40 loop filter
- RV30 loop filter (a bit easier)
- RV30 motion vectors in B-frames (sometimes they are a bit jumpy)
- RV30 chroma problems (colours are always moving to the upper left corner of the frame – incorrect rounding?)
- RV30 slice merging problem (some split slices should be merged by the decoder – at least I know how and when to do this)
If you want to help with the loop filter, then look at the loop filter work scheme (SVG, ~128 KB) and give your proposals on how it works.
Legend (a macroblock is 4×4 subblocks, no borders as they would ruin this scheme):
- numbers at the top and left edge – macroblock numbers
- black lines – subblock edges where loop filtering took place
- hex number at the top left corner of a macroblock – coded block pattern; it’s red for intra-type macroblocks and for P macroblocks with DC coeffs coded separately
- blue square – coded subblock
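To make the legend concrete, here is a small sketch of how a coded block pattern could be expanded into the 4×4 grid of coded subblocks. The bit order is purely an assumption for illustration (bit 0 is the top-left subblock, bits run in raster order over the low 16 bits), and `coded_subblocks` is a hypothetical helper, not a function from the decoder.

```python
def coded_subblocks(cbp):
    """Expand the low 16 bits of a coded block pattern into a 4x4 grid.

    Assumption: bit (4 * row + col) flags the subblock at (row, col).
    """
    return [[bool((cbp >> (4 * row + col)) & 1) for col in range(4)]
            for row in range(4)]

def render(cbp):
    """Draw the grid as text, '#' for a coded subblock (a blue square)."""
    return "\n".join("".join("#" if coded else "." for coded in row)
                     for row in coded_subblocks(cbp))

# Under the assumed bit order, 0x0033 marks the top-left 2x2 subblocks.
picture = render(0x0033)
```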
Any suggestions (and pointers to information about H.264 loop filtering explained more clearly than in the standard) are welcome.
I’d like to finish it before starting my work on the AAC encoder…
BTW, you can use ffmpeg-rv.patch from the soc/rv40 repository to enable RV30/40 decoding in ffmpeg.