Indeed, my expectations of completing at least field-interlaced VC-1 support can be represented as a (possibly modulated) sine wave.
So here’s a short log of my mood during my attempts at implementing it:
- initial state — no feelings at all
- discovered old unfinished patch — mood goes up
- tried to make it decode more than one field — fail, mood goes down
- found out that first frame is actually composed of I and P field — mood goes up
- looked at decoded picture — mood goes down
- “that P-frame structure differs a bit but that’s all” — mood goes up
- read about the actual motion compensation routine for interframes and the related bitreading. Can you guess the consequences?
Some may argue this would be better represented by a triangle or sawtooth wave though.
Seriously, now I understand why some people consider interlaced video to be evil. First of all, it’s an artefact of the bandwidth-limited era. Thus it adds unnecessary complexity to modern progressive systems.
I’m pretty sure there are people who will cry when they hear about interlaced coding and coded field order. There may be people who wince at the word “telecine”. There may be implementers of decoders for the H.264 interlaced modes (yes, several of them; MBAFF seems to be the most popular). Probably I’ll join one of those groups too.
Seriously, I consider adding interlaced mode (at least to some codecs) an offence against humanity.
I don’t see why interlaced decoding must differ from the progressive one that much. Okay, we have two fields and we may need to select a better reference for one of them. No problem. Select two references for motion vector prediction (which is described as several pages of blah-blah-code, yes, that comprehensible)? Gentlemen, include me out!
To make things worse, they decided to complicate motion vector encoding along with prediction. Honestly, one would expect field MVs to be smaller, since fields have half of the original picture height; in reality there is an additional group of bits read to augment the motion vector. Why?
And a bit of icing. FFmpeg does not seem to be well adapted for interlaced decoding. For instance, who knew that you should use picture->linesize[0] instead of mpegenccontext->linesize, because the former will be used in calculating offsets for the current block data, and if you set mpegenccontext->picture_structure to, say, PICT_TOP_FIELD, it will modify something for you? Not to mention how to deal with the motion block data needed for reference (I honestly have no idea how well it will work).
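To show why the two line sizes end up different, here is a minimal sketch in plain C. It assumes the usual layout where both fields are interleaved line by line in one frame buffer, so stepping to the next line of the same field means skipping one frame line. Everything in it (field_parity, field_view, get_field, block_ptr) is a hypothetical name made up for illustration, not FFmpeg API, and it does not claim to reproduce what FFmpeg does internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical field selector; these are NOT FFmpeg's constants. */
enum field_parity { FIELD_TOP = 0, FIELD_BOTTOM = 1 };

/* A view of one field inside a line-interleaved frame buffer. */
struct field_view {
    uint8_t  *data;     /* first sample of the field's first line */
    ptrdiff_t linesize; /* bytes between two consecutive lines of this field */
    int       height;   /* number of lines in this field */
};

/* Build a field view from frame-level parameters (assumed layout:
 * even frame lines belong to the top field, odd ones to the bottom). */
static struct field_view get_field(uint8_t *frame, ptrdiff_t frame_linesize,
                                   int frame_height, enum field_parity parity)
{
    struct field_view v;

    assert(parity == FIELD_TOP || parity == FIELD_BOTTOM);
    /* The bottom field starts one frame line into the buffer. */
    v.data     = frame + (parity == FIELD_BOTTOM ? frame_linesize : 0);
    /* Lines of the same field are two frame lines apart. */
    v.linesize = frame_linesize * 2;
    /* For an odd frame height the top field gets the extra line. */
    v.height   = (frame_height + (parity == FIELD_TOP ? 1 : 0)) / 2;
    return v;
}

/* Address of the sample at (x, y), both in field coordinates. */
static uint8_t *block_ptr(const struct field_view *v, int x, int y)
{
    return v->data + y * v->linesize + x;
}
```

Computing a block offset with the frame line size while the picture is treated as a field (or the other way round) lands on a line of the wrong field, which is the kind of mistake the two similarly named members invite.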
Thus, I invite somebody able and fearless to finish this task. I don’t have any samples that interest me (for reference, at its best my DVD collection was around two or three discs, guess the number of Blu-Rays I own) and I have found better ways to spend my time (and probably more interesting codecs as well).
P.S. Moving the VDPAU chunks to a dedicated AVHWAccel is also needed and is trivial even for somebody without deep FFmpeg knowledge.