NihAV — A New Decoder « Kostya's Boring Codec World

NihAV — A New Decoder

After a lot of procrastination I’ve finally more or less completed decoder for I.263 (Intel version of H.263) in NihAV.

It can decode I-, P- and PB-frames quite fine (though B-frames have some artefacts) and deblock them too (except B-frames because I’m too lazy for that). Let’s have a look at the overall structure of the decoder.

Obviously I’ve tried to make it modular but not exceeding the needs of H.263 decoder (i.e. I’m not going to extend the code to make it work with MPEG-2 part 2 and similar though some code might be reused), so it’s been split into several modules. Here’s a walk on all modules and their functionality review.

codecs::blockdsp

This module has three public functions:

pub fn put_blocks(buf: &mut NAVideoBuffer<u8>, xpos: usize, ypos: usize, blk: &[[i16;64]; 6]) 
pub fn add_blocks(buf: &mut NAVideoBuffer<u8>, xpos: usize, ypos: usize, blk: &[[i16;64]; 6])
pub fn copy_blocks(dst: &mut NAVideoBuffer<u8>, src: &NAVideoBuffer<u8>,
                   dx: usize, dy: usize, sx: isize, sy: isize, bw: usize, bh: usize,
                   preborder: usize, postborder: usize,
                   mode: usize, interp: &[fn(&mut [u8], usize, &[u8], usize, usize, usize)])

One puts blocks on the framebuffer, another one adds them to the already present pixels and the third one does motion compensation by checking the source block position, filling missing data if block is (partly) located past the edge and then invoking the actual interpolation function.

codecs::h263

This is the core module for H.263-related decoding, the root module contains some declarations that are used by the decoder:

pub trait BlockDecoder {
    fn decode_pichdr(&mut self) -> DecoderResult<PicInfo>;
    fn decode_slice_header(&mut self, pinfo: &PicInfo) -> DecoderResult<Slice>;
    fn decode_block_header(&mut self, pinfo: &PicInfo, sinfo: &Slice) -> DecoderResult<BlockInfo>;
    fn decode_block_intra(&mut self, info: &BlockInfo, quant: u8, no: usize, coded: bool, blk: &mut [i16; 64]) -> DecoderResult<()>;
    fn decode_block_inter(&mut self, info: &BlockInfo, quant: u8, no: usize, coded: bool, blk: &mut [i16; 64]) -> DecoderResult<()>;
    fn is_slice_end(&mut self) -> bool;

    fn filter_row(&mut self, buf: &mut NAVideoBuffer<u8>, mb_y: usize, mb_w: usize, cbpi: &CBPInfo);
}

This is the interface for codec-specific functions. The main decoder core is provided with an instance of it and calls it for the bitstream parsing while the main logic is contained in codecs::h263::decoder. I wanted to make it as stateless as possible (i.e. in the best case it simply contains bitstream reader and all other information is passed through *Info structures).

pub enum Type { I, P, Skip, Special }
pub struct PicInfo { ... }
pub struct PBInfo { ... }
pub struct Slice { ... }
pub struct BlockInfo { ... }
pub struct BBlockInfo { ... }
pub struct MV { x: i16, y: i16, }
pub struct CBPInfo { ... }

The structures used to pass information between bitstream reading part and the main decoder: picture information, current slice information, current block information plus the option B-part of such. Also there’s CBPInfo which stores information about which blocks were coded and with what quantiser for the last two rows—this information is used during the deblocking.

codecs::h263::code

pub fn h263_idct(blk: &mut [i16; 64])
pub const H263_INTERP_FUNCS: &[fn(...
pub const H263_INTERP_AVG_FUNCS: &[fn(...

H.263 specific functions: IDCT and halfpel motion compensation (put and average).

codecs::h263::data

This module contains various tables used by H.263-based decoders, mostly codebook description for e.g. CBP, MV or coefficient codes.

codecs::h263::decoder

The core for H.263-based decoders implemented in pub struct H263BaseDecoder { } with the following functions:

impl H263BaseDecoder {
    pub fn new() -> Self { ... }
    pub fn is_intra(&self) -> bool { self.ftype == Type::I }
    pub fn get_dimensions(&self) -> (usize, usize) { (self.w, self.h) }
    pub fn parse_frame(&mut self, bd: &mut BlockDecoder) -> DecoderResult<NABufferType> { ... }
    pub fn get_bframe(&mut self) -> DecoderResult<NABufferType> { ... }
}

Yes, that’s it. parse_frame() decodes a frame when possible plus saves B-frame data in an array of B-frame macroblock descriptors (motion vectors plus restored coefficients) so that later it can render B-frame in get_bframe(). Plus it has struct MVInfo {...} that holds motion vectors for the current macroblock row plus bottom part of the previous row. Luckily there are no standalone B-frames in H.263 so I don’t need to keep more information than that.

Internally, the decoder core simply calls codec bitstream parser via provided BlockDecoder interface, reconstructs motion vectors, performs motion compensation and outputs blocks (plus in-loop filtering, plus preparing B-frame information for later reconstruction).

codecs::h263::intel263

The module that actually implements I.263-specific bits. Here you have Intel263Decoder implementing NADecoder and see how its decoding function looks like:

    fn decode(&mut self, pkt: &NAPacket) -> DecoderResult<NAFrameRef> {
        let src = pkt.get_buffer();

        if src.len() == 8 {
            let bret = self.dec.get_bframe();
            let buftype;
            let is_skip;
            if let Ok(btype) = bret {
                buftype = btype;
                is_skip = false;
            } else {
                buftype = NABufferType::None;
                is_skip = true;
            }
            let mut frm = NAFrame::new_from_pkt(pkt, self.info.clone(), buftype);
            frm.set_keyframe(false);
            frm.set_frame_type(if is_skip { FrameType::Skip } else { FrameType::B });
            return Ok(Rc::new(RefCell::new(frm)));
        }
        let mut ibr = Intel263BR::new(&src, &self.tables);

        let bufinfo = self.dec.parse_frame(&mut ibr)?;

        let mut frm = NAFrame::new_from_pkt(pkt, self.info.clone(), bufinfo);
        frm.set_keyframe(self.dec.is_intra());
        frm.set_frame_type(if self.dec.is_intra() { FrameType::I } else { FrameType::P });
        Ok(Rc::new(RefCell::new(frm)))
    }

and here’s the decoder structure definition:

struct Intel263Decoder {
    info:    Rc<NACodecInfo>,
    dec:     H263BaseDecoder,
    tables:  Tables,
}

This decoder simply initialises all tables needed for the decoding (that are later passed to an instance of Inter263BR that implements BlockDecoder and parses the actual stream) plus the instance of H.263 base decoder that actually does all the heavy work. I could’ve implemented e.g. Sorenson Spark (aka FLV1) decoder in the same way (by adding the new file very similar to src/codecs/h263/intel263.rs only with FLV1-specific bitstream parsing) but I’ll leave that to rust-av, they have FLV demuxer after all.

Actually I don’t plan writing any other H.263-based decoder though I might do WMV3 later that will support beta streams properly. Indeo 4 and 5 are probably next and IAC/IMC. Or something completely different, it’s not like I have a roadmap to follow.

P.S. On the longest sample I have (320×240, 123 seconds) it decodes all frames in 2 seconds while avconv does it in 0.94 seconds (1.5 seconds with -cpuflags 0), so I’m quite fine with the result.

This entry was posted on Saturday, June 10th, 2017 at 8:36 am and is filed under NihAV. You can follow any responses to this entry through the RSS 2.0 feed. Both comments and pings are currently closed.

One Response to “NihAV — A New Decoder”

Luca Barbato says:

June 10, 2017 at 11:58 am

Neat 🙂 It is closer to speed parity that I’d expect 🙂