Voxware Codecs and Tags

Saturday, August 10th, 2013

If you look at the registry of WAV formats you can see this:

0x0070 WAVE_FORMAT_VOXWARE_AC8 Voxware, Inc.
0x0071 WAVE_FORMAT_VOXWARE_AC10 Voxware, Inc.
0x0072 WAVE_FORMAT_VOXWARE_AC16 Voxware, Inc.
0x0073 WAVE_FORMAT_VOXWARE_AC20 Voxware, Inc.
0x0074 WAVE_FORMAT_VOXWARE_RT24 Voxware, Inc.
0x0075 WAVE_FORMAT_VOXWARE_RT29 Voxware, Inc.
0x0076 WAVE_FORMAT_VOXWARE_RT29HW Voxware, Inc.
0x0077 WAVE_FORMAT_VOXWARE_VR12 Voxware, Inc.
0x0078 WAVE_FORMAT_VOXWARE_VR18 Voxware, Inc.
0x0079 WAVE_FORMAT_VOXWARE_TQ40 Voxware, Inc.
0x007A WAVE_FORMAT_VOXWARE_SC3 Voxware, Inc.
0x007B WAVE_FORMAT_VOXWARE_SC3 Voxware, Inc.
0x0081 WAVE_FORMAT_VOXWARE_TQ60 Voxware, Inc.

In reality there’s one codec with several variations (MetaSound) and a family of low-bitrate MetaVoice codecs. It doesn’t really matter which ID you use: the codec extradata contains the real tag used to distinguish one codec from another. That’s why the 0x0075 format can be reserved for the Voxware RT29 speech codec but used by MetaSound instead.
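To illustrate the point about ignoring the format ID, here is a minimal Python sketch that pulls a tag out of raw extradata. The post does not say where in the extradata the tag lives, so this sketch simply scans for the first four-byte ASCII sequence that looks like one of the tags; the function name find_voxware_tag and the scanning approach are assumptions made for illustration, not the way any real decoder is known to do it.

```python
import re

# Assumption: the tag is stored as four plain ASCII bytes somewhere in the
# extradata. The pattern covers the VOX? and VX01..VX04 forms only.
_TAG_RE = re.compile(rb"VOX[A-Za-z]|VX0[1-4]")


def find_voxware_tag(extradata: bytes):
    """Return the first Voxware-style tag found in extradata, or None."""
    match = _TAG_RE.search(extradata)
    return match.group(0).decode("ascii") if match else None
```

A real demuxer would read the tag from a fixed offset once that offset is known; the scan is only a stand-in for that missing detail.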

Here’s the list of internal tags:

  • VOXa — MetaVoice RT24, 8 kHz, mono, 2.4kbps
  • VOXb — MetaVoice VR12, 8 kHz, mono, 1.2kbps (variable bitrate)
  • VOXc — MetaVoice VR15, 8 kHz, mono, 2.4kbps (variable bitrate)
  • VOXg — MetaVoice RT29HQ, 8 kHz, mono, 2.98kbps (called high-quality for some reason)
  • VOXh — MetaVoice RT28, 8 kHz, mono, 2.8kbps
  • VOXi — MetaSound AC08, 8 kHz, mono, 8kbps
  • VOXj — MetaSound AC10, 11 kHz, mono, 10kbps
  • VOXk — MetaSound AC16, 16 kHz, mono, 16kbps
  • VOXL — MetaSound AC24, 22 kHz, mono, 24kbps
  • VOXq-VOXz — MetaSound mono and stereo, various formats
  • VX01 — MetaVoice SC3, 8 kHz, mono, 3.2kbps (embedded)
  • VX02 — MetaVoice SC6, 8 kHz, mono, 6.4kbps (embedded)
  • VX03 — MetaSound, 8 kHz, mono, 6kbps
  • VX04 — MetaSound, 8 kHz, stereo, 12kbps
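The list above maps naturally onto a lookup table. The following Python sketch copies the entries as written (sample rates are kept as the nominal labels from the list, not exact values in Hz); the names VOXWARE_TAGS and describe are hypothetical, and the VOXq-VOXz range is handled generically because the individual formats are not listed.

```python
# Tag -> (family, mode, nominal sample rate, channels, bitrate in kbps).
# Values are copied from the list above; mode is None where none is given.
VOXWARE_TAGS = {
    "VOXa": ("MetaVoice", "RT24", "8 kHz", "mono", "2.4"),
    "VOXb": ("MetaVoice", "VR12", "8 kHz", "mono", "1.2"),
    "VOXc": ("MetaVoice", "VR15", "8 kHz", "mono", "2.4"),
    "VOXg": ("MetaVoice", "RT29HQ", "8 kHz", "mono", "2.98"),
    "VOXh": ("MetaVoice", "RT28", "8 kHz", "mono", "2.8"),
    "VOXi": ("MetaSound", "AC08", "8 kHz", "mono", "8"),
    "VOXj": ("MetaSound", "AC10", "11 kHz", "mono", "10"),
    "VOXk": ("MetaSound", "AC16", "16 kHz", "mono", "16"),
    "VOXL": ("MetaSound", "AC24", "22 kHz", "mono", "24"),
    "VX01": ("MetaVoice", "SC3", "8 kHz", "mono", "3.2"),
    "VX02": ("MetaVoice", "SC6", "8 kHz", "mono", "6.4"),
    "VX03": ("MetaSound", None, "8 kHz", "mono", "6"),
    "VX04": ("MetaSound", None, "8 kHz", "stereo", "12"),
}


def describe(tag):
    """Return a human-readable description for a tag, or None if unknown."""
    entry = VOXWARE_TAGS.get(tag)
    if entry is not None:
        family, mode, rate, channels, kbps = entry
        name = f"{family} {mode}" if mode else family
        return f"{name}, {rate}, {channels}, {kbps} kbps"
    # VOXq..VOXz are MetaSound too, but their parameters are not listed.
    if len(tag) == 4 and tag.startswith("VOX") and "q" <= tag[3] <= "z":
        return "MetaSound, mono or stereo, format not listed individually"
    return None
```

For example, describe("VOXi") gives "MetaSound AC08, 8 kHz, mono, 8 kbps", while an unrecognised tag gives None.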

So maybe RT29 does not exist and should be RT28 instead; obviously RT29HW is a typo for RT29HQ, and the second SC3 in the registry should be SC6 (unfortunately there’s no information about TQ40/TQ60). But who is going to correct the WAVE formats list because of mere facts?

P.S. It would be nice to receive samples for all MetaSound modes (the encoder is still available and should work on older Windows systems).