_WSJ HTK-based setup_
Experimental setups for AURORA2 and AURORA3 were based on HTK, but
the current release of the AURORA4 setup is based on ISIP software. At the
Chania (Crete) meeting of the HIWIRE project this topic was addressed and,
after some discussion about decoding times and script complexity, it
was decided to evaluate alternatives based on HTK or SONIC. TUC also
proposed the use of lattices to speed up decoding.
After some work on AURORA4 evaluations with both ISIP and SONIC
software, UGR decided to build an HTK-based evaluation framework for
AURORA4 so that a final decision could be taken about the “best”
alternative. A preliminary version of this framework was built along
the lines of the one distributed for the AURORA4 evaluations based on
ISIP: cross-word, tree-based tied-state triphones for the acoustic
models and a back-off bigram language model.
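As an illustration of what "cross-word" means for the triphone models, the following minimal Python sketch expands a word-level phone sequence into context-dependent labels written in the HTK `l-p+r` notation. The function name `to_triphones`, the example pronunciations and the absence of any special handling for silence models are assumptions made for this example; it is not taken from the distributed scripts.

```python
def to_triphones(words, cross_word=True):
    """Expand per-word phone lists into l-p+r style triphone labels.

    words: list of words, each given as a list of phone names.
    cross_word: if True, phone context extends across word boundaries;
                if False, context is cut at every word boundary.
    """
    if cross_word:
        # Treat the whole utterance as a single phone string.
        units = [[p for w in words for p in w]]
    else:
        # Expand every word independently (word-internal triphones).
        units = words

    labels = []
    for phones in units:
        for i, p in enumerate(phones):
            left = phones[i - 1] + "-" if i > 0 else ""
            right = "+" + phones[i + 1] if i + 1 < len(phones) else ""
            labels.append(left + p + right)
    return labels


if __name__ == "__main__":
    utterance = [["dh", "ah"], ["k", "ae", "t"]]  # hypothetical pronunciations
    print(to_triphones(utterance, cross_word=True))
    print(to_triphones(utterance, cross_word=False))
```

With cross-word expansion the last phone of one word and the first phone of the next see each other as context, which increases the number of distinct models and is one reason why state tying is needed.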
_Some results using the HTK setup_
Parameters (39): 12 MFCCs plus C0, with CMS, plus delta and acceleration
coefficients (MFCC_0_D_A_Z)
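The figure of 39 parameters follows from the qualifiers in the parameter kind: 12 cepstra plus C0 give 13 static coefficients, and the delta and acceleration streams triple that number, while `_Z` (cepstral mean subtraction) leaves the dimension unchanged. The small Python sketch below reproduces this arithmetic; the helper name `htk_feature_dim` is hypothetical, and only the `_0`, `_E`, `_D`, `_A` and `_T` qualifiers are taken into account.

```python
def htk_feature_dim(target_kind, num_static=12):
    """Return the vector size implied by an HTK-style parameter kind.

    target_kind: string such as "MFCC_0_D_A_Z".
    num_static:  number of base cepstral coefficients (12 in this setup).
    Qualifiers that do not change the dimension (e.g. _Z) are ignored.
    """
    _base, *qualifiers = target_kind.split("_")
    quals = set(qualifiers)

    # _0 appends C0 and _E appends log energy to the static vector.
    static = num_static + int("0" in quals) + int("E" in quals)

    # _D, _A and _T each append one more copy of the static vector
    # (first, second and third order regression coefficients).
    streams = 1 + int("D" in quals) + int("A" in quals) + int("T" in quals)

    return static * streams


if __name__ == "__main__":
    print(htk_feature_dim("MFCC_0_D_A_Z"))
```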
Training clean models from scratch takes only 3 h 52 min on a 2.66 GHz
Pentium 4 (the training set contains 15 h of speech data).
Decoding the 166-utterance test sets (1231 s of speech) on a 2.66 GHz
Pentium:
test_01 (clean data)
- Word Error Rate = 13.22% Decoding Time = 3428 s (2.78 xRT)
test_02 (car noise)
- Word Error Rate = 24.68% Decoding Time = 8002 s (6.50 xRT)
test_03 (babble noise)
- Word Error Rate = 46.00% Decoding Time = 13747 s (11.17 xRT)
_Same results for the ISIP setup_
Parameters (39): Standard FE on a 2.66 GHz Pentium
test_01 (clean data)
- Word Error Rate = 16.2% Decoding Time = 7580 s (6.16 xRT)
test_02 (car noise)
- Word Error Rate = 49.6% Decoding Time = 22195 s (18.03 xRT)
test_03 (babble noise)
- Word Error Rate = 62.2% Decoding Time = 33203 s (26.9 xRT)
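The xRT figures are the real-time factor, that is, the decoding time divided by the duration of the decoded speech (1231 s for each test set). The Python sketch below recomputes them from the times listed above; the dictionary layout and the function name `real_time_factor` are choices made for this example only.

```python
SPEECH_SECONDS = 1231  # duration of each 166-utterance test set

# Decoding times in seconds, as listed above.
DECODING_SECONDS = {
    "HTK": {"test_01": 3428, "test_02": 8002, "test_03": 13747},
    "ISIP": {"test_01": 7580, "test_02": 22195, "test_03": 33203},
}


def real_time_factor(decoding_seconds, speech_seconds=SPEECH_SECONDS):
    """Real-time factor: seconds of computation per second of speech."""
    return decoding_seconds / speech_seconds


if __name__ == "__main__":
    for setup, times in DECODING_SECONDS.items():
        for test, seconds in times.items():
            print(f"{setup:4s} {test}: {real_time_factor(seconds):.2f} xRT")
```

The recomputed values agree with the listed ones; for the ISIP test_03 case the ratio is slightly above the listed 26.9, which suggests that figure was truncated rather than rounded.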
This preliminary version was evaluated by TUC and, based on these
preliminary results, UGR and TUC decided to jointly develop a
reference framework based on HTK to be used inside HIWIRE and
distributed to all partners.
_More information & downloads_
The distribution consists of a set of scripts and all the files needed to
build and evaluate a baseline system for AURORA4.
The current distribution also contains support for decoding using TUC-generated lattices (see
AuroraFour).
More information can be found in the first version of the manual:
The first version of the distribution can be downloaded from the
DataSharing section of this web site (restricted to HIWIRE partners) by following the
_WSJ_HTK_scripts_v4_ link (please see the README file for a description of the included files).
--
JoseCSegura - 20 Dec 2004