Questions about Kaldi writer

Jun 16, 2015 at 2:05 AM
Edited Jun 16, 2015 at 2:06 AM
Hello.

I have some questions about the Kaldi writer.

In the TrainSimpleTimit.config file, I use HTKMLFReader as the reader and KaldiReader as the writer.

(I am working in the Demos/Speech directory, and I built with Makefile_kaldi.gpu as the Makefile.)

In TrainSimpleTimit.config, I modified TimitWriteSimple as follows:
writer=[
        writerType=KaldiReader
        readMethod=blockRandomize
        frameMode=false
        miniBatchMode=Partial
        randomize=Auto
        verbosity=1
        ScaledLogLikelihood = [
          dim = 183
          Kaldicmd="ark:TIMIT.cntk.ark"
          scpFile=../Demos/Speech/CntkTimitOutput.scp
        ]
 ]
(I took this configuration from the README file in the root directory.)

Then I ran the following Kaldi command in the timit directory of Kaldi:
latgen-faster-mapped --max-active=7000 --beam=13.0 --lattice-beam=8.0 --acoustic-scale=0.10 --allow-partial=true --word-symbol-table=exp/tri3/graph/words.txt exp/tri4_nnet/final.mdl exp/tri3/graph/HCLG.fst ark:TIMIT.cntk.ark ark,t:-
and Kaldi reported the following error:
ERROR (latgen-faster-mapped:DecodableMatrixScaledMapped():decoder/decodable-matrix.h:42) DecodableMatrixScaledMapped: mismatch, matrix has 183 rows but transition-model has 1949 pdf-ids.
ERROR (latgen-faster-mapped:DecodableMatrixScaledMapped():decoder/decodable-matrix.h:42) DecodableMatrixScaledMapped: mismatch, matrix has 183 rows but transition-model has 1949 pdf-ids.

[stack trace: ]
kaldi::KaldiGetStackTrace()
kaldi::KaldiErrorMessage::~KaldiErrorMessage()
kaldi::DecodableMatrixScaledMapped::DecodableMatrixScaledMapped(kaldi::TransitionModel const&, kaldi::Matrix<float> const&, float)
latgen-faster-mapped(main+0x7ac) [0x677f12]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f9dcb033ec5]
latgen-faster-mapped() [0x677692]
In this case, do I have to train my DNN model in Kaldi so that it has 183 pdf-ids,

or train my DNN model in CNTK with an output dimension of 1949?
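
To confirm the mismatch before changing either model, the width of the matrices that CNTK wrote can be compared with the number of pdf-ids. The following is only a minimal sketch: it assumes the archive was first converted to Kaldi's text format (for example with copy-feats ark:TIMIT.cntk.ark ark,t:-), and the function names matrix_widths and mismatched_utterances are hypothetical helpers, not part of Kaldi or CNTK. The value 1949 is the pdf-id count from the error message above.

```python
import io


def matrix_widths(text_ark):
    """Return {utterance_id: column_count} for a Kaldi text-format matrix archive.

    Assumed layout per utterance:
        <utt_id>  [
          v v v ...
          v v v ... ]
    """
    widths = {}
    utt = None
    for line in text_ark:
        tokens = line.split()
        if not tokens:
            continue
        if utt is None:
            # Header line of a matrix: "<utt_id>  ["
            if tokens[-1] == "[":
                utt = tokens[0]
            continue
        closes = tokens[-1] == "]"
        if closes:
            tokens = tokens[:-1]
        # Record the width of the first data row only.
        if tokens and utt not in widths:
            widths[utt] = len(tokens)
        if closes:
            utt = None
    return widths


def mismatched_utterances(widths, num_pdfs):
    """List utterances whose matrix width differs from the pdf-id count."""
    return [utt for utt, width in widths.items() if width != num_pdfs]


if __name__ == "__main__":
    # Tiny made-up archive with 3 columns, standing in for the real output.
    sample = io.StringIO("utt1  [\n  0.1 0.2 0.3\n  0.4 0.5 0.6 ]\n")
    widths = matrix_widths(sample)
    print(widths)
    print(mismatched_utterances(widths, 1949))
```

With the real archive, every utterance would be expected to show a width of 183, which is the number the decoder compares against the 1949 pdf-ids of final.mdl.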



If I train my DNN model in CNTK with an output dimension of 1949, I think I have to change the "labels" section in TrainSimpleTimit.config to the following:
labels=[
        mlfFile=../Demos/Speech/TimitLabels.mlf
        labelDim=1949
        labelMappingFile=../Demos/Speech/TimitStateList.txt
 ]
(a) The TimitStateList.txt file contains 183 states.

In this case, how can I change the number of states in the TimitStateList.txt file?

Do I have to define new states in the TimitStateList.txt file?
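
Whatever the new state list ends up being, it has to stay consistent with labelDim and with the labels used in the MLF file. The following is a rough sketch of such a consistency check, under several assumptions: the state list has one state name per line, the MLF file follows the usual HTK layout ("#!MLF!#" header, quoted file names, "start end label" lines, "." terminators), and labelDim is expected to equal the number of listed states. The helper names and the state names in the demo are hypothetical.

```python
def read_state_list(lines):
    """Return the state names, assuming one name per non-empty line."""
    return [ln.strip() for ln in lines if ln.strip()]


def mlf_labels(lines):
    """Collect the set of label names used in an HTK-style MLF file."""
    labels = set()
    for ln in lines:
        ln = ln.strip()
        # Skip the header, per-file name lines, and end-of-file markers.
        if not ln or ln == "#!MLF!#" or ln == "." or ln.startswith('"'):
            continue
        fields = ln.split()
        if len(fields) >= 3 and fields[0].isdigit() and fields[1].isdigit():
            labels.add(fields[2])  # "start end label ..."
        else:
            labels.add(fields[0])  # label without time stamps
    return labels


def check_labels(states, labels, label_dim):
    """Return a list of human-readable problems (empty if consistent)."""
    problems = []
    if len(states) != label_dim:
        problems.append(
            "state list has %d entries but labelDim is %d" % (len(states), label_dim)
        )
    unknown = sorted(labels - set(states))
    if unknown:
        problems.append("labels missing from state list: " + ", ".join(unknown))
    return problems


if __name__ == "__main__":
    # Made-up data; real runs would read TimitStateList.txt and TimitLabels.mlf.
    states = read_state_list(["aa_s2", "aa_s3", "aa_s4"])
    mlf = ['#!MLF!#', '"utt1.lab"', "0 100000 aa_s2", "100000 200000 zz_s9", "."]
    for problem in check_labels(states, mlf_labels(mlf), 1949):
        print(problem)
```

To run it on the real files, the two lists in the demo could be replaced with open("../Demos/Speech/TimitStateList.txt") and open("../Demos/Speech/TimitLabels.mlf").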


(b) I think I also have to modify the TimitLabels.mlf file, because the number of states in the TimitStateList.txt file will change to 1949.

How can I modify the TimitLabels.mlf file?


Best regards,
Donghyun