Inconsistency in node validation causing error during decoding

Aug 5, 2015 at 5:14 AM
Hi Guys,

In my experiment, training the network, including the validation step, works fine. However, during decoding, the same network fails when validating the "Times" operation. The logs for the relevant operation are below, for both training and decoding. The problem is that the value 235 is assigned as the mini-batch size during decoding, seemingly at random, whereas during training that value is 1. The minibatch size is 256 in my setup.

During training:

Validating --> CE = CrossEntropyWithSoftmax(regions[774, 1], BFF2.FF.P[774, 1])
Validating --> ObjFcn.T = Times(features3[1, 1], CE[1, 1])


During decoding:

Validating --> CE = CrossEntropyWithSoftmax(regions[774, 235], BFF2.FF.P[774, 235])
Validating --> ObjFcn.T = Times(features3[1, 235], CE[1, 1]) EXCEPTION occurred: The Matrix dimension in the Times operation does not match.
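
The mismatch in the two logs above can be reproduced with a simple shape check. This is only a minimal Python sketch, assuming Times behaves as an ordinary matrix product ([m, k] times [k, n]); `times_output_shape` is a hypothetical helper for illustration, not a CNTK function:

```python
def times_output_shape(left, right):
    """Return the shape of left x right, or raise if the inner dimensions differ.

    Shapes are (rows, cols) tuples. Hypothetical helper for illustration only;
    it is not part of CNTK.
    """
    left_rows, left_cols = left
    right_rows, right_cols = right
    if left_cols != right_rows:
        raise ValueError(
            f"Times: inner dimensions do not match ({left_cols} vs {right_rows})"
        )
    return (left_rows, right_cols)


# Training log: features3[1, 1] x CE[1, 1] passes the check.
print(times_output_shape((1, 1), (1, 1)))

# Decoding log: features3[1, 235] x CE[1, 1] cannot be multiplied.
try:
    times_output_shape((1, 235), (1, 1))
except ValueError as err:
    print(err)
```

In both logs CE is [1, 1], so under this assumption the product only validates when features3 has a single column.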

Is there a way to keep the mini-batch size equal to 1 during the validation step in decoding? Where is this 235 coming from?

Found a similar issue in the discussions
Thanks,
Lahiru
Coordinator
Aug 5, 2015 at 5:20 AM
Yes, in the decoder setup make sure you set minibatch to 1.
Aug 5, 2015 at 5:40 AM
Setting minibatchSize=1 in the config file seems to have no influence during validation. It keeps assigning 235. I am not sure where CNTK gets this 235 from. Is there a way to set it to 1 internally?
Coordinator
Aug 5, 2015 at 6:00 AM
It should work by setting minibatchSize=1 in the evaluation block of your config.
Aug 5, 2015 at 6:29 AM
As you can see below, I put minibatchSize=1 at all levels of the config, with no luck. I think CNTK doesn't use this value during validation and instead assigns what looks like an arbitrary number as the minibatch size: I set minibatchSize=256 for training, yet validation there uses the value 1 and succeeds.

The difference is that during training CNTK assigns 1 as the minibatch size for validation, whereas during decoding that value is 235. I somehow need to get past the validation step during decoding, but minibatchSize=1 doesn't change the 235.
DeviceNumber=$DeviceNumber$
command=$action$
minibatchSize=1
precision=float


write=[
    action=write
    modelPath=$modelName$
    outputNodeNames=ScaledLogLikelihood
    minibatchSize=1
    # deviceId=-1 for CPU, >=0 for GPU devices
    deviceId=$DeviceNumber$
    traceLevel=1
    useValidation=true
    printValues=true

    reader=[
      # reader to use
      readerType=Kaldi2Reader
      readMethod=blockRandomize
      frameMode=false
      miniBatchMode=Partial
      randomize=Auto
      verbosity=1
      minibatchSize=1

      features1=[
        dim=$featDim1$
        scpFile=$inputCounts$
        rx=$inputFeat1$
      ]

      features2=[
        dim=$featDim2$
        scpFile=$inputCounts$
        rx=$inputFeat2$
      ]

      features3=[
        dim=$featDim3$
        scpFile=$inputCounts$
        rx=$inputFeat3$
      ]

    ]
    writer=[
      # writer to use
      writerType=Kaldi2Reader
      readMethod=blockRandomize
      frameMode=false
      miniBatchMode=Partial
      randomize=Auto
      verbosity=1
      minibatchSize=1
      ScaledLogLikelihood=[
        dim=$labelDim$
        Kaldicmd="ark:-"
        scpFile=$inputCounts$
      ]
    ]

]
Coordinator
Aug 5, 2015 at 5:22 PM
It looks like you just want to write out the ScaledLogLikelihood node? If that's the case, the ObjFcn.T node is not needed. A simple workaround is to use the model editing language (MEL) to remove this node, save the edited model to a new file, and then use that new model to output ScaledLogLikelihood.

I will add a new node. With the new node you can revise your NDL so that it processes minibatches directly instead of forcing the minibatch size to be 1. This may take some time though.
Coordinator
Aug 6, 2015 at 12:48 AM
Using the new master branch, you can change

CE = CrossEntropyWithSoftmax(regions, BFF2.FF.P)
ObjFcn.T = Times(features3, CE)

to

CE = CrossEntropyWithSoftmax(regions, BFF2.FF.P)
ObjFcn.T = CrossEntropyWithSoftmax(RowElementTimes(regions, features3), BFF2.FF.P)