###BREAKING CHANGE### affecting truncated BPTT

Oct 9, 2015 at 7:28 PM
Hi, the latest version changes how truncated BPTT is configured. The Truncated parameter must now be defined at an outer level where both the reader and SGD can see it. Otherwise, SGD will scale the per-minibatch learning rate incorrectly. See the example config below.

This change was needed because the mbsize parameter was incorrectly overloaded: in truncated mode it really specifies the truncation size, whereas the true minibatch size (= number of samples between model updates) is equal to (truncation size * number of parallel sequences). SGD must therefore also know the Truncated flag in order to determine the true MB size.

Thanks,

Frank

precision=float
command=speechTrain
deviceId=$DeviceId$

parallelTrain=false

frameMode=false
#############
Truncated=true   #### <== put it out here, not inside the reader
#############

speechTrain=[
  action=train
  modelPath=$RunDir$/models/cntkSpeech.dnn
  deviceId=$DeviceId$
  traceLevel=1

  NDLNetworkBuilder=[
    networkDescription=$NDLDir$/lstmp-3layer_WithSelfStab.ndl
  ]

  SGD=[
    epochSize=20480
    minibatchSize=20    # with Truncated=true this is the truncation size, not the true MB size
    learningRatesPerMB=0.5
    numMBsToShowResult=10
    momentumPerMB=0:0.9
    maxEpochs=4
    keepCheckPointFiles=true
  ]
  reader=[
    readerType=HTKMLFReader
    readMethod=blockRandomize
    miniBatchMode=Partial
    nbruttsineachrecurrentiter=32    # number of parallel sequences
    randomize=Auto
    verbosity=0
    features=[
        dim=363
        type=Real
        scpFile=$DataDir$/glob_0000.scp
    ]

    labels=[
        mlfFile=$DataDir$/glob_0000.mlf
        labelMappingFile=$DataDir$/state.list

        labelDim=132
        labelType=Category
    ]
  ]
]
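
To make the arithmetic concrete, here is a small Python sketch of the MB-size relation described above, using the values from this config (minibatchSize=20, nbruttsineachrecurrentiter=32). It is not CNTK code: the function names are made up for illustration, and the per-sample learning rate conversion (dividing the per-MB rate by the MB size) is an assumption used only to show how far off the scaling gets if SGD does not see the Truncated flag.

```python
def true_minibatch_size(mbsize, num_parallel_sequences, truncated):
    """Samples between model updates.

    In truncated mode, mbsize is really the truncation size, so the
    true minibatch size is truncation size * number of parallel sequences.
    """
    if truncated:
        return mbsize * num_parallel_sequences
    return mbsize


def per_sample_learning_rate(lr_per_mb, mb_size):
    # ASSUMPTION (illustration only): a per-MB rate is turned into a
    # per-sample rate by dividing by the minibatch size.
    return lr_per_mb / mb_size


# Values taken from the config above.
mbsize = 20               # minibatchSize (truncation size when Truncated=true)
parallel_sequences = 32   # nbruttsineachrecurrentiter
lr_per_mb = 0.5           # learningRatesPerMB

seen = true_minibatch_size(mbsize, parallel_sequences, truncated=True)
not_seen = true_minibatch_size(mbsize, parallel_sequences, truncated=False)

print("MB size when SGD sees Truncated=true:", seen)
print("MB size when SGD does not see it:", not_seen)
print("mismatch factor:", seen // not_seen)
print("per-sample rates:",
      per_sample_learning_rate(lr_per_mb, seen),
      per_sample_learning_rate(lr_per_mb, not_seen))
```

With these values the true MB size is 20 * 32 = 640 samples, so an SGD that treats 20 as the MB size is off by a factor of 32 (the number of parallel sequences).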