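Java: preprocessing between LSTM and dense layers
I am trying to build a neural network with an LSTM layer followed by dense layers.
My network is: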
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
.seed(123)
.weightInit(WeightInit.XAVIER)
.updater(new Adam(0.1))
.list()
.layer(0, new LSTM.Builder().activation(Activation.TANH).nIn(numInputs).nOut(120).build())
.layer(1, new DenseLayer.Builder().activation(Activation.RELU).nIn(120).nOut(1000).build())
.layer(2, new DenseLayer.Builder().activation(Activation.RELU).nIn(1000).nOut(20).build())
.layer(new OutputLayer.Builder(LossFunction.NEGATIVELOGLIKELIHOOD).activation(Activation.SOFTMAX).nIn(20).nOut(numOutputs).build())
.inputPreProcessor(1, new RnnToFeedForwardPreProcessor())
.build();
This is how I read the data:
SequenceRecordReader reader = new CSVSequenceRecordReader(0, ",");
reader.initialize(new NumberedFileInputSplit("TRAIN_%d.csv", 1, 17476));
DataSetIterator trainIter = new SequenceRecordReaderDataSetIterator(reader, miniBatchSize, 6, 7, false);
allData = trainIter.next();
//Load the test/evaluation data:
SequenceRecordReader testReader = new CSVSequenceRecordReader(0, ",");
testReader.initialize(new NumberedFileInputSplit("TEST_%d.csv", 1, 8498));
DataSetIterator testIter = new SequenceRecordReaderDataSetIterator(testReader, miniBatchSize, 6, 7, false);
allData = testIter.next();
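So when the data enters the network it has shape [batch, features, timesteps] = [32, 7, 60]. I can confirm this by deliberately provoking an error like this one: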
Received input with size(1) = 7 (input array shape = [32, 7, 60]); input.size(1) must match layer nIn size (nIn = 9)
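So the data enters the network as expected. After the first LSTM layer it has to be reshaped to 2D before it goes into the dense layers.
But then I run into another problem: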
Labels and preOutput must have equal shapes: got shapes [32, 6, 60] vs [1920, 6]
It is not reshaped before going into the dense layer, and I seem to have lost one feature (the shape is now [32, 6, 60] instead of [32, 7, 60]). Why?
# Answer 1
If possible, you should use setInputType, which sets up the preprocessors for you.
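To see where the two shapes in the error message come from, it can help to trace them by hand. The following is a minimal sketch in plain Java with no DL4J calls; the class and method names (ShapeTrace, rnnToFeedForward, feedForwardToRnn) are hypothetical and only mimic the shape arithmetic of an RNN-to-feed-forward reshape, assuming the [batch, size, timesteps] layout from the question. Under that assumption, the 6 in [32, 6, 60] is the number of label classes passed to the iterator, not a lost feature, and 1920 is 32 * 60.

```java
import java.util.Arrays;

public class ShapeTrace {
    // Shape effect of flattening time into the batch dimension:
    // [miniBatch, size, timeSteps] -> [miniBatch * timeSteps, size]
    public static long[] rnnToFeedForward(long[] shape) {
        if (shape.length != 3) {
            throw new IllegalArgumentException("Expected a 3D shape, got " + Arrays.toString(shape));
        }
        return new long[]{shape[0] * shape[2], shape[1]};
    }

    // Inverse shape effect:
    // [miniBatch * timeSteps, size] -> [miniBatch, size, timeSteps]
    public static long[] feedForwardToRnn(long[] shape, long miniBatch) {
        if (shape.length != 2 || shape[0] % miniBatch != 0) {
            throw new IllegalArgumentException("Cannot restore time axis for " + Arrays.toString(shape));
        }
        return new long[]{miniBatch, shape[1], shape[0] / miniBatch};
    }

    public static void main(String[] args) {
        long[] labels = {32, 6, 60};          // one-hot labels: 6 classes per time step
        long[] afterLstm = {32, 120, 60};     // LSTM output with nOut = 120

        long[] denseInput = rnnToFeedForward(afterLstm);   // [1920, 120]
        long[] preOutput = {denseInput[0], labels[1]};     // [1920, 6] after the last layer

        // A 2D output cannot be compared with 3D labels, hence the error.
        System.out.println(Arrays.toString(preOutput) + " vs labels " + Arrays.toString(labels));

        // Restoring the time axis makes the shapes agree again.
        System.out.println(Arrays.toString(feedForwardToRnn(preOutput, 32)));
    }
}
```

In this trace the output stays 2D because nothing restores the time axis after the dense layers, which is the situation the error message describes.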
Here is an example configuration going from LSTM to dense:
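A minimal sketch of such a configuration, adapted to the layer sizes in the question and loosely modelled on the linked TestTimeDistributed test. It assumes a recent DL4J 1.0.0 beta release, and the wrapper class LstmDenseConfigSketch is hypothetical. Two things differ from the question's network: setInputType replaces the manual inputPreProcessor call (so nIn can be left out and is inferred), and the last layer is an RnnOutputLayer, so that the output is shaped back to 3D and can match labels of shape [32, 6, 60].

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.inputs.InputType;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.LSTM;
import org.deeplearning4j.nn.conf.layers.RnnOutputLayer;
import org.deeplearning4j.nn.weights.WeightInit;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.learning.config.Adam;
import org.nd4j.linalg.lossfunctions.LossFunctions.LossFunction;

public class LstmDenseConfigSketch {
    // numInputs = features per time step (7 in the question),
    // numOutputs = number of label classes (6 in the question).
    public static MultiLayerConfiguration build(int numInputs, int numOutputs) {
        return new NeuralNetConfiguration.Builder()
                .seed(123)
                .weightInit(WeightInit.XAVIER)
                .updater(new Adam(0.1))
                .list()
                // nIn is omitted everywhere: setInputType infers it layer by layer.
                .layer(new LSTM.Builder().activation(Activation.TANH).nOut(120).build())
                .layer(new DenseLayer.Builder().activation(Activation.RELU).nOut(1000).build())
                .layer(new DenseLayer.Builder().activation(Activation.RELU).nOut(20).build())
                // RnnOutputLayer expects 3D labels of shape [batch, classes, timesteps].
                .layer(new RnnOutputLayer.Builder(LossFunction.MCXENT)
                        .activation(Activation.SOFTMAX).nOut(numOutputs).build())
                // Declares recurrent input; the RNN <-> feed-forward preprocessors are added automatically.
                .setInputType(InputType.recurrent(numInputs))
                .build();
    }
}
```

With the question's data this would be called as LstmDenseConfigSketch.build(7, 6).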
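About the RNN format:
RNNFormat is an enum that specifies the data layout of recurrent input (channels last or channels first). According to the javadoc, NCW ("channels first") means arrays of shape [minibatch, channels, width] and NWC ("channels last") means arrays of shape [minibatch, width, channels], where "width" is the sequence length and "channels" is the size of each sequence item.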
Source: https://github.com/eclipse/deeplearning4j/blob/1930d9990810db6214829c716c2ae7eb7f59cd13/deeplearning4j/deeplearning4j-nn/src/main/java/org/deeplearning4j/nn/conf/RNNFormat.java#L21
There is more in our tests: https://github.com/eclipse/deeplearning4j/blob/1930d9990810db6214829c716c2ae7eb7f59cd13/deeplearning4j/deeplearning4j-core/src/test/java/org/deeplearning4j/nn/layers/recurrent/TestTimeDistributed.java#L58