Java: preprocessing between LSTM and dense layers

I am trying to build a neural network with LSTM and dense layers.

My network is:

    MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
            .seed(123)
            .weightInit(WeightInit.XAVIER)
            .updater(new Adam(0.1))
            .list()
            .layer(0, new LSTM.Builder().activation(Activation.TANH).nIn(numInputs).nOut(120).build())
            .layer(1, new DenseLayer.Builder().activation(Activation.RELU).nIn(120).nOut(1000).build())
            .layer(2, new DenseLayer.Builder().activation(Activation.RELU).nIn(1000).nOut(20).build())
            .layer(new OutputLayer.Builder(LossFunction.NEGATIVELOGLIKELIHOOD).activation(Activation.SOFTMAX).nIn(20).nOut(numOutputs).build())
            .inputPreProcessor(1, new RnnToFeedForwardPreProcessor())
            .build();

This is how I read the data:

    SequenceRecordReader reader = new CSVSequenceRecordReader(0, ",");
    reader.initialize(new NumberedFileInputSplit("TRAIN_%d.csv", 1, 17476));
    DataSetIterator trainIter = new SequenceRecordReaderDataSetIterator(reader, miniBatchSize, 6, 7, false);
    allData = trainIter.next();

    // Load the test/evaluation data:
    SequenceRecordReader testReader = new CSVSequenceRecordReader(0, ",");
    testReader.initialize(new NumberedFileInputSplit("TEST_%d.csv", 1, 8498));
    DataSetIterator testIter = new SequenceRecordReaderDataSetIterator(testReader, miniBatchSize, 6, 7, false);
    allData = testIter.next();

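So when the data enters the network, it has shape [batch, features, time steps] = [32, 7, 60]. I can confirm this by deliberately provoking an error like this one: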

Received input with size(1) = 7 (input array shape = [32, 7, 60]); input.size(1) must match layer nIn size (nIn = 9)

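So the input enters the network as expected. After the first LSTM layer, it has to be reshaped to 2D before it goes through the dense layers.

But I have another problem: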

Labels and preOutput must have equal shapes: got shapes [32, 6, 60] vs [1920, 6]

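It looks like the data is not reshaped before entering the dense layer, and I seem to have lost a feature (the shape is now [32, 6, 60] instead of [32, 7, 60]). Why?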
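
A note on the shapes in that error, as a minimal sketch rather than DL4J's actual code: `RnnToFeedForwardPreProcessor` folds the time dimension into the batch dimension, so `[32, 6, 60]` corresponds to `[32 * 60, 6] = [1920, 6]`. The `6` most likely comes from the third argument of `SequenceRecordReaderDataSetIterator` above (the number of possible labels), which would make `[32, 6, 60]` the shape of the one-hot labels rather than of the features. The names below (`RnnShapeSketch`, `flattenTimeIntoBatch`, `restoreTime`) are hypothetical and only mimic the shape arithmetic.

```java
import java.util.Arrays;

class RnnShapeSketch {

    /** [batch, size, time] -> [batch * time, size], the 2D shape a dense layer works on. */
    static long[] flattenTimeIntoBatch(long[] rnnShape) {
        if (rnnShape.length != 3) {
            throw new IllegalArgumentException("Expected a rank-3 shape [batch, size, time]");
        }
        return new long[] {rnnShape[0] * rnnShape[2], rnnShape[1]};
    }

    /** [batch * time, size] -> [batch, size, time], the inverse step before an RNN-style layer. */
    static long[] restoreTime(long[] ffShape, long timeSteps) {
        if (ffShape.length != 2 || timeSteps <= 0 || ffShape[0] % timeSteps != 0) {
            throw new IllegalArgumentException("Expected [batch * time, size] with rows divisible by timeSteps");
        }
        return new long[] {ffShape[0] / timeSteps, ffShape[1], timeSteps};
    }

    public static void main(String[] args) {
        long[] labels = {32, 6, 60};                       // 3D labels from the error message
        long[] flattened = flattenTimeIntoBatch(labels);   // shape of the 2D network output
        System.out.println(Arrays.toString(flattened));    // [1920, 6]
        System.out.println(Arrays.toString(restoreTime(flattened, 60))); // [32, 6, 60]
    }
}
```

Read this way, the network output was flattened to 2D, while the labels were still 3D. That kind of mismatch is what an RNN-aware output layer is meant to handle.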


1 answer

  1. # Answer 1

    If possible, use setInputType; it will set up the preprocessors for you.

    Here is an example configuration that goes from LSTM to dense:

        // wsm (a WorkspaceMode) and rnnDataFormat (an RNNFormat) are variables
        // defined in the surrounding test code this example was taken from.
        MultiLayerConfiguration conf1 = new NeuralNetConfiguration.Builder()
                .trainingWorkspaceMode(wsm)
                .inferenceWorkspaceMode(wsm)
                .seed(12345)
                .updater(new Adam(0.1))
                .list()
                .layer(new LSTM.Builder().nIn(3).nOut(3).dataFormat(rnnDataFormat).build())
                .layer(new DenseLayer.Builder().nIn(3).nOut(3).activation(Activation.TANH).build())
                .layer(new RnnOutputLayer.Builder().nIn(3).nOut(3).activation(Activation.SOFTMAX).dataFormat(rnnDataFormat)
                        .lossFunction(LossFunctions.LossFunction.MCXENT).build())
                // Declaring the input type lets the builder add the preprocessors between layers
                .setInputType(InputType.recurrent(3, rnnDataFormat))
                .build();
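
    Applied to the network from the question, the same idea might look like the sketch below. It assumes the caller passes `numInputs = 7` and `numOutputs = 6` (values read off the shapes in the question) and swaps the final `OutputLayer` for an `RnnOutputLayer` so that per-time-step labels can be used. The class name `LstmDenseConfigSketch` is hypothetical, the code needs Deeplearning4j on the classpath, and it has not been run against the asker's data.

    ```java
    import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
    import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
    import org.deeplearning4j.nn.conf.inputs.InputType;
    import org.deeplearning4j.nn.conf.layers.DenseLayer;
    import org.deeplearning4j.nn.conf.layers.LSTM;
    import org.deeplearning4j.nn.conf.layers.RnnOutputLayer;
    import org.deeplearning4j.nn.weights.WeightInit;
    import org.nd4j.linalg.activations.Activation;
    import org.nd4j.linalg.learning.config.Adam;
    import org.nd4j.linalg.lossfunctions.LossFunctions;

    class LstmDenseConfigSketch {

        /** Builds an LSTM -> dense -> dense -> RNN output configuration without manual preprocessors. */
        static MultiLayerConfiguration build(int numInputs, int numOutputs) {
            return new NeuralNetConfiguration.Builder()
                    .seed(123)
                    .weightInit(WeightInit.XAVIER)
                    .updater(new Adam(0.1))
                    .list()
                    // nIn is left out on purpose: setInputType below infers it for every layer
                    .layer(new LSTM.Builder().activation(Activation.TANH).nOut(120).build())
                    .layer(new DenseLayer.Builder().activation(Activation.RELU).nOut(1000).build())
                    .layer(new DenseLayer.Builder().activation(Activation.RELU).nOut(20).build())
                    // RnnOutputLayer expects 3D labels of shape [batch, numOutputs, timeSteps]
                    .layer(new RnnOutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                            .activation(Activation.SOFTMAX).nOut(numOutputs).build())
                    // No inputPreProcessor(...) calls: the input type drives the reshaping between layers
                    .setInputType(InputType.recurrent(numInputs))
                    .build();
        }
    }
    ```

    The design choice here is to describe the input once and let the builder work out the reshaping, instead of placing `RnnToFeedForwardPreProcessor` by hand at a fixed layer index.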
    

    RNNFormat is:

        import org.deeplearning4j.nn.conf.RNNFormat;

    It is an enum that specifies the data format (channels last or channels first). From the javadoc:

    /**
     * NCW = "channels first" - arrays of shape [minibatch, channels, width]<br>
     * NWC = "channels last" - arrays of shape [minibatch, width, channels]<br>
     * "width" corresponds to sequence length and "channels" corresponds to sequence item size.
     */
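
    To make the two layouts concrete, here is a small sketch in plain Java. The `Format` enum is a hypothetical stand-in for `RNNFormat` so that the example needs no DL4J dependency, and `shapeFor` is an illustrative helper, not a library method. With the numbers from the question (batch 32, 7 features, 60 time steps), the input shape `[32, 7, 60]` matches the NCW layout described in the javadoc.

    ```java
    import java.util.Arrays;

    class RnnFormatSketch {

        // Hypothetical stand-in for org.deeplearning4j.nn.conf.RNNFormat
        enum Format { NCW, NWC }

        /** Returns the array shape for a sequence batch in the given layout. */
        static long[] shapeFor(Format format, long minibatch, long channels, long width) {
            return format == Format.NCW
                    ? new long[] {minibatch, channels, width}   // channels first
                    : new long[] {minibatch, width, channels};  // channels last
        }

        public static void main(String[] args) {
            System.out.println(Arrays.toString(shapeFor(Format.NCW, 32, 7, 60))); // [32, 7, 60]
            System.out.println(Arrays.toString(shapeFor(Format.NWC, 32, 7, 60))); // [32, 60, 7]
        }
    }
    ```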
    

    Source: https://github.com/eclipse/deeplearning4j/blob/1930d9990810db6214829c716c2ae7eb7f59cd13/deeplearning4j/deeplearning4j-nn/src/main/java/org/deeplearning4j/nn/conf/RNNFormat.java#L21

    There is more in our tests: https://github.com/eclipse/deeplearning4j/blob/1930d9990810db6214829c716c2ae7eb7f59cd13/deeplearning4j/deeplearning4j-core/src/test/java/org/deeplearning4j/nn/layers/recurrent/TestTimeDistributed.java#L58