为什么要为结构化数据使用递归神经网络？

1条回答

网友

1楼 · 发布于 2024-06-26 05:58:15

在实践中，即使在NLP中，您也会看到rnn和cnn经常是竞争的。Here's一篇2017年的评论文章，更详细地展示了这一点。从理论上讲，RNN可以更好地处理语言的全部复杂性和顺序性，但在实践中，更大的障碍通常是对网络进行适当的训练，而RNN是挑剔的。在

另一个有可能奏效的问题是看一个像平衡括号问题那样的问题（要么在字符串中只使用括号，要么在括号中加上其他干扰字符）。这需要按顺序处理输入并跟踪某些状态，使用LSTM比FFN更容易学习。在

更新：有些看起来顺序的数据实际上可能不必按顺序处理。例如，即使您提供了一系列要加的数字，因为加法是可交换的，FFN也可以和RNN一样好。在许多健康问题中，支配性信息不是顺序性的，也是如此。假设每年都要测量病人的吸烟习惯。从行为学的角度来看，这个轨迹是很重要的，但是如果你要预测病人是否会患肺癌，那么这个预测将取决于病人吸烟的年数（对于FFN来说，可能只限于过去10年）。在

所以你想让玩具问题更复杂，并且需要考虑数据的顺序。也许是某种模拟的时间序列，你想预测数据中是否有峰值，但你不关心绝对值，只关心峰值的相对性质。在

更新2

我修改了你的代码来展示RNN性能更好的例子。诀窍是使用更复杂的条件逻辑，在LSTMs中比ffn更自然地建模。代码如下。对于8列，我们看到FFN在1分钟内训练，验证损失达到6.3。LSTM的训练时间延长了3倍，但最终验证损失在1.06时降低了6倍。在

随着列数的增加，LSTM具有越来越大的优势，特别是如果我们在中添加了更复杂的条件。对于16列，FFNs验证损失为19（您可以更清楚地看到训练曲线，因为模型无法立即拟合数据）。相比之下，LSTM的训练时间要长11倍，但验证损失为0.31，比FFN小30倍！你可以用更大的矩阵来看看这个趋势会延伸到什么程度。在

from keras import models
from keras import layers

from keras.layers import Dense, LSTM

import numpy as np
import matplotlib.pyplot as plt
import matplotlib
import time

matplotlib.use('Agg')

np.random.seed(20180908)

rows = 20500
cols = 10

# Randomly generate Z
Z = 100*np.random.uniform(0.05, 1.0, size = (rows, cols))

larger = np.max(Z[:, :cols/2], axis=1).reshape((rows, 1))
larger2 = np.max(Z[:, cols/2:], axis=1).reshape((rows, 1))
smaller = np.min((larger, larger2), axis=0)
# Z is now the max of the first half of the array.
Z = np.append(Z, larger, axis=1)
# Z is now the min of the max of each half of the array.
# Z = np.append(Z, smaller, axis=1)

# Combine and shuffle.

#Z = np.concatenate((Z_sum, Z_avg), axis = 0)

np.random.shuffle(Z)

## Training and validation data.

split = 10000

X_train = Z[:split, :-1]
X_valid = Z[split:, :-1]
Y_train = Z[:split, -1:].reshape(split, 1)
Y_valid = Z[split:, -1:].reshape(rows - split, 1)

print(X_train.shape)
print(Y_train.shape)
print(X_valid.shape)
print(Y_valid.shape)

print("Now setting up the FNN")

## FNN model.

tick = time.time()

# Define model.

network_fnn = models.Sequential()
network_fnn.add(layers.Dense(32, activation = 'relu', input_shape = (X_train.shape[1],)))
network_fnn.add(Dense(1, activation = None))

# Compile model.

network_fnn.compile(optimizer = 'adam', loss = 'mean_squared_error')

# Fit model.

history_fnn = network_fnn.fit(X_train, Y_train, epochs = 500, batch_size = 128, verbose = False,
    validation_data = (X_valid, Y_valid))

tock = time.time()

print()
print(str('%.2f' % ((tock - tick) / 60)) + ' minutes.')

print("Now evaluating the FNN")

loss_fnn = history_fnn.history['loss']
val_loss_fnn = history_fnn.history['val_loss']
epochs_fnn = range(1, len(loss_fnn) + 1)
print("train loss: ", loss_fnn[-1])
print("validation loss: ", val_loss_fnn[-1])

plt.plot(epochs_fnn, loss_fnn, 'black', label = 'Training Loss')
plt.plot(epochs_fnn, val_loss_fnn, 'red', label = 'Validation Loss')
plt.title('FNN: Training and Validation Loss')
plt.legend()
plt.show()

plt.scatter(Y_train, network_fnn.predict(X_train), alpha = 0.1)
plt.xlabel('Actual')
plt.ylabel('Predicted')
plt.title('training points')
plt.show()

plt.scatter(Y_valid, network_fnn.predict(X_valid), alpha = 0.1)
plt.xlabel('Actual')
plt.ylabel('Predicted')
plt.title('valid points')
plt.show()

print("LSTM")

## LSTM model.

X_lstm_train = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)
X_lstm_valid = X_valid.reshape(X_valid.shape[0], X_valid.shape[1], 1)

tick = time.time()

# Define model.

network_lstm = models.Sequential()
network_lstm.add(layers.LSTM(32, activation = 'relu', input_shape = (X_lstm_train.shape[1], 1)))
network_lstm.add(layers.Dense(1, activation = None))

# Compile model.

network_lstm.compile(optimizer = 'adam', loss = 'mean_squared_error')

# Fit model.

history_lstm = network_lstm.fit(X_lstm_train, Y_train, epochs = 500, batch_size = 128, verbose = False,
    validation_data = (X_lstm_valid, Y_valid))

tock = time.time()

print()
print(str('%.2f' % ((tock - tick) / 60)) + ' minutes.')

print("now eval")

loss_lstm = history_lstm.history['loss']
val_loss_lstm = history_lstm.history['val_loss']
epochs_lstm = range(1, len(loss_lstm) + 1)
print("train loss: ", loss_lstm[-1])
print("validation loss: ", val_loss_lstm[-1])

plt.plot(epochs_lstm, loss_lstm, 'black', label = 'Training Loss')
plt.plot(epochs_lstm, val_loss_lstm, 'red', label = 'Validation Loss')
plt.title('LSTM: Training and Validation Loss')
plt.legend()
plt.show()

plt.scatter(Y_train, network_lstm.predict(X_lstm_train), alpha = 0.1)
plt.xlabel('Actual')
plt.ylabel('Predicted')
plt.title('training')
plt.show()

plt.scatter(Y_valid, network_lstm.predict(X_lstm_valid), alpha = 0.1)
plt.xlabel('Actual')
plt.ylabel('Predicted')
plt.title("validation")
plt.show()

相关问题更多 >

编程相关推荐

热门问题

热门文章