Should my Monte Carlo dropout model give a mean prediction similar to the deterministic prediction?

Posted 2024-05-13 18:40:59


I have a model trained with several LayerNormalization layers, and I'm not sure whether a simple weight transfer works correctly when dropout is active at prediction time. This is the code I'm using:

from tensorflow.keras.models import load_model, Model
from tensorflow.keras.layers import Dense, Dropout, LayerNormalization, Input

# Load the trained deterministic model and grab its weights
model0 = load_model(path + 'model0.h5')
OW = model0.get_weights()

# Rebuild the same architecture, but keep dropout active at inference
# time with training=True (Monte Carlo dropout)
inp = Input(shape=(10,))
D1 = Dense(760, activation='softplus')(inp)
DO1 = Dropout(0.29)(D1, training=True)
N1 = LayerNormalization()(DO1)
D2 = Dense(460, activation='softsign')(N1)
DO2 = Dropout(0.16)(D2, training=True)
N2 = LayerNormalization()(DO2)
D3 = Dense(664, activation='softsign')(N2)
DO3 = Dropout(0.09)(D3, training=True)
N3 = LayerNormalization()(DO3)
out = Dense(1, activation='linear')(N3)

# Transfer the trained weights into the Monte Carlo variant
mP = Model(inp, out)
mP.set_weights(OW)
mP.compile(loss='mse', optimizer='Adam')
mP.save(path + 'new_model.h5')

If I set training=False on the dropout layers, the model makes the same predictions as the original model. However, with the code written as above, the mean prediction is not close to the original/deterministic prediction.

Dropout models I have built previously produced mean probabilistic predictions nearly identical to the deterministic model's. Is there something I am doing incorrectly, or is this a problem with combining LayerNormalization and active dropout? As far as I know, LayerNormalization has trainable parameters, so I don't know whether active dropout interferes with them, and if it does, I don't know how to remedy it.
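One way to rule out a botched weight transfer is to compare the two models' weight tensors directly; LayerNormalization's gamma and beta are included in get_weights(), so a mismatch would show up here. A minimal sketch, assuming model0 and mP have been built as above:

import numpy as np

# Compare every weight tensor between the original model and the
# Monte Carlo clone, including LayerNormalization's gamma/beta.
for i, (w0, wP) in enumerate(zip(model0.get_weights(), mP.get_weights())):
    if w0.shape != wP.shape or not np.allclose(w0, wP):
        print('Mismatch in weight tensor %d' % i)
        break
else:
    print('All weight tensors (including LayerNormalization) match.')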

This code runs a quick test and plots the results:

import numpy as np
import matplotlib.pyplot as plt

inputs = np.zeros(shape=(1, 10), dtype='float32')
inputsP = np.zeros(shape=(1000, 10), dtype='float32')

# One deterministic prediction vs. 1000 Monte Carlo dropout predictions
opD = model0.predict(inputs)[0, 0]
opP = mP.predict(inputsP).reshape(1000)
print('Deterministic: %.4f   Probabilistic: %.4f' % (opD, np.mean(opP)))

plt.scatter(0, opD, color='black', label='Det', zorder=3)
plt.scatter(0, np.mean(opP), color='red', label='Mean prob', zorder=2)
plt.errorbar(0, np.mean(opP), yerr=np.std(opP), color='red', zorder=2,
             markersize=0, capsize=20, label=r'$\sigma$ bounds')
plt.grid(axis='y', zorder=0)
plt.legend()
plt.tick_params(axis='x', labelsize=0, labelcolor='white', color='white',
                width=0, length=0)
plt.show()

The resulting output and plot are shown below:

Deterministic: -0.9732   Probabilistic: -0.9011

[Plot: uncertainty results — deterministic prediction vs. Monte Carlo mean with σ error bars]


1 Answer

#1 · Posted 2024-05-13 18:40:59

Edit to my answer:

I think the problem is under-sampling of the model. The standard deviation of the predictions is directly tied to the dropout rate, so the number of predictions needed to approximate the deterministic output grows with it. If you push the code below to an extreme and set every dropout layer's rate to 0.7, 100,000 samples are no longer enough to approximate the deterministic mean to within 10^-3, and the standard deviation of the predictions becomes much larger.

import os

import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Dropout, Input

os.environ['CUDA_VISIBLE_DEVICES'] = '0'
GPUs = tf.config.experimental.list_physical_devices('GPU')
for gpu in GPUs:
    tf.config.experimental.set_memory_growth(gpu, True)

# Deterministic reference model (no dropout)
inp = Input(shape=(10,))
D1 = Dense(760, activation='softplus')(inp)
D2 = Dense(460, activation='softsign')(D1)
D3 = Dense(664, activation='softsign')(D2)
out = Dense(1, activation='linear')(D3)

mP = Model(inp, out)
mP.compile(loss='mse', optimizer='Adam')

# Monte Carlo variant: same architecture plus always-active dropout
inp = Input(shape=(10,))
D1 = Dense(760, activation='softplus')(inp)
DO1 = Dropout(0.29)(D1, training=True)
D2 = Dense(460, activation='softsign')(DO1)
DO2 = Dropout(0.16)(D2, training=True)
D3 = Dense(664, activation='softsign')(DO2)
DO3 = Dropout(0.09)(D3, training=True)
out = Dense(1, activation='linear')(DO3)

mP2 = Model(inp, out)
mP2.set_weights(mP.get_weights())
mP2.compile(loss='mse', optimizer='Adam')

# 100,000 identical inputs: the deterministic output is constant, and the
# Monte Carlo mean should converge toward it as the sample count grows
data = np.zeros(shape=(100000, 10), dtype='float32')
res = mP.predict(data).reshape(data.shape[0])
res2 = mP2.predict(data).reshape(data.shape[0])

print(np.abs(res[0] - res2.mean()))
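To put a rough number on the under-sampling argument: if the Monte Carlo predictions are treated as approximately independent draws (my assumption, not stated in the answer), the standard error of their mean falls as sigma/sqrt(N), so reaching a tolerance eps takes about (sigma/eps)^2 samples. A back-of-the-envelope sketch, reusing res2 from the test above:

# Rough sample-size estimate, assuming the Monte Carlo predictions are
# roughly i.i.d.: the standard error of the mean is sigma / sqrt(N),
# so N ~ (sigma / eps)**2 samples are needed to reach tolerance eps.
sigma = res2.std()
eps = 1e-3
n_needed = int(np.ceil((sigma / eps) ** 2))
print('std = %.4f, ~%d samples needed for +/-%g' % (sigma, n_needed, eps))

This makes the answer's observation concrete: raising the dropout rates raises sigma, and the required sample count grows quadratically with it.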
