So I tried TensorFlow's eager execution, but my implementation wasn't successful. I used tf.GradientTape, and while the program ran, there were no visible updates to any of the weights. I have seen some sample algorithms and tutorials use optimizer.apply_gradients() to update all the variables, but I assume I am not using it correctly.
import tensorflow as tf
import tensorflow.contrib.eager as tfe

# enabling eager execution
tf.enable_eager_execution()

# establishing hyperparameters
LEARNING_RATE = 20
TRAINING_ITERATIONS = 3

# establishing all LABELS
LABELS = tf.constant(tf.random_normal([3, 1]))
# print(LABELS)

# stub statement for input
init = tf.Variable(tf.random_normal([3, 1]))

# declare and initialize all weights
weight1 = tfe.Variable(tf.random_normal([2, 3]))
bias1 = tfe.Variable(tf.random_normal([2, 1]))
weight2 = tfe.Variable(tf.random_normal([3, 2]))
bias2 = tfe.Variable(tf.random_normal([3, 1]))
weight3 = tfe.Variable(tf.random_normal([2, 3]))
bias3 = tfe.Variable(tf.random_normal([2, 1]))
weight4 = tfe.Variable(tf.random_normal([3, 2]))
bias4 = tfe.Variable(tf.random_normal([3, 1]))
weight5 = tfe.Variable(tf.random_normal([3, 3]))
bias5 = tfe.Variable(tf.random_normal([3, 1]))
VARIABLES = [weight1, bias1, weight2, bias2, weight3, bias3, weight4, bias4, weight5, bias5]

def thanouseEyes(input):  # nn model aka: Thanouse's Eyes
    layerResult = tf.nn.relu(tf.matmul(weight1, input) + bias1)
    input = layerResult
    layerResult = tf.nn.relu(tf.matmul(weight2, input) + bias2)
    input = layerResult
    layerResult = tf.nn.relu(tf.matmul(weight3, input) + bias3)
    input = layerResult
    layerResult = tf.nn.relu(tf.matmul(weight4, input) + bias4)
    input = layerResult
    layerResult = tf.nn.softmax(tf.matmul(weight5, input) + bias5)
    return layerResult

# Begin training and update variables
optimizer = tf.train.AdamOptimizer(LEARNING_RATE)
with tf.GradientTape(persistent=True) as tape:  # gradient calculation
    for i in range(TRAINING_ITERATIONS):
        COST = tf.reduce_sum(LABELS - thanouseEyes(init))
        GRADIENTS = tape.gradient(COST, VARIABLES)
        optimizer.apply_gradients(zip(GRADIENTS, VARIABLES))
        print(weight1)
Your usage of optimizer seems fine, but the computation defined by thanouseEyes() will always return [1., 1., 1.] regardless of the variables, so the gradients are always 0 and the variables are therefore never updated (print(thanouseEyes(init)) and print(GRADIENTS) should demonstrate this).

Digging a bit deeper, tf.nn.softmax is applied to x = tf.matmul(weight5, input) + bias5, which has shape [3, 1]. tf.nn.softmax computes the softmax along the last axis by default. x[0], x[1], and x[2] are each vectors with a single element, so softmax(x[i]) will always be 1.0. Hope this helps.
Other points unrelated to your problem that may interest you:

As of TensorFlow 1.11, the tf.contrib.eager module is not needed in your program. Replace all occurrences of tfe with tf (i.e., tf.Variable instead of tfe.Variable) and you will get the same result.

Computation executed within a GradientTape context is "recorded", i.e., the tape keeps the intermediate tensors so that the gradients can be computed later. Long story short, you should move the GradientTape into the loop body:
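A sketch of that fix, written against the TF 2.x eager API (tf.random.normal and tf.optimizers.Adam rather than the 1.x names above); the single-layer model, the squared-error cost, and softmax over axis 0 are illustrative choices of mine, not the asker's exact network:

```python
import tensorflow as tf  # TF 2.x: eager by default, no enable_eager_execution()

LEARNING_RATE = 0.01
TRAINING_ITERATIONS = 3

LABELS = tf.random.normal([3, 1])
init = tf.random.normal([3, 1])
weight = tf.Variable(tf.random.normal([3, 3]))
bias = tf.Variable(tf.random.normal([3, 1]))
optimizer = tf.optimizers.Adam(LEARNING_RATE)

for _ in range(TRAINING_ITERATIONS):
    with tf.GradientTape() as tape:  # tape opened inside the loop body
        logits = tf.matmul(weight, init) + bias
        pred = tf.nn.softmax(logits, axis=0)  # softmax over the column, not per row
        cost = tf.reduce_sum(tf.square(LABELS - pred))  # squared error, my assumption
    grads = tape.gradient(cost, [weight, bias])
    optimizer.apply_gradients(zip(grads, [weight, bias]))
```

With the tape opened per iteration, each forward pass is recorded against the current variable values, so the gradients reflect the latest update instead of the stale first pass.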