Multiplying the output of two layers in Keras


I'm trying to implement a neural(ish) network in Keras with this design: <a href="http://nlp.cs.rpi.edu/paper/AAAI15.pdf" rel="noreferrer">http://nlp.cs.rpi.edu/paper/AAAI15.pdf</a>

The algorithm essentially has three inputs. Input 2 and Input 3 are multiplied by the same weight matrix W1 to produce O2 and O3. Input 1 is multiplied by W2 to produce O1. We then need to take the dot products of O1 with O2 and of O1 with O3.

My first idea was to use the Keras <code>Graph</code> class and make W1 a shared node layer with two inputs and two outputs. Fine so far.

The problem then is how to take the dot product of each of those two outputs with O1.

I tried to define a custom function:

<pre><code>def layer_mult(X, Y):
    # K.dot takes two tensors as separate arguments
    return K.dot(X, K.transpose(Y))
</code></pre>

Then:

<pre><code>ntm.add_node(Lambda(layer_mult, output_shape = (1,1)), name = "ls_pos", inputs = ["O1", "O2"])
ntm.add_node(Lambda(layer_mult, output_shape = (1,1)), name = "ls_neg", inputs = ["O1", "O3"])
</code></pre>

The problem that arises at compile time is that Keras only wants to give the Lambda layer one input:

<pre><code>   1045         func = types.FunctionType(func, globals())
   1046         if hasattr(self, 'previous'):
-> 1047             return func(self.previous.get_output(train))
   1048         else:
   1049             return func(self.input)

TypeError: layer_mult() takes exactly 2 arguments (1 given)
</code></pre>

I thought an alternative might be the <code>Merge</code> class, which has <code>dot</code> as an allowed merge type. However, the input layers of a <code>Merge</code> have to be passed to its constructor, so there doesn't seem to be a way to get the outputs of the shared node into a <code>Merge</code> in order to add the <code>Merge</code> to the <code>Graph</code>.

If I were using <code>Sequential</code> containers, I could feed those into the <code>Merge</code>. But then there would be no way to make the two <code>Sequential</code> layers share the same weight matrix.

I thought about concatenating O1, O2 and O3 into a single vector as the output layer and then doing the multiplication inside the objective function. But that would require the objective function to split its input, which doesn't seem to be possible in Keras (the relevant Theano functions aren't passed through to the Keras API).

Does anyone know a solution?

EDIT:

I think I've made some progress, because I found that <code>shared_node</code> implements <code>dot</code> (even though it isn't in the documentation).

So this is where I've got to:

<pre><code>ntm = Graph()
ntm.add_input(name='g', input_shape=(300,))  # Vector of 300 units, normally distributed around zero
ntm.add_node([pretrained bit], name = "lt", input = "g")  # 300 * 128, output = (,128)
n_docs = 1000
ntm.add_input("d_pos", input_shape = (n_docs,))  # (,n_docs)
ntm.add_input("d_neg", input_shape = (n_docs,))  # (,n_docs)

ntm.add_shared_node(Dense(128, activation = "softmax",
                          # weights = pretrained_W1,
                          W_constraint = unitnorm(),
                          W_regularizer = l2(0.001)
                          ), name = "ld",
                    inputs = ["d_pos", "d_neg"],
                    outputs = ["ld_pos", "ld_neg"],
                    merge_mode=None)  # n_docs * 128, output = (,128) * 2
# ActivityRegularization is being used as a passthrough -
# the function of the node is to dot* its inputs
ntm.add_shared_node(ActivityRegularization(0,0),
                    name = "ls_pos",
                    inputs = ["lt", "d_pos"],
                    merge_mode = 'dot')  # output = (,1)
ntm.add_shared_node(ActivityRegularization(0,0),
                    name = "ls_neg",
                    inputs = ["lt", "d_neg"],
                    merge_mode = 'dot')  # output = (,1)
ntm.add_shared_node(ActivityRegularization(0,0),
                    name = "summed",
                    inputs = ["ls_pos", "ls_neg"],
                    merge_mode = 'sum')  # output = (,1)
ntm.add_node(ThresholdedReLU(0.5),
             input = "summed", name = "loss")  # output = (,1)
ntm.add_output(name = "loss_out",
               input= "loss")

def obj(X, Y):
    return K.sum(Y)

ntm.compile(loss = {'loss_out' : obj}, optimizer = "sgd")
</code></pre>

And now the error is:

<pre><code>>>> ntm.compile(loss = {'loss_out' : obj}, optimizer = "sgd")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "build/bdist.macosx-10.5-x86_64/egg/keras/models.py", line 602, in compile
  File "build/bdist.macosx-10.5-x86_64/egg/keras/layers/advanced_activations.py", line 149, in get_output
  File "build/bdist.macosx-10.5-x86_64/egg/keras/layers/core.py", line 117, in get_input
  File "build/bdist.macosx-10.5-x86_64/egg/keras/layers/core.py", line 1334, in get_output
  File "build/bdist.macosx-10.5-x86_64/egg/keras/layers/core.py", line 1282, in get_output_sum
  File "build/bdist.macosx-10.5-x86_64/egg/keras/layers/core.py", line 1266, in get_output_at
  File "build/bdist.macosx-10.5-x86_64/egg/keras/layers/core.py", line 730, in get_output
  File "build/bdist.macosx-10.5-x86_64/egg/keras/layers/core.py", line 117, in get_input
  File "build/bdist.macosx-10.5-x86_64/egg/keras/layers/core.py", line 1340, in get_output
  File "build/bdist.macosx-10.5-x86_64/egg/keras/layers/core.py", line 1312, in get_output_dot
  File "/Volumes/home500/anaconda/envs/[-]/lib/python2.7/site-packages/theano/tensor/var.py", line 360, in dimshuffle
    pattern)
  File "/Volumes/home500/anaconda/envs/[-]/lib/python2.7/site-packages/theano/tensor/elemwise.py", line 164, in __init__
    (input_broadcastable, new_order))
ValueError: ('You cannot drop a non-broadcastable dimension.', ((False, False, False, False), (0, 'x')))
</code></pre>

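To pin down the computation the question is asking for, independent of any Keras API, here is a minimal plain-Python sketch. It is only an illustration under stated assumptions: the helper names (`matvec`, `dot`, `random_matrix`), the toy sizes, and the one-hot document inputs are hypothetical and not from the question, and the softmax activation, weight constraint, and regularizer are left out. It shows one matrix W1 reused for inputs 2 and 3, the two dot products with O1, and the "concatenate, then split" alternative mentioned in the question.

```python
import random


def matvec(x, W):
    """Multiply row vector x (length n) by matrix W (n rows, m columns)."""
    return [sum(x_i * row[j] for x_i, row in zip(x, W)) for j in range(len(W[0]))]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def random_matrix(rows, cols, rng):
    """Small random weight matrix as a list of rows (illustrative only)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)

# Toy sizes (assumption); the question uses 300 -> 128 and n_docs -> 128.
g_dim, n_docs, hidden = 4, 6, 3

W2 = random_matrix(g_dim, hidden, rng)   # weights applied to input 1
W1 = random_matrix(n_docs, hidden, rng)  # ONE matrix shared by inputs 2 and 3

g = [rng.gauss(0.0, 1.0) for _ in range(g_dim)]           # input 1
d_pos = [1.0 if i == 0 else 0.0 for i in range(n_docs)]   # input 2 (one-hot, assumed)
d_neg = [1.0 if i == 1 else 0.0 for i in range(n_docs)]   # input 3 (one-hot, assumed)

O1 = matvec(g, W2)
O2 = matvec(d_pos, W1)  # same W1 object ...
O3 = matvec(d_neg, W1)  # ... reused here, which is what "shared" means

ls_pos = dot(O1, O2)  # scalar score for the positive document
ls_neg = dot(O1, O3)  # scalar score for the negative document
print(ls_pos, ls_neg)

# Alternative from the question: emit one concatenated vector, split it later.
concat = O1 + O2 + O3
o1 = concat[:hidden]
o2 = concat[hidden:2 * hidden]
o3 = concat[2 * hidden:]
```

Whatever Keras construct ends up being used, its two outputs should agree with `ls_pos` and `ls_neg` here for the same weights and inputs, which makes a small reference like this handy for checking shapes.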

