Python zip iterator and Adagrad update

# perform parameter update with Adagrad
for param, dparam, mem in zip([Wxh, Whh, Why, bh, by],
                              [dWxh, dWhh, dWhy, dbh, dby],
                              [mWxh, mWhh, mWhy, mbh, mby]):
    mem += dparam * dparam
    param += -learning_rate * dparam / np.sqrt(mem + 1e-8)  # adagrad update

3 answers

User

Answer 1 · edited 2024-09-27 18:05:23

Writing a simple for loop with zip will teach you a lot.

For example:

for a, b, c in zip([1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]):
    print(a)
    print(b)
    print(c)
    print("/")

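This prints 1 4 7 / 2 5 8 / 3 6 9 /, with each value on its own line.

So zip simply puts the three lists together, and the loop uses the three variables param, dparam, mem to refer to the items of the different lists.

On each iteration, the three variables refer to the corresponding items of their respective lists, just as i does in for i in [1, 2, 3]:.

This way you only need to write one loop instead of three to update every parameter: Wxh, Whh, Why, bh, by.

On the first iteration, only Wxh is updated according to the Adagrad rule, using dWxh and mWxh. On the second, Whh is updated using dWhh and mWhh, and so on.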
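
The loop from the question can be tried on toy data. The sketch below assumes NumPy arrays with made-up shapes and values and a hypothetical learning_rate of 0.1, and uses only two of the five parameters to stay short. It relies on the fact that += on a NumPy array modifies the array in place, so the arrays stored in the lists, and therefore Wxh and bh themselves, are updated.

```python
import numpy as np

learning_rate = 0.1  # hypothetical value, for illustration only

# Toy parameters, gradients and Adagrad memory (shapes and values are made up)
Wxh, bh = np.ones((2, 2)), np.ones(2)
dWxh, dbh = np.full((2, 2), 0.5), np.full(2, 2.0)
mWxh, mbh = np.zeros_like(Wxh), np.zeros_like(bh)

for param, dparam, mem in zip([Wxh, bh], [dWxh, dbh], [mWxh, mbh]):
    mem += dparam * dparam  # in place: accumulate squared gradients
    param += -learning_rate * dparam / np.sqrt(mem + 1e-8)  # in place: Adagrad step

print(Wxh)  # every entry is now close to 0.9
print(mbh)  # every entry is now 4.0
```

Because the updates happen in place, nothing has to be assigned back to Wxh or bh after the loop.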

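User

Answer 2 · edited 2024-09-27 18:05:23

Any sequence (or iterable) can be unpacked into variables with a simple assignment. The only requirement is that the number and structure of the variables match the sequence. For example: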

t = (2, 4)
x, y = t
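
As a small sketch of the "number and structure must match" rule, the snippet below (all names and values are made up for illustration) shows a flat unpacking, a nested one, and the error raised on a mismatch.

```python
t = (2, 4)
x, y = t  # two names for two items

# Nested targets work as long as the structure matches
(a, b), c = (1, 2), 3

# Too many names for the number of items raises ValueError
error = None
try:
    p, q, r = t
except ValueError as exc:
    error = str(exc)

print(x, y, a, b, c)
print(error)
```

The for loop in the question performs this same unpacking once per iteration, on each tuple produced by zip.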

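In your case, the standard documentation describes zip() as follows: "Make an iterator that aggregates elements from each of the iterables. Returns an iterator of tuples, where the i-th tuple contains the i-th element from each of the argument sequences or iterables." So, in your case: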

for param, dparam, mem in zip([Wxh, Whh, Why, bh, by], 
                              [dWxh, dWhh, dWhy, dbh, dby], 
                              [mWxh, mWhh, mWhy, mbh, mby]):
    mem += dparam * dparam
    param += -learning_rate * dparam / np.sqrt(mem + 1e-8)

Let's say:
iterable1 = [Wxh, Whh, Why, bh, by]
iterable2 = [dWxh, dWhh, dWhy, dbh, dby]
iterable3 = [mWxh, mWhh, mWhy, mbh, mby]

Here zip(iterable1, iterable2, iterable3) yields, in order, the tuples (Wxh, dWxh, mWxh), (Whh, dWhh, mWhh), (Why, dWhy, mWhy), (bh, dbh, mbh), (by, dby, mby)

On the 1st iteration:
param, dparam, mem = (Wxh, dWxh, mWxh)
so,
param = Wxh
dparam = dWxh
mem = mWxh
mem = mem + (dparam * dparam) = mWxh + (dWxh * dWxh)
param = param + (-learning_rate * dparam / np.sqrt(mem + 1e-8)) = Wxh + (-learning_rate * dWxh / np.sqrt(mWxh + (dWxh * dWxh) + 1e-8))

On the 2nd iteration:
param, dparam, mem = (Whh, dWhh, mWhh)
so,
param = Whh
dparam = dWhh
mem = mWhh
and so on.
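
To check the arithmetic of the first iteration by hand, here is a minimal sketch that plugs in made-up scalar values (the real Wxh, dWxh and mWxh are arrays, and learning_rate = 0.1 is a hypothetical choice). Plain floats are immutable, so the results are stored under the names mem and param rather than updated in place.

```python
import math

learning_rate = 0.1  # hypothetical value
Wxh, dWxh, mWxh = 1.0, 0.5, 0.0  # made-up scalars standing in for the arrays

# mem = mWxh + (dWxh * dWxh)
mem = mWxh + dWxh * dWxh

# param = Wxh + (-learning_rate * dWxh / sqrt(mem + 1e-8))
param = Wxh + (-learning_rate * dWxh / math.sqrt(mem + 1e-8))

print(mem)    # 0.25
print(param)  # approximately 0.9
```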

User

Answer 3 · edited 2024-09-27 18:05:23

What does zip do?

Quoting the official documentation (this wording is from Python 2, where zip returns a list):

Zip returns a list of tuples, where the i-th tuple contains the i-th element from each of the argument sequences or iterables. The returned list is truncated in length to the length of the shortest argument sequence.

That means:

 >>> list(zip(["A", "B"], ["C", "D"], ["E", "F"]))
 [('A', 'C', 'E'), ('B', 'D', 'F')]
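
In Python 3, zip returns a lazy iterator rather than a list, which matters if the result is reused. The sketch below, with made-up values, shows that the iterator is exhausted after one pass and that the output is truncated to the shortest input.

```python
pairs = zip(["A", "B"], ["C", "D"], ["E", "F"])

first_pass = list(pairs)   # consumes the iterator
second_pass = list(pairs)  # nothing left to yield

# The result stops at the shortest argument
truncated = list(zip([1, 2, 3], "ab"))

print(first_pass)   # [('A', 'C', 'E'), ('B', 'D', 'F')]
print(second_pass)  # []
print(truncated)    # [(1, 'a'), (2, 'b')]
```

A for loop over zip(...) consumes the iterator exactly once, so the loop in the question behaves the same in Python 2 and Python 3.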

So now, when you loop, you are effectively iterating over a sequence of tuples, something like this:

 # These are strings here but in your case these are objects
 [('Wxh', 'dWxh', 'mWxh'), ('Whh', 'dWhh', 'mWhh'), ('Why', 'dWhy', 'mWhy'),
  ('bh', 'dbh', 'mbh'),('by', 'dby', 'mby')]

What I know so far, 5 Iterations, param == Wxh on the first iteration but not there on...

You are right. Now let's analyze your loop.

  # m is the list of tuples shown above
  for param, dparam, mem in m:
      print((param, dparam, mem))

  # Which prints
('Wxh', 'dWxh', 'mWxh')
('Whh', 'dWhh', 'mWhh')
('Why', 'dWhy', 'mWhy')
('bh', 'dbh', 'mbh')
('by', 'dby', 'mby')

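That is, on each iteration param gets the value at index 0 of the tuple, dparam gets the value at index 1, and mem gets the value at index 2.

Now, when I type param outside the scope of the for loop, I get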

   >>> param
   'by'

This means param still holds a reference to the by object.

According to the official documentation:

The for-loop makes assignments to the variable(s) in the target list. [...] Names in the target list are not deleted when the loop is finished, but if the sequence is empty, they will not have been assigned to at all by the loop.
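
Both halves of that statement can be checked with a short sketch. The names item and never_bound and the string values are made up for illustration.

```python
# After a non-empty loop, the target name keeps its last value
for item in ["Wxh", "Whh", "by"]:
    pass
print(item)  # by

# After a loop over an empty sequence, the target name was never assigned
unbound = False
for never_bound in []:
    pass
try:
    never_bound
except NameError:
    unbound = True
print(unbound)  # True
```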
