Theano: how to efficiently undo/reverse max-pooling

Posted 2024-10-02 02:42:31


I'm using Theano 0.7 to create a convolutional neural net which uses max-pooling (i.e. shrinking a matrix down by keeping only the local maxima).

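To "undo" or "reverse" the max-pooling step, one method is to store the locations of the maxima as auxiliary data, then re-create the un-pooled data by making a big array of zeros and using those auxiliary locations to place each maximum back in its proper position.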
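To make the idea concrete, here is a minimal NumPy sketch of the round trip: pool along the last axis while recording where each maximum was, then un-pool by scattering the maxima into an array of zeros. It is not the Theano code from this post. The function names `max_pool_last_axis` and `unpool_last_axis` are hypothetical, and the sketch assumes non-overlapping bins whose width divides the axis length exactly.

```python
import numpy as np

def max_pool_last_axis(x, factor):
    """Max-pool the last axis of a 3D array in non-overlapping bins.

    Assumes x.shape[2] is an exact multiple of `factor`.
    Returns the pooled values and the absolute position of each
    maximum along the original (un-pooled) last axis.
    """
    batch, filters, length = x.shape
    nbins = length // factor
    bins = x.reshape(batch, filters, nbins, factor)
    within_bin = bins.argmax(axis=-1)        # offset of the max inside its bin
    pooled = bins.max(axis=-1)
    positions = within_bin + np.arange(nbins) * factor  # shift to absolute index
    return pooled, positions

def unpool_last_axis(pooled, positions, factor):
    """Place each pooled value back at its recorded position; zeros elsewhere."""
    batch, filters, nbins = pooled.shape
    out = np.zeros((batch, filters, nbins * factor), dtype=pooled.dtype)
    for b in range(batch):
        for f in range(filters):
            out[b, f, positions[b, f]] = pooled[b, f]
    return out

rng = np.random.RandomState(0)
x = rng.rand(2, 3, 20)                       # (minibatch, filters, samples)
pooled, positions = max_pool_last_axis(x, 5)
restored = unpool_last_axis(pooled, positions, 5)
```

Pooling `restored` again gives back the same `pooled` and `positions`, which is a quick way to check that the auxiliary locations are consistent.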

Here is how I currently do it:

import numpy as np
import theano
import theano.tensor as T

minibatchsize = 2
numfilters = 3
numsamples = 4
upsampfactor = 5

# HERE is the function that I hope could be improved
def upsamplecode(encoded, auxpos):
    shp = encoded.shape
    upsampled = T.zeros((shp[0], shp[1], shp[2] * upsampfactor))
    for whichitem in range(minibatchsize):
        for whichfilt in range(numfilters):
            upsampled = T.set_subtensor(upsampled[whichitem, whichfilt, auxpos[whichitem, whichfilt, :]], encoded[whichitem, whichfilt, :])
    return upsampled


totalitems = minibatchsize * numfilters * numsamples

code = theano.shared(np.arange(totalitems).reshape((minibatchsize, numfilters, numsamples)))

auxpos = np.arange(totalitems).reshape((minibatchsize, numfilters, numsamples)) % upsampfactor  # arbitrary positions within a bin
auxpos += (np.arange(4) * 5).reshape((1,1,-1)) # shifted to the actual temporal bin location
auxpos = theano.shared(auxpos.astype(np.int))

print "code:"
print code.get_value()
print "locations:"
print auxpos.get_value()
get_upsampled = theano.function([], upsamplecode(code, auxpos))
print "the un-pooled data:"
print get_upsampled()

(By the way, in this case I have a 3D tensor, and it's only the third axis that gets max-pooled. People who work with image data might expect to see two dimensions getting max-pooled.)

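Running this prints the code array, the locations array, and the un-pooled data: an array of shape (2, 3, 20) that is zero everywhere except at the stored locations.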

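This method works, but it is a bottleneck that takes up most of the run time (I think the set_subtensor calls might imply CPU<->GPU data copying). So: can this be implemented more efficiently?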

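I suspect there is a way to express this as a single set_subtensor() call, which may be faster, but I don't see how to get the tensor indexing to broadcast properly.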


UPDATE: I thought of a way of doing it, by working on the flattened tensors:

def upsamplecode2(encoded, auxpos):
    shp = encoded.shape
    upsampled = T.zeros((shp[0], shp[1], shp[2] * upsampfactor))

    add_to_flattened_indices = theano.shared(np.array([ [[(y + z * numfilters) * numsamples * upsampfactor for x in range(numsamples)] for y in range(numfilters)] for z in range(minibatchsize)], dtype=theano.config.floatX).flatten(), name="add_to_flattened_indices")

    upsampled = T.set_subtensor(upsampled.flatten()[T.cast(auxpos.flatten() + add_to_flattened_indices, 'int32')], encoded.flatten()).reshape(upsampled.shape)

    return upsampled


get_upsampled2 = theano.function([], upsamplecode2(code, auxpos))
print "the un-pooled data v2:"
ups2 = get_upsampled2()
print ups2

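However, this is still not great efficiency-wise, because when I run this (added on to the end of the script above) I find that the Cuda libraries can't currently do the integer index manipulation efficiently: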

ERROR (theano.gof.opt): Optimization failure due to: local_gpu_advanced_incsubtensor1
ERROR (theano.gof.opt): TRACEBACK:
ERROR (theano.gof.opt): Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/opt.py", line 1493, in process_node
    replacements = lopt.transform(node)
  File "/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda/opt.py", line 952, in local_gpu_advanced_incsubtensor1
    gpu_y = gpu_from_host(y)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/op.py", line 507, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda/basic_ops.py", line 133, in make_node
    dtype=x.dtype)()])
  File "/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda/type.py", line 69, in __init__
    (self.__class__.__name__, dtype, name))
TypeError: CudaNdarrayType only supports dtype float32 for now. Tried using dtype int64 for variable None
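Separately from the GPU error, the index arithmetic behind the flattened-tensor update can be checked in plain NumPy. The sketch below reuses the sizes from the script and builds the same per-row offsets, `(y + z * numfilters) * numsamples * upsampfactor`, then compares them against `np.ravel_multi_index`. The names `row_length`, `row_offsets` and `flat_indices` are introduced here for illustration only, and this says nothing about how Theano executes the operation.

```python
import numpy as np

minibatchsize, numfilters, numsamples, upsampfactor = 2, 3, 4, 5
totalitems = minibatchsize * numfilters * numsamples

code = np.arange(totalitems).reshape((minibatchsize, numfilters, numsamples))
auxpos = code % upsampfactor                      # arbitrary positions within a bin
auxpos = auxpos + (np.arange(numsamples) * upsampfactor).reshape((1, 1, -1))

# Each (item, filter) row of the un-pooled tensor has this many entries.
row_length = numsamples * upsampfactor

# Row (z, y) starts at (z * numfilters + y) * row_length in the flattened tensor.
row_offsets = (np.arange(minibatchsize * numfilters) * row_length).reshape(
    (minibatchsize, numfilters, 1))
flat_indices = (auxpos + row_offsets).ravel()

# Cross-check the hand-built offsets against NumPy's own index flattening.
z, y, _ = np.indices(code.shape)
expected = np.ravel_multi_index(
    (z, y, auxpos), (minibatchsize, numfilters, row_length))

# One assignment on the flattened array, then reshape back to 3D.
flat = np.zeros(minibatchsize * numfilters * row_length)
flat[flat_indices] = code.ravel()
upsampled_flat = flat.reshape((minibatchsize, numfilters, row_length))
```

Because the flat indices are ordinary integers, the whole un-pooling step reduces to a single indexed assignment, which is the property the update relies on.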

1 answer
User
Answer 1 · posted 2024-10-02 02:42:31

I don't know whether this is faster, but it may be a little more concise. See if it is useful for your case.

import numpy as np
import theano
import theano.tensor as T

minibatchsize = 2
numfilters = 3
numsamples = 4
upsampfactor = 5

totalitems = minibatchsize * numfilters * numsamples

code = np.arange(totalitems).reshape((minibatchsize, numfilters, numsamples))

auxpos = np.arange(totalitems).reshape((minibatchsize, numfilters, numsamples)) % upsampfactor 
auxpos += (np.arange(4) * 5).reshape((1,1,-1))

# first in numpy
shp = code.shape
upsampled_np = np.zeros((shp[0], shp[1], shp[2] * upsampfactor))
upsampled_np[np.arange(shp[0]).reshape(-1, 1, 1), np.arange(shp[1]).reshape(1, -1, 1), auxpos] = code

print "numpy output:"
print upsampled_np

# now the same idea in theano
encoded = T.tensor3()
positions = T.tensor3(dtype='int64')
shp = encoded.shape
upsampled = T.zeros((shp[0], shp[1], shp[2] * upsampfactor))
upsampled = T.set_subtensor(upsampled[T.arange(shp[0]).reshape((-1, 1, 1)), T.arange(shp[1]).reshape((1, -1, 1)), positions], encoded)

print "theano output:"
print upsampled.eval({encoded: code, positions: auxpos})
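As a sanity check on the broadcasting trick, the following NumPy-only sketch compares three ways of doing the same scatter: the explicit double loop from the question, the broadcast index arrays used in this answer, and `np.put_along_axis` (available in NumPy 1.15 and later). The variable names `looped`, `broadcasted` and `scattered` are made up for this illustration, and the sketch makes no claim about relative speed in Theano or on a GPU.

```python
import numpy as np

minibatchsize, numfilters, numsamples, upsampfactor = 2, 3, 4, 5
totalitems = minibatchsize * numfilters * numsamples

code = np.arange(totalitems).reshape((minibatchsize, numfilters, numsamples))
auxpos = code % upsampfactor
auxpos = auxpos + (np.arange(numsamples) * upsampfactor).reshape((1, 1, -1))

out_shape = (minibatchsize, numfilters, numsamples * upsampfactor)

# 1) Explicit loop over items and filters, as in the question.
looped = np.zeros(out_shape)
for whichitem in range(minibatchsize):
    for whichfilt in range(numfilters):
        looped[whichitem, whichfilt, auxpos[whichitem, whichfilt]] = \
            code[whichitem, whichfilt]

# 2) Broadcast index arrays: shapes (2,1,1), (1,3,1) and (2,3,4)
#    broadcast together to (2,3,4), one index triple per value in `code`.
item_idx = np.arange(minibatchsize).reshape(-1, 1, 1)
filt_idx = np.arange(numfilters).reshape(1, -1, 1)
broadcasted = np.zeros(out_shape)
broadcasted[item_idx, filt_idx, auxpos] = code

# 3) put_along_axis scatters along one axis and handles the other axes itself.
scattered = np.zeros(out_shape)
np.put_along_axis(scattered, auxpos, code, axis=2)
```

All three arrays come out identical, so the single indexed assignment is a drop-in replacement for the loop as far as the result is concerned.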
