NumPy: change elements based on a threshold, then add element-wise

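import numpy as np

# LOWER, UPPER, file_name and m_ongoing are defined elsewhere in the asker's script
for i in range(LOWER, UPPER + 1):
    fname = file_name + str(i) + ".txt"
    # read the CSV file, skipping its header line
    cur_resfile = np.genfromtxt(fname, delimiter=",", skip_header=1)
    m_cur = cur_resfile
    # threshold: values <= 1 become 0, values > 1 become 1
    m_cur[m_cur <= 1] = 0
    m_cur[m_cur > 1] = 1
    # accumulate element-wise
    m_ongoing = m_ongoing + m_cur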

2 Answers

User

#1 · edited 2024-09-29 01:30:03

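As @RootTwo suggested, clip() is a good NumPy built-in for this. For performance reasons, though, you could apply vectorized operations to a 3D "stack" of your data.

Example: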

import numpy as np
#simulating your data as a list of 3247 2D matrices, each 197x10
some_data = [np.random.randint(-2,2,(197,10)) for _i in range(3247)]
#stack the matrices
X = np.dstack(some_data)
print(X.shape)

(197, 10, 3247)

Y = X.clip(0,1)
Z = Y.sum(axis=2)
#Z is now the output you want!
print(Z.shape)

(197, 10)

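Edit: added timing results and changed my answer

It turns out that my suggestion, building a depth stack and applying clip and sum once each, was ill-advised. I ran some timing tests and found that the incremental approach is faster, most likely because of the overhead of allocating the large 3D array.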
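To give a rough sense of the sizes involved, here is a small sketch that compares the memory needed by a single accumulator with that of the full stack. It assumes 8-byte integers (the default integer dtype can differ by platform), and it only illustrates the size difference; it does not prove that allocation is the cause of the slowdown.

```python
import numpy as np

# Shapes taken from the simulated data above: 3247 matrices of 197x10.
n_files, rows, cols = 3247, 197, 10

# Assumption: 8-byte integers (np.int64); the platform default may differ.
itemsize = np.dtype(np.int64).itemsize

single_bytes = rows * cols * itemsize      # one 197x10 accumulator
stacked_bytes = single_bytes * n_files     # the full (197, 10, 3247) stack

print(f"accumulator: {single_bytes / 1e3:.1f} kB")
print(f"stack:       {stacked_bytes / 1e6:.1f} MB")
```

Under that assumption the accumulator is about 15.8 kB while the stack is about 51.2 MB, and clip() on the stack produces a second array of the same size.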

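The tests are below. I factored out the data loading, since it is the same either way. These are the results of comparing the two approaches with the %timeit magic in IPython.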

import numpy as np
# some_data is simulated as in the above code sample
def f1(some_data):
    x = some_data[0]
    x = x.clip(0,1)
    for y in some_data[1:]:
        x += y.clip(0,1)
    return x

def f2(some_data):
    X = np.dstack(some_data)
    X = X.clip(0,1)
    X = X.sum(axis=2)
    return X

%timeit x1 = f1(some_data)

10 loops, best of 3: 28.1 ms per loop

%timeit x2 = f2(some_data)

10 loops, best of 3: 103 ms per loop

So doing the work incrementally, rather than as a single operation after stacking the data, is roughly 3.7 times faster.
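For anyone who wants to repeat the comparison outside IPython, the following is a minimal sketch using the standard-library timeit module. The function names incremental and stacked are my own labels for f1 and f2, the data is simulated with a seeded generator, and the timings will differ from machine to machine.

```python
import timeit
import numpy as np

# Simulated data: 3247 integer matrices of shape 197x10, values in [-2, 2).
rng = np.random.default_rng(0)
some_data = [rng.integers(-2, 2, size=(197, 10)) for _ in range(3247)]

def incremental(mats):
    # Clip each matrix and accumulate into one 197x10 array.
    total = mats[0].clip(0, 1)
    for m in mats[1:]:
        total += m.clip(0, 1)
    return total

def stacked(mats):
    # Build the full 3D stack, then clip and sum along the depth axis.
    return np.dstack(mats).clip(0, 1).sum(axis=2)

# Both approaches should give the same result.
assert np.array_equal(incremental(some_data), stacked(some_data))

for fn in (incremental, stacked):
    # Best of 3 repeats, 3 calls per repeat.
    best = min(timeit.repeat(lambda: fn(some_data), number=3, repeat=3)) / 3
    print(f"{fn.__name__}: {best * 1e3:.1f} ms per call")
```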

User
#2 · edited 2024-09-29 01:30:03

You can use numpy.clip():

for i in range(LOWER, UPPER + 1):
    fname = file_name + str(i) + ".txt"
    # read the CSV file, skipping its header line
    cur_resfile = np.genfromtxt(fname, delimiter=",", skip_header=1)
    m_ongoing += cur_resfile.clip(0, 1)

Edit, in response to a follow-up question:

m_ongoing = np.zeros((197, 10))
for i in range(LOWER, UPPER + 1):
    fname = file_name + str(i) + ".txt"
    cur_resfile = np.genfromtxt(fname, delimiter=",", skip_header=1)
    # add one to the places where cur_resfile > 1
    m_ongoing[cur_resfile > 1] += 1
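One point worth checking against your own data: clip(0, 1) and the "> 1" count agree only when no value lies in the interval (0, 1]. The sketch below uses two small hypothetical matrices in place of the files to show where they differ; the values are made up purely for illustration.

```python
import numpy as np

# Hypothetical in-memory matrices standing in for the files read by genfromtxt.
mats = [
    np.array([[0.5, 1.0], [2.0, 3.0]]),
    np.array([[1.5, 1.0], [-1.0, 4.0]]),
]

count_gt1 = np.zeros((2, 2))
clip_sum = np.zeros((2, 2))
for m in mats:
    count_gt1 += (m > 1)       # the question's rule: <= 1 -> 0, > 1 -> 1
    clip_sum += m.clip(0, 1)   # clip: < 0 -> 0, > 1 -> 1, values in [0, 1] kept

print(count_gt1)  # [[1. 0.] [1. 2.]]
print(clip_sum)   # [[1.5 2. ] [1.  2. ]]
```

The two results differ in the top row, where the inputs contain 0.5 and 1.0, so the boolean comparison is the one that reproduces the thresholding in the question.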
