TypingError: Failed in nopython mode pipeline (step: nopython frontend) Unknown attribute 'shape' of type float32

from numba import cuda, vectorize
import numpy as np

@cuda.jit(device = True)
def pixel_count(img1,img2):
    count1 = 0
    count2 = 0
    for i in range(img1.shape[0]):
        for j in range(img1.shape[1]):
            if img1[i][j] > 200:
                count1 = count1 + 1
    i = 0; j = 0;
    for i in range(img2.shape[0]):
        for j in range(img2.shape[1]):
            if img2[i][j] > 200:
                count2 = count2 + 1
    return count1, count2

@vectorize(['float32(float32,float32)'], target = 'cuda')
def cint(img1, img2):
    c1, c2 = pixel_count(img1, img2)
    res = c1-c2
    return res

A = np.random.rand(480, 640).astype(np.float32)*255
B = np.random.rand(480, 640).astype(np.float32)*255
res = cint(A,B)

File "<ipython-input-33-9169f440975d>", line 8:
def pixel_count(img1,img2):
    <source elided>
    count2 = 0
    for i in range(img1.shape[0]):
    ^

During: typing of get attribute at <ipython-input-33-9169f440975d> (8)

File "<ipython-input-33-9169f440975d>", line 8:
def pixel_count(img1,img2):
    <source elided>
    count2 = 0
    for i in range(img1.shape[0]):
    ^

@guvectorize(['(float32[:],float32[:], float32)'], '(), () -> ()',target = 'cuda')
def cint(img1, img2, res):
    c1, c2 = pixel_count(img1, img2)
    res = c1-c2

A = np.random.rand(480, 640).astype(np.float32)*255
B = np.random.rand(480, 640).astype(np.float32)*255
res = cint(A, B)

from numba import cuda, vectorize, guvectorize
import numpy as np

@cuda.jit(device = True)
def pixel_count(img1,img2):
    count1 = 0
    count2 = 0
    for i in range(img1.shape[0]):
        for j in range(img1.shape[1]):
            if img1[i][j] > 200:
                count1 = count1 + 1
    i = 0; j = 0;
    for i in range(img2.shape[0]):
        for j in range(img2.shape[1]):
            if img2[i][j] > 200:
                count2 = count2 + 1
    return count1, count2

@guvectorize(['(float32[:,:],float32[:,:], int16)'], '(n,m), (n,m)-> ()', target = 'cuda')
def cint(img1, img2, res):
    count1, count2 = pixel_count(img1, img2)
    res = count1 - count2

A = np.random.rand(480, 640).astype(np.float32)*255
B = np.random.rand(480, 640).astype(np.float32)*255
res1 = cint(A, B)

1 answer

User

#1 · Posted on 2024-09-29 01:37:54

This does not use CUDA, but it may give you some ideas:

Pure NumPy (vectorized):

import numpy as np
import numba as nb

A = np.random.rand(480, 640).astype(np.float32) * 255
B = np.random.rand(480, 640).astype(np.float32) * 255

%timeit (A > 200).sum() - (B > 200).sum()
478 µs ± 4.06 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
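
To make the one-liner above concrete: the comparison yields a boolean mask, and summing the mask counts the True entries. A small sketch on a hand-written array (the values and variable names are illustrative only):

```python
import numpy as np

img = np.array([[10, 250, 201],
                [200, 0, 255]], dtype=np.float32)

mask = img > 200                   # boolean array, True where the pixel exceeds 200
count = int(mask.sum())            # each True contributes 1 to the sum
alt = int(np.count_nonzero(mask))  # equivalent way to count the True entries
```

The pixel equal to 200 is not counted because the comparison is strict.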

Simply wrap the NumPy operation in a JIT-compiled function:

@nb.njit
def pixel_count_jit(img):
    return (img > 200).sum()

%timeit pixel_count_jit(A) - pixel_count_jit(B)
165 µs ± 13.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Parallelize over rows with Numba:

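@nb.njit(parallel=True)
def pixel_count_parallel(img):
    # One slot per row: counts is indexed by the row number i below,
    # so it must have img.shape[0] entries (not img.shape[1]).
    counts = np.empty(img.shape[0], dtype=nb.uint32)
    for i in nb.prange(img.shape[0]):
        counts[i] = (img[i] > 200).sum()
    return counts.sum()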

%timeit pixel_count_parallel(A) - pixel_count_parallel(B)
28.5 µs ± 571 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
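
The row-wise decomposition can be checked in plain NumPy: counting per row produces one value per row, that is img.shape[0] values, and their sum equals the whole-image count. A minimal sketch, with illustrative variable names and a fixed seed assumed only for reproducibility:

```python
import numpy as np

rng = np.random.default_rng(0)
img = (rng.random((480, 640)) * 255).astype(np.float32)

row_counts = (img > 200).sum(axis=1)  # one count per row
total = int(row_counts.sum())         # combine the per-row counts
reference = int((img > 200).sum())    # whole-image count for comparison
```

Comparing total against reference is a cheap way to validate a parallel implementation before timing it.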
