Why is writing to a C array in Cython so slow?

from libc.stdlib cimport malloc, free
cimport cython

cdef class Differential:

    cdef float *SX
    cdef float *X
    cdef int nmax

    def __init__(self, int nmax):
        self.nmax = nmax  ## usually around 10*1000
        return

    def __cinit__(self, int nmax, *arg, **args):
        self.SX = <float *>malloc(nmax*cython.sizeof(float))
        ## assume self.X has some content.
        self.X = <float *>malloc(nmax*cython.sizeof(float))
        return

    def __dealloc__(self):
        free(self.SX)
        free(self.X)
        return

    @cython.wraparound(False)
    @cython.boundscheck(False)
    @cython.nonecheck(False)
    @cython.cdivision(True)
    cdef void __reject(self, float step) nogil:

        cdef unsigned int v
        cdef unsigned int k
        cdef double x
        cdef double dx

        float_array_init(self.SX,1000,0.)  ## writes 0. to the 100000 first elements

        for v in range(1000):
            x = self.X[v]
            for k in range(v+1,1000):
                dx = x-self.X[k]
                # the following line is the "problem":
                self.SX[k] -= dx

        ## some more code
        # manipulate SX some more. this section has less performance impact because it
        # is not a double for-loop, so i have not included it in the example

        # update X
        for v in range(1000):
            self.X[v] += self.SX[v]

    def reject(self, float step):
        self.__reject(step)

1 answer

User
#1 · Posted on 2024-09-30 22:19:33

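I found an issue that could explain the slowness: note that you declare x and dx as double, but they receive float values. After changing the declarations to:
cdef float x
cdef float dx
I got a 2x speedup, because this avoids converting the float value to double in x = self.X[v] and then converting the double back to float in self.SX[k] -= dx.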
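The conversions described above can be made visible in plain C, which is roughly what the Cython loop compiles down to. The following is only a sketch: the function names `accumulate_double_tmp` and `accumulate_float_tmp` are hypothetical, and it makes no claim about the speedup on any particular machine. It only shows where the widening and narrowing happen.

```c
#include <stddef.h>

/* Variant from the question: double temporaries over float arrays.
   Hypothetical name, for illustration only. */
void accumulate_double_tmp(const float *X, float *SX, size_t n) {
    for (size_t v = 0; v < n; v++) {
        double x = X[v];                  /* float widened to double */
        for (size_t k = v + 1; k < n; k++) {
            double dx = x - X[k];         /* X[k] widened, subtraction in double */
            SX[k] -= dx;                  /* SX[k] widened, result narrowed to float */
        }
    }
}

/* Variant from the answer: float temporaries, so no conversion is
   needed between the arrays and the temporaries. */
void accumulate_float_tmp(const float *X, float *SX, size_t n) {
    for (size_t v = 0; v < n; v++) {
        float x = X[v];
        for (size_t k = v + 1; k < n; k++) {
            float dx = x - X[k];
            SX[k] -= dx;
        }
    }
}
```

Note that the two variants round at different points, so for general inputs the results can differ in the last bits; for values that are exactly representable as float they agree.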
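Your approach does not seem to be suffering from cache misses. I tested storing the values of self.X and self.SX in a single array, with access controlled through 2*i+0 or 2*i+1 (0 for self.X and 1 for self.SX), and the timing was the same.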
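For reference, the interleaved layout described above might look like this in C. This is a sketch under assumptions, not the code that was actually timed: the function name `accumulate_interleaved` and the buffer name `XS` are hypothetical, and float temporaries are used throughout.

```c
#include <stddef.h>

/* Single buffer XS of length 2*n:
   XS[2*i + 0] holds X[i], XS[2*i + 1] holds SX[i].
   Hypothetical name and layout, for illustration only. */
void accumulate_interleaved(float *XS, size_t n) {
    for (size_t v = 0; v < n; v++) {
        float x = XS[2 * v + 0];
        for (size_t k = v + 1; k < n; k++) {
            float dx = x - XS[2 * k + 0];   /* read the X slot */
            XS[2 * k + 1] -= dx;            /* update the neighbouring SX slot */
        }
    }
}
```

With this layout each X value and its SX value sit next to each other in memory, so if cache misses were the bottleneck one would expect the timing to change.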
