Most efficient way to build a 1-D array/list/vector of unknown length using Cython? Or should this never be done?

from libc.math cimport log

def main(some args):
    cdef (some vars)
    cdef list OutputList = []
    # NB: all vars have declared types
    for x in range(t):
        (do some Cythonic stuff, some of which uses my cimport-ed log)
        if condition is True:
            OutputList.append(x)  # this is the only 'yellow' line in my main loop.
    return OutputList  # return Python object to Python script that calls main()

3 Answers

User

Answer 1 · edited 2024-10-05 12:13:54

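Appending to Python lists is a well-optimized operation in CPython. Python does not allocate memory for each element; it incrementally grows an array of pointers to the objects in the list. So just switching to Cython doesn't help you much here.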
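As a rough illustration of that over-allocation, the following pure-Python sketch counts how often a list's reported size changes during a run of appends. It assumes that `sys.getsizeof` reflects the allocated pointer array, which is a CPython implementation detail, and the helper name `count_resizes` is made up for this example.

```python
import sys

def count_resizes(n):
    """Count how many times the list's reported size changes over n appends."""
    items = []
    last_size = sys.getsizeof(items)
    resizes = 0
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:      # the pointer array was reallocated
            resizes += 1
            last_size = size
    return resizes

resizes = count_resizes(10_000)
print(resizes)
```

On CPython the printed count should be far smaller than the number of appends, which is why `append` is cheap on average.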

You can use C++ containers in Cython like this:

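# distutils: language = c++
from libc.math cimport log
from libcpp.list cimport list as cpplist

def main(int t):
    cdef int x, i                  # typed loop indices keep the loops in C
    cdef cpplist[int] temp         # C++ std::list, grows without Python objects

    for x in range(t):
        if x > 0:
            temp.push_back(x)

    # copy the collected values into a pre-sized Python list
    cdef int N = temp.size()
    cdef list OutputList = N*[0]

    for i in range(N):
        OutputList[i] = temp.front()
        temp.pop_front()

    return OutputList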

You have to test whether this speeds things up, but you may not gain much.

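Another way is to use numpy arrays. Here Cython is very good at optimizing code. So if you can live with a numpy array as the return value of main, you should consider that, and replace the construction and filling of OutputList with some Cython code that allocates and fills a numpy array.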
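One way to allocate and fill a fixed-size buffer when the final length is unknown is to size it for the worst case and trim afterwards. The sketch below is plain Python and uses the standard-library `array` module as a stand-in for a numpy array so that it runs without extra packages; the function name `collect` and the `i % 10 == 0` condition are placeholders, not part of the original answer.

```python
from array import array
from math import log

def collect(t):
    """Keep log(i + 1) for every i in range(t) that passes the test."""
    # At most t values can pass, so t slots is a safe upper bound.
    out = array("d", [0.0]) * t
    n = 0
    for i in range(t):
        if i % 10 == 0:            # placeholder condition
            out[n] = log(i + 1)
            n += 1
    return out[:n]                 # slicing copies only the filled prefix

result = collect(100)
print(len(result))
```

The same shape carries over to a typed numpy buffer in Cython: allocate `t` elements, keep a running count, and return the first `n`.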

For details see http://docs.cython.org/src/tutorial/numpy.html

Just ask if you need help.

Update: the code should be a bit faster if you avoid the method lookups in both loops:

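# distutils: language = c++
from libc.math cimport log
from libcpp.list cimport list as cpplist

def main(int t):

    cdef cpplist[int] temp

    # NOTE: binding a C++ member function to a Python name may be rejected
    # by the Cython compiler; check that this builds before relying on it.
    push_back = temp.push_back
    for x in range(t):
        if x> 0:
            push_back(x)

    cdef int N = temp.size()
    cdef list OutputList = N*[0]

    # bind the methods themselves (no parentheses) instead of calling them
    front = temp.front
    pop_front = temp.pop_front
    for i in range(N):
        OutputList[i] = front()
        pop_front()

    return OutputList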
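Avoiding repeated method lookups is a common idiom for methods of Python objects such as `list.append`. A minimal pure-Python sketch of the idea follows; the function names are made up for this example, and any speed difference depends on the Python version, so it is worth measuring with `timeit` rather than assuming a gain.

```python
def collect_plain(t):
    out = []
    for x in range(t):
        if x > 0:
            out.append(x)          # attribute lookup on every iteration
    return out

def collect_cached(t):
    out = []
    append = out.append            # look the bound method up once
    for x in range(t):
        if x > 0:
            append(x)
    return out

same = collect_plain(1000) == collect_cached(1000)
print(same)
```

Both functions build the same list; only the way `append` is reached inside the loop differs.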

User

Answer 2 · edited 2024-10-05 12:13:54

What you can do is count how many elements match your criteria, and then allocate a numpy array just large enough for those elements.

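# pseudo code: criteria(i) and value(i) stand for your own test and result
import numpy as np
cimport numpy as cnp

def main(int t):
    cdef int i, idx = 0, count = 0
    # first pass: count the matches
    for i in range(t):
        if criteria(i):
            count += 1

    # allocate exactly `count` slots
    cdef cnp.ndarray[cnp.int64_t, ndim=1] result = np.empty(count, dtype=np.int64)

    # second pass: store first, then advance the index
    for i in range(t):
        if criteria(i):
            result[idx] = value(i)
            idx += 1
    return result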
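The count-then-allocate idea can be tried out in plain Python before moving it to Cython. This sketch uses the standard-library `array` module as a stand-in for a numpy array, and `matches` is a hypothetical criterion chosen only for illustration. It writes each value before advancing the index, so the first slot is filled and the last write stays in bounds.

```python
from array import array

def matches(i):
    """Hypothetical criterion: keep multiples of 3."""
    return i % 3 == 0

def two_pass(t):
    # Pass 1: count the matches so the exact size is known.
    count = sum(1 for i in range(t) if matches(i))
    result = array("q", [0]) * count   # exactly `count` 64-bit ints
    # Pass 2: fill, writing first and then advancing the index.
    idx = 0
    for i in range(t):
        if matches(i):
            result[idx] = i
            idx += 1
    return result

filled = two_pass(10)
print(list(filled))
```

The trade-off is that the criterion is evaluated twice, so this pays off mainly when the test is cheap compared with growing a container.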

User

Answer 3 · edited 2024-10-05 12:13:54

build1darray.pyx:

specify types for index variables
turn off safety checks
can use multiple CPUs (useful for large t and v.size())

#cython: boundscheck=False, wraparound=False
# distutils: language = c++
from libc.math cimport log

from cython.parallel cimport prange

import numpy as pynp
cimport numpy as np

# copy declarations from libcpp.vector to allow nogil
cdef extern from "<vector>" namespace "std":
    cdef cppclass vector[T]:
        void push_back(T&) nogil
        size_t size() nogil
        T& operator[](size_t) nogil

def makearray(int t):
    cdef vector[np.float64_t] v
    cdef int i
    # first loop: collect the values without holding the GIL
    with nogil:
        for i in range(t):
            if i % 10 == 0:
                v.push_back(log(i+1))

    # pynp.float was removed from recent NumPy releases; float64 is the same type
    cdef np.ndarray[np.float64_t] a = pynp.empty(v.size(), dtype=pynp.float64)
    # second loop: copy into the array, in parallel when built with OpenMP
    for i in prange(a.shape[0], nogil=True):
        a[i] = v[i]
    return a

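The second part takes ~1% of the time of the first loop, so it doesn't make sense to optimize it for speed in this case.

<math.h> has extern "C" { ... } on my system, so libc.math.log works.

PyArray_SimpleNewFromData() could be used to avoid copying the data, at the price of managing the memory for the array yourself.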
