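I have a piece of code that was previously parallelised in python using multiprocess, and it worked fine (albeit slowly and with heavy memory use). I decided to try converting it to cython. I am fairly new to the gil.

The code depends on an external C library, https://github.com/astro-informatics/ssht/ (compilation instructions are in the README), which uses fftw under the hood. This library has its own cython file, which calls the same c function I am using (ssht_core_mw_inverse_sov_sym_ss). The function that is very similar to mine (in the cython file in that repo) is as follows: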
def ssht_inverse_mwss_complex(
    np.ndarray[ double complex, ndim=1, mode="c"] f_lm not None,
    int L,
    int spin
):
    cdef ssht_dl_method_t dl_method = SSHT_DL_RISBO
    f_mwss_c = np.empty([L+1,2*L,], dtype=complex)
    ssht_core_mw_inverse_sov_sym_ss(
        <double complex*> np.PyArray_DATA(f_mwss_c),
        <const double complex*> np.PyArray_DATA(f_lm),
        L,
        spin,
        dl_method,
        0
    )
    return f_mwss_c
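I essentially have to recreate it locally, because I need it without the gil.

When I run my script using the cython module I get a segmentation fault, but the error is slightly different each time. Either it reports no error at all, or it points to pointer memory allocation: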
malloc: *** error for object 0x7f83de8b58e0: pointer being freed was not allocated
python(21855,0x70000d333000) malloc: Double free of object 0x7f83de8b58e0
python(21855,0x70000d536000) malloc: *** set a breakpoint in malloc_error_break to debug
Or it seems to be specific to FFTW:
fftw: /Users/runner/.conan/data/fftw/3.3.8/_/_/build/55f3919d9a41efc78a625ee65e5d1ea60d02b2ff/source_subfolder/kernel/planner.c:261: assertion failed: SLVNDX(slot) == slvndx
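Looking around, I found this kind of issue: https://github.com/bytedeco/javacpp-presets/issues/435, so I am hoping that fftw does not mean that what I am trying to do is impossible (and I am even less proficient in c).

I have tried using free from libc.stdlib, but that did not work. I also tried creating the arrays with cython.view arrays, but struggled to make them double complex (which is what the ssht library requires). I tried to get cython debugging working, but had trouble running it on my mac. I have also spent two days banging my head against the wall.

I compile the extension in the usual way: python setup.py build_ext --inplace. I am using python 3.8.5 and Cython==0.29.21. I am running macOS 11.0.1.

My cython file: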
import numpy as np
from libc.stdio cimport printf
from libc.stdlib cimport calloc, malloc
from cython.parallel import parallel, prange
from openmp cimport omp_get_thread_num

# needed to recreate without importing (for nogil)
cdef extern from "ssht/ssht.h" nogil:
    ctypedef enum ssht_dl_method_t:
        SSHT_DL_RISBO, SSHT_DL_TRAPANI
    void ssht_core_mw_inverse_sov_sym_ss(
        double complex *f,
        const double complex *flm,
        int L,
        int spin,
        ssht_dl_method_t dl_method,
        int verbosity
    )

def my_cython_module(int L, int threads):
    """
    dummy function more to show that parallel loops fails
    """
    cdef int ell, tid
    with nogil, parallel(num_threads=threads):
        tid = omp_get_thread_num()
        for ell in prange(L * L, schedule="guided"):
            printf("ell: %i\n", ell)
            _ssht_inverse(L, ell)

cdef double complex * _ssht_inverse(int L, int ind) nogil:
    """
    function creates a 1D complex array flm with zeros and a 1
    then calls c function to get 2D complex array f
    not returning anything as it's just for demonstration
    """
    cdef ssht_dl_method_t dl_method = SSHT_DL_RISBO
    cdef double complex *flm = NULL
    cdef double complex *f = NULL
    flm = <double complex *> calloc(L * L, sizeof(double complex))
    flm[ind] = 1
    f = <double complex *> malloc((L + 1) * (2 * L) * sizeof(double complex))
    ssht_core_mw_inverse_sov_sym_ss(f, flm, L, 0, dl_method, 0)
    return f
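As a sanity check on the calloc and malloc sizes above, the same buffer sizes can be worked out in plain numpy. This is only a sketch: the helper names mwss_buffer_shapes and unit_flm are hypothetical, and it assumes that a C double complex occupies 16 bytes, the same as numpy's complex128.

```python
import numpy as np


def mwss_buffer_shapes(L):
    """Return (flm_length, f_shape, f_bytes) for band-limit L.

    Mirrors the sizes used in the cython code: L * L input coefficients
    and an (L + 1) x (2 * L) output array of complex doubles.
    """
    flm_length = L * L
    f_shape = (L + 1, 2 * L)
    f_bytes = f_shape[0] * f_shape[1] * np.dtype(np.complex128).itemsize
    return flm_length, f_shape, f_bytes


def unit_flm(L, ind):
    """Build the all-zero coefficient vector with a single 1 at index `ind`."""
    flm = np.zeros(L * L, dtype=np.complex128)
    flm[ind] = 1
    return flm


if __name__ == "__main__":
    print(mwss_buffer_shapes(4))
    print(unit_flm(4, 3))
```

Every loop iteration in the cython version allocates two buffers of these sizes, so anything allocated with calloc or malloc and not freed accumulates over the L * L iterations.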
My setup.py:
import os
from Cython.Build import cythonize
from setuptools import Extension, setup

# running on mac so need GCC instead of clang
os.environ["CC"] = "gcc-10"

setup(
    ext_modules=cythonize(
        Extension(
            "test",
            ["*.pyx"],
            extra_compile_args=["-fopenmp"],
            extra_link_args=["-fopenmp"],
            include_dirs=["/usr/local/include"],
        ),
        annotate=True,
        language_level=3,
        compiler_directives=dict(boundscheck=False, embedsignature=True),
    ),
)
The following works (pip install pyssht) and runs successfully in parallel, so the problem seems to lie with the c/cython:
# the cython wrapper from the external library
from pyssht import ssht_inverse_mwss_complex
import numpy as np
from multiprocess import Pool

def my_python_implementation(L, threads):
    """
    the python equivalent in parallel
    """
    def func(chunk):
        """
        deals with each chunk
        """
        for ell in chunk:
            print(f"ell: {ell}")
            # np.complex128 is the same type as np.complex_, which NumPy 2.0 removed
            flm = np.zeros(L * L, dtype=np.complex128)
            flm[ell] = 1
            ssht_inverse_mwss_complex(flm, L, 0)

    chunks = np.array_split(np.arange(L * L), threads)
    with Pool(processes=threads) as p:
        p.map(func, chunks)
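To make the chunking in the working version concrete, here is a small sketch of how np.array_split divides the L * L indices among the workers. The helper name split_indices is hypothetical and only for illustration; it needs nothing beyond numpy.

```python
import numpy as np


def split_indices(L, workers):
    """Split the flat indices 0 .. L*L - 1 into `workers` nearly equal chunks."""
    return np.array_split(np.arange(L * L), workers)


if __name__ == "__main__":
    # Print the indices that each worker would receive.
    for chunk in split_indices(3, 4):
        print(chunk.tolist())
```

Each chunk is handled by a separate process in the Pool, so no two workers share library state.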
Given that I am able to run it in parallel in python, I am really hoping this can be made to work.
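As @DavidW pointed out, I was running into these problems because FFTW cannot be run in a multithreaded way like this (although it works in python with multiprocessing, where each worker is a separate process). The problem lies with the external code I am using, which depends on FFTW. I have raised an issue to see whether the FFTW part can be forced to be single-threaded: https://github.com/astro-informatics/ssht/issues/44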
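One way to picture the "force it to be single-threaded" idea is to serialise only the step that is not thread-safe and leave the rest concurrent. The sketch below is plain Python with threads, not the ssht or FFTW API: inverse_transform, _planner_lock and run_threaded are hypothetical names, and numpy's FFT is only a placeholder for the real transform.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

import numpy as np

# Hypothetical lock guarding the part of a routine that is not thread-safe.
_planner_lock = threading.Lock()


def inverse_transform(flm):
    """Stand-in for the C call: the 'planning' step is guarded by a lock."""
    with _planner_lock:
        # Only one thread at a time may execute this section.
        plan = len(flm)  # placeholder for creating a plan
    # The transform itself runs outside the lock (placeholder: numpy's FFT).
    return np.fft.ifft(flm, n=plan)


def run_threaded(L, threads):
    """Apply inverse_transform to each unit vector using a thread pool."""
    def task(ell):
        flm = np.zeros(L * L, dtype=np.complex128)
        flm[ell] = 1
        return inverse_transform(flm)

    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(task, range(L * L)))


if __name__ == "__main__":
    results = run_threaded(3, 4)
    print(len(results))
```

Whether the same pattern can be applied inside ssht depends on how that library creates its FFTW plans, which is what the linked issue asks about.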