Memory leak when creating a numpy array from a C array

Posted 2024-10-03 09:20:50


I have traced a memory leak in my program to a Python module I wrote in C to efficiently parse arrays given in ASCII-hex form (e.g. "FF 39 00 FC…").

char* buf;
unsigned short bytesPerTable;
if (!PyArg_ParseTuple(args, "sH", &buf, &bytesPerTable))
{
    return NULL;
}

unsigned short rowSize = bytesPerTable;
char* CArray = malloc(rowSize * sizeof(char));

// Populate CArray with data parsed from buf
ascii_buf_to_table(buf, bytesPerTable, rowSize, CArray);

int dims[1] = {rowSize};

PyObject* pythonArray = PyArray_SimpleNewFromData(1, (npy_intp*)dims, NPY_INT8, (void*)CArray);
return Py_BuildValue("(O)", pythonArray);

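I realized that numpy does not know it should free the memory allocated for CArray, which causes a memory leak. After some research into the problem, and following a suggestion in the comments on this article, I added the following line, which should tell the array that it "owns" its data and should free it when the array is deleted.

[code line not preserved in this copy of the question]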

But I still have a memory leak. What am I doing wrong? How do I get the NPY_ARRAY_OWNDATA flag to work properly?

For reference, the documentation in ndarraytypes.h suggests that this should work:

/*
 * If set, the array owns the data: it will be free'd when the array
 * is deleted.
 *
 * This flag may be tested for in PyArray_FLAGS(arr).
 */
#define NPY_ARRAY_OWNDATA         0x0004

The following code (which calls the Python function defined in C) demonstrates the memory leak.

tableData = "FF 39 00 FC FD 37 FF FF F9 38 FE FF F1 39 FE FC \n" \
            "EF 38 FF FE 47 40 00 FB 3D 3B 00 FE 41 3D 00 FE \n" \
            "43 3E 00 FF 42 3C FE 02 3C 40 FD 02 31 40 FE FF \n" \
            "2E 3E FF FE 24 3D FF FE 15 3E 00 FC 0D 3C 01 FA \n" \
            "02 3E 01 FE 01 3E 00 FF F7 3F FF FB F4 3F FF FB \n" \
            "F1 3D FE 00 F4 3D FE 00 F9 3E FE FC FE 3E FD FE \n" \
            "F6 3E FE 02 03 3E 00 FE 04 3E 00 FC 0B 3D 00 FD \n" \
            "09 3A 00 01 03 3D 00 FD FB 3B FE FB FD 3E FD FF \n"

for i in xrange(1000000):
    PES = ParseTable(tableData, 128, 4)  # Causes memory usage to skyrocket
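
One way to put a number on "memory usage skyrockets" is to compare the process's peak resident set size before and after such a loop. The following is only a sketch: it assumes Python 3 on a Unix-like system (the `resource` module is not available on Windows), and `peak_rss_growth` and `parse_stub` are hypothetical names, with `parse_stub` standing in for the real `ParseTable` call.

```python
import resource


def peak_rss_growth(func, iterations):
    """Return how much the peak resident set size grew while calling func().

    The unit is platform-dependent (kilobytes on Linux, bytes on macOS).
    """
    before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    for _ in range(iterations):
        func()
    after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return after - before


def parse_stub():
    # Hypothetical stand-in for ParseTable(tableData, 128, 4): it creates
    # a short-lived 128-byte buffer that is released as soon as it is dropped.
    return bytearray(128)


if __name__ == "__main__":
    print(peak_rss_growth(parse_stub, 100000))
```

With a leak-free function the reported growth should stay small however many iterations are run, whereas a function that leaks a buffer per call should show growth roughly proportional to the iteration count. Running the same measurement before and after a fix gives a quick check that the leak is gone.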

1 answer
User
#1 · Posted 2024-10-03 09:20:50

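This is probably a reference-counting problem: the code returns the new array with Py_BuildValue("(O)", pythonArray), where "(N)" would avoid the extra reference. From How to extend NumPy: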

One common source of reference-count errors is the Py_BuildValue function. Pay careful attention to the difference between the ‘N’ format character and the ‘O’ format character. If you create a new object in your subroutine (such as an output array), and you are passing it back in a tuple of return values, then you should most-likely use the ‘N’ format character in Py_BuildValue. The ‘O’ character will increase the reference count by one. This will leave the caller with two reference counts for a brand-new array. When the variable is deleted and the reference count decremented by one, there will still be that extra reference count, and the array will never be deallocated. You will have a reference-counting induced memory leak. Using the ‘N’ character will avoid this situation as it will return to the caller an object (inside the tuple) with a single reference count.
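
The effect of that one extra reference can be reproduced from pure Python, without building an extension. This is only an illustrative sketch: it assumes CPython, uses `ctypes.pythonapi.Py_IncRef` to mimic the extra reference that ‘O’ adds to a brand-new object, and the names `Table`, `make_table` and `survives_deletion` are hypothetical.

```python
import ctypes
import gc
import weakref

# Py_IncRef is the function form of Py_INCREF, reachable through ctypes.
_incref = ctypes.pythonapi.Py_IncRef
_incref.argtypes = [ctypes.py_object]
_incref.restype = None


class Table:
    """Hypothetical stand-in for the array built by the extension."""


def make_table(extra_reference):
    table = Table()
    if extra_reference:
        # Mimics 'O' applied to a new object: one reference that
        # nobody will ever release.
        _incref(table)
    return table


def survives_deletion(extra_reference):
    table = make_table(extra_reference)
    probe = weakref.ref(table)  # lets us observe deallocation
    del table                   # the caller drops its only reference
    gc.collect()
    return probe() is not None


if __name__ == "__main__":
    print(survives_deletion(False), survives_deletion(True))
```

The object with the extra reference is still alive after the caller deletes its variable, which is the situation the quoted passage describes: each call leaves behind an object that can never be deallocated, along with any data buffer it owns.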
