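I am trying to write a pycuda wrapper, inspired by the scikits cuda library, for some of the operations provided in Nvidia's new cuSolver library. I want to solve a linear system of the form AX=B by LU factorization. First I use the cublasSgetrfBatched method from scikits cuda, which gives me the LU factorization; then, with that factorization, I want to solve the system using cusolverDnSgetrs from cuSolver. When I run the computation it returns status 3 and the matrix that should hold my answer is unchanged, yet *devInfo is zero. The cuSolver documentation describes this status as:
CUSOLVER_STATUS_INVALID_VALUE = An unsupported value or parameter was passed to the function (a negative vector size, for example).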
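import ctypes
import numpy as np
import pycuda.gpuarray as gpuarray

# libcusolver (the loaded cuSolver library) and _types are defined elsewhere in the wrapper
libcusolver.cusolverDnSgetrs.restype = int
libcusolver.cusolverDnSgetrs.argtypes = [_types.handle,    # handle
                                         ctypes.c_char,    # trans
                                         ctypes.c_int,     # n
                                         ctypes.c_int,     # nrhs
                                         ctypes.c_void_p,  # A
                                         ctypes.c_int,     # lda
                                         ctypes.c_void_p,  # devIpiv
                                         ctypes.c_void_p,  # B
                                         ctypes.c_int,     # ldb
                                         ctypes.c_void_p]  # devInfo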
"""
handle is the handle pointer returned by calling cusolverDnCreate() from cuSolver
LU is the LU-factorized matrix returned by cublasSgetrfBatched() from scikits cuda
P is the pivot array returned by cublasSgetrfBatched()
B is the right-hand-side matrix from AX=B
"""
def cusolverSolveLU(handle, LU, P, B):
    rows_LU, cols_LU = LU.shape
    rows_B, cols_B = B.shape
    # Copy the right-hand side to the GPU in single precision
    B_gpu = gpuarray.to_gpu(B.astype('float32'))
    # Device-side integer that receives *devInfo
    info_gpu = gpuarray.zeros(1, np.int32)
    status = libcusolver.cusolverDnSgetrs(
        handle, 'n', rows_LU, cols_B,
        int(LU.gpudata), cols_LU,
        int(P.gpudata), int(B_gpu.gpudata),
        cols_B, int(info_gpu.gpudata))
    print(info_gpu)
    print(status)
handle = cusolverCreate()  # initialize cuSolver and get its handle
LU, P = cublasLUFactorization(...)
B = np.asarray(np.random.rand(3, 3), np.float32)
cusolverSolveLU(handle, LU, P, B)
Output:
[0]
3
What am I doing wrong?
Here is a complete working example of how to use the library; the results are verified against numpy's built-in solver:
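As a minimal sketch of the binding side only (not a full GPU program, and not necessarily the code this answer originally showed): in the CUDA headers the trans parameter of cusolverDnSgetrs is a cublasOperation_t enum (CUBLAS_OP_N = 0, CUBLAS_OP_T = 1, CUBLAS_OP_C = 2), so ctypes should pass it as c_int. A c_char holding 'n' arrives as 110, which is most likely the unsupported value that status 3 is reporting. The helper names bind_getrs and getrs_args are hypothetical, the handle is assumed to be an opaque pointer, and the device buffers are assumed to hold column-major data, so both leading dimensions equal n.

```python
import ctypes

# cublasOperation_t is a C enum, so it crosses the ctypes boundary as an int.
CUBLAS_OP_N = 0  # no transpose
CUBLAS_OP_T = 1  # transpose
CUBLAS_OP_C = 2  # conjugate transpose

CUSOLVER_STATUS_SUCCESS = 0
CUSOLVER_STATUS_INVALID_VALUE = 3  # the status reported in the question

# Argument types for cusolverDnSgetrs, in the order of the C prototype.
GETRS_ARGTYPES = [
    ctypes.c_void_p,  # handle (assumed: opaque pointer)
    ctypes.c_int,     # trans (enum value, not a character)
    ctypes.c_int,     # n
    ctypes.c_int,     # nrhs
    ctypes.c_void_p,  # A (device pointer to the LU factors)
    ctypes.c_int,     # lda
    ctypes.c_void_p,  # devIpiv (device pointer to the pivots)
    ctypes.c_void_p,  # B (device pointer, overwritten with X)
    ctypes.c_int,     # ldb
    ctypes.c_void_p,  # devInfo (device pointer to one int)
]


def bind_getrs(libcusolver):
    """Hypothetical helper: set the prototype on an already loaded library."""
    fn = libcusolver.cusolverDnSgetrs
    fn.restype = ctypes.c_int
    fn.argtypes = GETRS_ARGTYPES
    return fn


def getrs_args(handle, lu_ptr, lu_shape, piv_ptr, b_ptr, b_shape, info_ptr):
    """Hypothetical helper: build the argument tuple for cusolverDnSgetrs.

    Assumes column-major device buffers: LU is n x n and B is n x nrhs,
    so the leading dimension of each is its number of rows, n.
    """
    n, lu_cols = lu_shape
    if n != lu_cols:
        raise ValueError("LU must be square")
    b_rows, nrhs = b_shape
    if b_rows != n:
        raise ValueError("B must have as many rows as LU")
    return (handle, CUBLAS_OP_N, n, nrhs,
            lu_ptr, n, piv_ptr, b_ptr, n, info_ptr)
```

With the names from the question, the call would then read status = getrs(*getrs_args(handle, int(LU.gpudata), LU.shape, int(P.gpudata), int(B_gpu.gpudata), B.shape, int(info_gpu.gpudata))), where getrs = bind_getrs(libcusolver). A status of 0 together with a zero in info_gpu indicates success, after which B_gpu.get() can be compared with np.linalg.solve(A, B). Because numpy arrays are row-major by default, uploading Fortran-ordered copies is one way to meet the column-major assumption.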