Python dot product runs out of memory on a large matrix

Posted 2024-10-03 02:42:33


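I have a matrix X with dimensions (36000, 3600). I am trying to compute Sigma = X * X_transpose as part of implementing the PCA algorithm.

X is contiguous.

Both the direct computation and a loop over the second dimension end with memory problems (my computer runs out of memory; it has 8GB).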
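To see why 8GB is not enough, here is a rough size estimate. It is only a sketch and assumes float32 storage (4 bytes per element); the helper name `dense_nbytes` is made up for illustration. With X of shape (d, n) = (36000, 3600), Sigma is d x d.

```python
def dense_nbytes(rows, cols, itemsize=4):
    """Bytes for one dense rows x cols array; itemsize=4 corresponds to float32."""
    return rows * cols * itemsize


d, n = 36000, 3600  # X.shape as stated in the question

sigma_gib = dense_nbytes(d, d) / 2**30  # d x d matrix Sigma = X * X_transpose
x_gib = dense_nbytes(d, n) / 2**30      # X itself
gram_gib = dense_nbytes(n, n) / 2**30   # n x n matrix X_transpose * X, for comparison

print(f"Sigma (d x d): {sigma_gib:.2f} GiB")
print(f"X     (d x n): {x_gib:.2f} GiB")
print(f"Gram  (n x n): {gram_gib:.2f} GiB")
```

A single float32 Sigma is already about 4.8 GiB. Any expression that adds two such matrices into a newly allocated result keeps three of them alive at once, which goes well past 8GB, and the figures double if the data is float64.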

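What is the best way to achieve this? My code is attached below.

I am new to Python (but not to programming), and this is the first thing I have tried to do with it, so any tips are welcome!

Thanks a lot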

import numpy as np


class PCAProjector:

    def __init__(self, X):
        # X has shape (d, n): one sample per column.
        d, n = X.shape
        self.X = X
        # Sigma is d x d; it accumulates the outer products x_i * x_i^T.
        self.Sigma = np.zeros((d, d), dtype='float32', order='C')
        for i in range(n):
            print(i)
            # Copy column i into a (d, 1) buffer.
            x_i = np.zeros((d, 1), dtype='float32', order='C')
            x_i = x_i + X[:, [i]]
            x_i_transpose = np.zeros((1, d), dtype='float32', order='C')
            x_i_transpose = x_i_transpose + np.transpose(x_i)
            # Outer product: a full d x d temporary on every iteration.
            i_result = np.dot(x_i, x_i_transpose)
            # This allocates yet another d x d array for the sum.
            self.Sigma = self.Sigma + i_result

        self.Sigma = self.Sigma / n
        # EigenVecValculator is my own helper class (not shown here).
        self.EigenVectorsSorted = EigenVecValculator().calEigenVectors(self.Sigma)

    def projectAllSamples(self, numOfDimensions):
        """
        Projects samples to numOfDimensions dimensions

        Projects samples that were passed in Ctor (self.X)
        :param numOfDimensions: number of dimensions to project on
        :return: matrix of (numOfDimensions)X(numOfSamples) that contains the projected samples
        """
        transformationMatrix = self.EigenVectorsSorted[:, 0:numOfDimensions]
        # np.dot is the matrix product; * on ndarrays would multiply elementwise.
        return np.dot(np.transpose(transformationMatrix), self.X)
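
One possible way around the d x d matrix, offered as a sketch rather than a definitive answer: since n = 3600 is much smaller than d = 36000, the n x n matrix X_transpose * X can be decomposed instead. If G v = lam v for G = X_transpose * X / n, then Sigma (X v) = lam (X v), so the eigenvectors of Sigma with nonzero eigenvalue can be recovered as normalized X v. The function name `top_eigvecs_via_gram` is hypothetical, the code assumes X is a plain ndarray with one sample per column and that k does not exceed the rank of X, and like the code above it does not mean-center the data.

```python
import numpy as np


def top_eigvecs_via_gram(X, k):
    """Leading k eigenpairs of Sigma = X @ X.T / n without forming Sigma.

    X has shape (d, n), one sample per column. Assumes k <= rank(X),
    so the selected eigenvalues are nonzero.
    """
    d, n = X.shape
    # n x n Gram matrix: small when n is much smaller than d.
    G = (X.T @ X) / n
    # eigh handles symmetric matrices; eigenvalues come back in ascending order.
    vals, V = np.linalg.eigh(G)
    order = np.argsort(vals)[::-1][:k]  # indices of the k largest eigenvalues
    vals, V = vals[order], V[:, order]
    # Map each n-dimensional eigenvector back to d dimensions, then normalize.
    U = X @ V
    U /= np.linalg.norm(U, axis=0, keepdims=True)
    return vals, U


# Small random example (shapes chosen only for illustration).
rng = np.random.default_rng(0)
X_small = rng.standard_normal((50, 8))
vals_small, U_small = top_eigvecs_via_gram(X_small, 3)
projected = U_small.T @ X_small  # shape (3, 8): numOfDimensions x numOfSamples
```

With the shapes from the question, the largest arrays this creates are X itself and the (d, k) matrix of eigenvectors, so the 36000 x 36000 matrix never has to exist in memory.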
