Eigen matrix vs. NumPy array multiplication performance

import numpy as np
import time

n_a_rows = 4000
n_a_cols = 3000
n_b_rows = n_a_cols
n_b_cols = 200

# Both operands are integer arrays, since np.arange yields integers here.
a = np.arange(n_a_rows * n_a_cols).reshape(n_a_rows, n_a_cols)
b = np.arange(n_b_rows * n_b_cols).reshape(n_b_rows, n_b_cols)

start = time.time()
d = np.dot(a, b)
end = time.time()

print("time taken : {}".format(end - start))

#include <ctime>     // clock(), clock_t, CLOCKS_PER_SEC
#include <iostream>
#include <Eigen/Dense>

using namespace Eigen;

int main()
{
    int n_a_rows = 4000;
    int n_a_cols = 3000;
    int n_b_rows = n_a_cols;
    int n_b_cols = 200;

    // Fill a and b with the same values as the NumPy version.
    MatrixXi a(n_a_rows, n_a_cols);
    for (int i = 0; i < n_a_rows; ++i)
        for (int j = 0; j < n_a_cols; ++j)
            a(i, j) = n_a_cols * i + j;

    MatrixXi b(n_b_rows, n_b_cols);
    for (int i = 0; i < n_b_rows; ++i)
        for (int j = 0; j < n_b_cols; ++j)
            b(i, j) = n_b_cols * i + j;

    MatrixXi d(n_a_rows, n_b_cols);

    // Time only the matrix product.
    clock_t begin = clock();
    d = a * b;
    clock_t end = clock();

    double elapsed_secs = double(end - begin) / CLOCKS_PER_SEC;
    std::cout << "Time taken : " << elapsed_secs << std::endl;
}

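2 Answers

User

#1 · edited 2024-06-02 22:25:54

My question was answered in the comments by @Jitse Niesen and @ggael.

I needed to add flags that enable optimization at compile time: -O2 -DNDEBUG (that is a capital letter O, not a zero).

With these flags, the Eigen code runs in 0.6 seconds instead of ~29 seconds.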

User

#2 · edited 2024-06-02 22:25:54

Change:

a = np.arange(n_a_rows * n_a_cols).reshape(n_a_rows, n_a_cols)
b = np.arange(n_b_rows * n_b_cols).reshape(n_b_rows, n_b_cols)

to:

a = np.arange(n_a_rows * n_a_cols).reshape(n_a_rows, n_a_cols)*1.0
b = np.arange(n_b_rows * n_b_cols).reshape(n_b_rows, n_b_cols)*1.0

On my laptop, at least, this gives a roughly 100x speedup:

time taken : 11.1231250763

versus:

time taken : 0.124922037125

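That applies unless you really do want to multiply integers. In Eigen, multiplying double-precision numbers is also faster (this amounts to replacing MatrixXi with MatrixXd in three places), but there I see only a factor of about 1.5: Time taken: 0.555005 vs 0.846788.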
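A likely explanation for the NumPy difference is that np.dot can hand floating-point products to an optimized BLAS routine when NumPy is built against one, while integer products go through a slower generic code path. The sketch below compares the two dtypes on deliberately smaller shapes than in the question so that it finishes quickly. The helper timed_dot and the reduced sizes are illustrative assumptions, not part of the original posts, and astype(np.float64) is used as an explicit alternative to multiplying by 1.0. Actual timings depend on the machine and on the NumPy build.

```python
import time

import numpy as np


def timed_dot(a, b):
    """Hypothetical helper: return (result, elapsed seconds) for np.dot(a, b)."""
    start = time.perf_counter()
    result = np.dot(a, b)
    return result, time.perf_counter() - start


# Smaller shapes than in the question (an assumption made to keep this quick).
n_a_rows, n_a_cols, n_b_cols = 400, 300, 20

# int64 is requested explicitly because the default integer dtype of
# np.arange depends on the platform.
a_int = np.arange(n_a_rows * n_a_cols, dtype=np.int64).reshape(n_a_rows, n_a_cols)
b_int = np.arange(n_a_cols * n_b_cols, dtype=np.int64).reshape(n_a_cols, n_b_cols)

# Same values as floats; has the same effect on the dtype as "* 1.0".
a_float = a_int.astype(np.float64)
b_float = b_int.astype(np.float64)

d_int, t_int = timed_dot(a_int, b_int)
d_float, t_float = timed_dot(a_float, b_float)

print("int64   : {:.6f} s, result dtype {}".format(t_int, d_int.dtype))
print("float64 : {:.6f} s, result dtype {}".format(t_float, d_float.dtype))
```

At these sizes every intermediate sum is an integer that float64 represents exactly, so both products hold the same values. With the 4000 x 3000 operands from the question the entries grow far larger, so the integer and floating-point results should not be assumed to agree exactly.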
