How can I improve the performance of a neural network?

import numpy as np

class ManyAssociations:
    # Both functions are called on the class itself, so they take no `self`.
    def fit(x_train, y_train, learning_rate, tol):
        L_L = x_train.shape[1]  # number of input features
        L_N = y_train.shape[1]  # number of outputs
        W = np.zeros((L_N, L_L))
        # Train one weight vector per output with plain gradient descent.
        for n in range(L_N):
            learning = True
            w = np.random.rand(L_L)
            while learning:
                delta = (x_train @ w - y_train[:,n])
                grad_E = delta @ x_train
                w = w - learning_rate * grad_E
                # Stop once the squared gradient norm drops below tol.
                if (grad_E @ grad_E) < tol:
                    W[n] = w
                    learning = False
        ManyAssociations.weights = W

    def predict(x_pred, W):
        preds = []
        for k in range(x_pred.shape[0]):
            preds.append(W @ x_pred[k])
        return np.array(preds)

1 answer

User
#1 · Posted 2024-07-05 14:31:40

I discovered cupy, but cupy is much, much slower than numpy in this case. Why would this be?
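Computation on a GPU is split into basic compute-intensive building blocks called kernels. Kernels are submitted to the GPU by the CPU. Each kernel call takes some time: the CPU has to communicate with the GPU, often over the relatively slow PCI interconnect (and the two have to be synchronized), memory has to be allocated on the GPU so the resulting data can be written, and so on. The CuPy package naively converts each basic NumPy instruction into a GPU kernel. Because your loop executes many small kernels, the overhead is large. So if you want faster GPU code with CuPy, you need to work on large chunks of data, or write your own kernels directly (which is hard, because GPUs are very complex).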
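To make "work on large chunks of data" concrete, here is a minimal sketch that updates all output columns in one matrix product per step instead of looping over them. The names `fit_batched` and `predict_batched` and the `max_iter` cap are hypothetical additions, not part of the question's code. The sketch also differs from the original in two ways: it starts from zero weights rather than random ones, and it stops only once every output has converged. It is written against NumPy so it runs anywhere. Swapping the import for CuPy is an assumption that relies on CuPy's NumPy-compatible API, and it has not been benchmarked here.

```python
import numpy as np  # assumption: on a CUDA machine, `import cupy as np` could be tried instead


def fit_batched(x_train, y_train, learning_rate, tol, max_iter=10_000):
    """Gradient descent on all outputs at once (hypothetical helper)."""
    n_in = x_train.shape[1]
    n_out = y_train.shape[1]
    W = np.zeros((n_out, n_in))  # zero start instead of the original random start
    for _ in range(max_iter):
        delta = x_train @ W.T - y_train   # residuals, shape (samples, n_out)
        grad = delta.T @ x_train          # gradients, shape (n_out, n_in)
        W = W - learning_rate * grad
        # Stop when the largest per-output squared gradient norm is below tol.
        # On a GPU this scalar read forces a device-to-host sync every step.
        if float((grad * grad).sum(axis=1).max()) < tol:
            break
    return W


def predict_batched(x_pred, W):
    """Same result as the row-by-row loop, as a single matrix product."""
    return x_pred @ W.T
```

Each iteration now issues a handful of large operations rather than many small ones, which is the pattern that lets the per-kernel overhead described above be amortized.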
Is there any way to use njit inside of the class or apart from that is there a better way to use numba/cuda?
You can use @jitclass. You can find more information in the documentation.
You can also use parallelism to speed up the code. To do so, replace range with prange and add the parallel=True argument to Numba's njit. You can find more information here.
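As a rough sketch of the prange suggestion, the training loop can be pulled out of the class into a free function and compiled with `njit(parallel=True)`. The function name `fit_weights` and the `max_iter` cap are hypothetical. The matrix products are written as explicit loops, an assumption made here so the sketch does not depend on Numba's BLAS-backed `np.dot` support. The weights start at zero rather than at random values. The `try`/`except` fallback is only there so the example still runs, slowly, where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:  # fallback: run as plain Python if Numba is missing
    prange = range

    def njit(*args, **kwargs):
        def wrap(func):
            return func
        return wrap


@njit(parallel=True)
def fit_weights(x_train, y_train, learning_rate, tol, max_iter):
    n_samples, n_in = x_train.shape
    n_out = y_train.shape[1]
    W = np.zeros((n_out, n_in))
    for n in prange(n_out):  # outputs are independent, so they can run in parallel
        w = np.zeros(n_in)
        sq = tol + 1.0       # squared gradient norm, forced above tol at the start
        it = 0
        while sq >= tol and it < max_iter:
            grad = np.zeros(n_in)
            for i in range(n_samples):
                r = -y_train[i, n]            # residual of sample i for output n
                for j in range(n_in):
                    r += x_train[i, j] * w[j]
                for j in range(n_in):
                    grad[j] += r * x_train[i, j]
            sq = 0.0
            for j in range(n_in):
                w[j] -= learning_rate * grad[j]
                sq += grad[j] * grad[j]
            it += 1
        W[n, :] = w
    return W
```

The parallel loop only pays off when there are enough output columns to keep several threads busy. With a handful of outputs, the single-threaded compiled version may be just as fast.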
What are some other good ways to speed up code like the below one? Could I be writing more efficiently? Could I be using the GPU better?
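Neural networks are usually computationally intensive. Numba should be good enough to get reasonably high performance. However, if you want really fast code, you either need to use higher-level libraries or rewrite what those libraries do yourself (probably in lower-level code). The standard way to work with neural networks is to use dedicated libraries such as TensorFlow, PyTorch, Keras, etc. As far as I know, the first of these is flexible and highly optimized, although it is a bit lower-level than the others.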
