A newer CUDA version causes a Caffe out-of-memory error

Posted 2024-10-01 02:39:55


I created an environment with caffe-gpu and CUDA 8:

conda create -n py27Cfe-gpu-p27h03f526a_2
conda install caffe-gpu=1.0=py27h03f526a_2

caffe-gpu                 1.0              py27h03f526a_2   
cudatoolkit               8.0                           3  
cudnn                     6.0.21                cuda8.0_0  
jupyter                   1.0.0                    py27_7  

By pinning a specific build of caffe-gpu in the conda install command, I get CUDA 8.

I also created a caffe-gpu environment with CUDA 9:

conda create -n p27cu9Cfegpu
conda install caffe-gpu=1.0=py27heda4471_3

caffe-gpu                 1.0              py27heda4471_3
cudatoolkit               9.0                  h13b8566_0  
cudnn                     7.3.1                 cuda9.0_0
jupyter                   1.0.0                    py27_7

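I tested both environments with the Google DeepDream Jupyter notebook. The CUDA 8 environment runs without difficulty. The CUDA 9 environment chokes at this layer: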

I0505 12:29:44.577164  9839 net.cpp:744] Ignoring source layer loss2/loss
I0505 12:29:44.578850  9839 net.cpp:744] Ignoring source layer loss3/loss3
F0505 12:29:55.785749  9839 syncedmem.cpp:71] Check failed: error == cudaSuccess (2 vs. 0)  out of memory
*** Check failure stack trace: ***
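
For readers scanning a long Caffe log, the fatal line can be picked out mechanically. The following is a minimal sketch, assuming the glog line layout seen in the log above (severity letter, month and day, time, process id, source file and line, message); the names GLOG_LINE and parse_glog_line are hypothetical helpers, not part of Caffe or glog. In the message itself, "(2 vs. 0)" compares the returned CUDA error code against cudaSuccess (0); code 2 is cudaErrorMemoryAllocation in the CUDA runtime.

```python
import re

# Assumed layout of one glog line, e.g.:
# F0505 12:29:55.785749  9839 syncedmem.cpp:71] Check failed: ...
GLOG_LINE = re.compile(
    r"^(?P<level>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+(?P<pid>\d+) "
    r"(?P<file>[^:\s]+):(?P<line>\d+)\] (?P<message>.*)$"
)


def parse_glog_line(line):
    """Split one glog line into its fields; return None if it does not match."""
    match = GLOG_LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    # 'F' marks a fatal check failure, 'I' an informational message.
    record["fatal"] = record["level"] == "F"
    return record


fatal_line = ("F0505 12:29:55.785749  9839 syncedmem.cpp:71] "
              "Check failed: error == cudaSuccess (2 vs. 0)  out of memory")
info_line = ("I0505 12:29:44.577164  9839 net.cpp:744] "
             "Ignoring source layer loss2/loss")

fatal_record = parse_glog_line(fatal_line)
info_record = parse_glog_line(info_line)
print(fatal_record["file"], fatal_record["line"], fatal_record["message"])
```

Here only the syncedmem.cpp line is fatal; the two "Ignoring source layer" lines are informational.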

I tried changing the batch size to 1 in the deploy.prototxt file, like this:

name: "GoogleNet"
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 1 dim: 3 dim: 224 dim: 224 } }
}
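
As a rough check on how much memory the input shape above accounts for, the sketch below multiplies out the dims. It assumes 4-byte single-precision elements and counts only the data of this one blob, not the intermediate activations or weights of the network; blob_bytes is a hypothetical helper. If the notebook reshapes the input blob to the image size at run time, the prototxt dims would not reflect what is actually allocated.

```python
from functools import reduce
from operator import mul

FLOAT32_BYTES = 4  # assumption: blobs hold single-precision floats


def blob_bytes(shape, bytes_per_element=FLOAT32_BYTES):
    """Bytes needed for the data of one blob with the given dims."""
    return reduce(mul, shape, 1) * bytes_per_element


# dims taken from the input_param in the prototxt above
input_shape = (1, 3, 224, 224)
size = blob_bytes(input_shape)
print("%d bytes (%.2f MiB)" % (size, size / (1024.0 * 1024.0)))
```

At batch size 1 this input blob comes to 602112 bytes, well under 1 MiB, which is small next to a 4 GiB card.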

But that did not help. I realize there are many other differences between the two environments; they are listed here.

Other differences between the CUDA 9 environment and the CUDA 8 environment are:
(Lines marked with a minus are in the CUDA 9 environment only; lines marked with a plus are in the CUDA 8 environment only.)

-backports_abc             0.5                      py27_0  
+backports_abc             0.5              py27h7b3c97b_0  

-caffe-gpu                 1.0              py27heda4471_3  
+caffe-gpu                 1.0              py27h03f526a_2  

-cudatoolkit               9.0                  h13b8566_0  
-cudnn                     7.3.1                 cuda9.0_0  
-cycler                    0.10.0                   py27_0  
+cudatoolkit               8.0                           3  
+cudnn                     6.0.21                cuda8.0_0  
+cycler                    0.10.0           py27hc7354d3_0  

-h5py                      2.7.1            py27h2697762_0  
+h5py                      2.8.0            py27h39dcb92_0  

-hdf5                      1.10.1               h9caa474_1  
+hdf5                      1.8.18               h6792536_1  

-ipython_genutils          0.2.0            py27h89fb69b_0  
+ipython_genutils          0.2.0                    py27_0  

-libprotobuf               3.5.2                h6f1eeef_0  
+libprotobuf               3.4.1                h5b8497f_0  

+linecache2                1.0.0                    py27_0  

-nbformat                  4.4.0            py27hed7f2b2_0  
+nbformat                  4.4.0                    py27_0  

-opencv                    3.3.1            py27hdcf4849_0  
+opencv                    3.3.1            py27h9bb06ff_1  

-protobuf                  3.5.2            py27hf484d3e_1  
+protobuf                  3.4.1            py27h2ba6a9c_0  

-traitlets                 4.3.2                    py27_0  
-wcwidth                   0.1.7                    py27_0  
+traceback2                1.4.0                    py27_0  
+traitlets                 4.3.2            py27hd6ce930_0  
+unittest2                 1.1.0                    py27_0  
+wcwidth                   0.1.7            py27h9e3e1ab_0  
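
A listing like the one above can be produced from the text of two conda list outputs. This is a minimal sketch, assuming each package row has at least three whitespace-separated columns (name, version, build) and that comment rows start with "#"; parse_conda_list and diff_envs are hypothetical helpers, and the two sample listings are the four packages shown earlier for each environment.

```python
def parse_conda_list(text):
    """Map package name -> (version, build) from conda list style text."""
    packages = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank rows, short rows, and '#' comment/header rows.
        if len(parts) < 3 or line.lstrip().startswith("#"):
            continue
        packages[parts[0]] = (parts[1], parts[2])
    return packages


def diff_envs(minus_env, plus_env):
    """List packages whose version or build differs between two environments."""
    lines = []
    for name in sorted(set(minus_env) | set(plus_env)):
        old, new = minus_env.get(name), plus_env.get(name)
        if old == new:
            continue
        if old is not None:
            lines.append("-%s %s %s" % (name, old[0], old[1]))
        if new is not None:
            lines.append("+%s %s %s" % (name, new[0], new[1]))
    return lines


CUDA9_LIST = """
caffe-gpu                 1.0              py27heda4471_3
cudatoolkit               9.0                  h13b8566_0
cudnn                     7.3.1                 cuda9.0_0
jupyter                   1.0.0                    py27_7
"""

CUDA8_LIST = """
caffe-gpu                 1.0              py27h03f526a_2
cudatoolkit               8.0                           3
cudnn                     6.0.21                cuda8.0_0
jupyter                   1.0.0                    py27_7
"""

diff_lines = diff_envs(parse_conda_list(CUDA9_LIST), parse_conda_list(CUDA8_LIST))
for entry in diff_lines:
    print(entry)
```

Packages that are identical in both environments, such as jupyter here, are left out of the result.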

In both cases another minor error appears when the script runs, so I don't think it is the cause of the CUDA 9 failure:

 Network initialization done.
I0505 12:29:44.542949  9839 upgrade_proto.cpp:53] Attempting to upgrade input file specified using deprecated V1LayerParameter: ./modelZoo/bvlc_googlenet/bvlc_googlenet.caffemodel
I0505 12:29:44.575798  9839 upgrade_proto.cpp:61] Successfully upgraded file specified using deprecated V1LayerParameter

Can anyone explain this memory situation? The GPU is an NVIDIA 1050 Ti. The machine runs Ubuntu 18.04 with NVIDIA's latest driver installed:

nvidia-smi
Sun May  5 12:44:44 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.39       Driver Version: 418.39       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 105...  On   | 00000000:01:00.0  On |                  N/A |
| 20%   32C    P5    N/A /  75W |    406MiB /  4038MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1746      G   /usr/lib/xorg/Xorg                            26MiB |
|    0      2296      G   /usr/bin/gnome-shell                          48MiB |
|    0      3226      G   /usr/lib/xorg/Xorg                           195MiB |
|    0      3358      G   /usr/bin/gnome-shell                         132MiB |
+-----------------------------------------------------------------------------+
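
The numbers in the nvidia-smi output above can be reduced to two figures: how much memory is unused, and how much the listed desktop processes hold. The sketch below does this with regular expressions, assuming the text layout shown above (a "used MiB / total MiB" cell and process rows ending in "<n>MiB |"); free_mib and process_mib are hypothetical helpers, and the sample text is a subset of the rows above.

```python
import re

# "406MiB /  4038MiB" style cell in the GPU summary row
MEMORY_USAGE = re.compile(r"(\d+)MiB\s*/\s*(\d+)MiB")
# process rows end with "<n>MiB |"
PROCESS_USAGE = re.compile(r"(\d+)MiB \|\s*$", re.MULTILINE)


def free_mib(smi_text):
    """Total minus used memory from the first usage cell, or None if absent."""
    match = MEMORY_USAGE.search(smi_text)
    if match is None:
        return None
    used, total = int(match.group(1)), int(match.group(2))
    return total - used


def process_mib(smi_text):
    """Sum of the per-process GPU memory column."""
    return sum(int(value) for value in PROCESS_USAGE.findall(smi_text))


SMI_SAMPLE = """\
| 20%   32C    P5    N/A /  75W |    406MiB /  4038MiB |      0%      Default |
|    0      1746      G   /usr/lib/xorg/Xorg                            26MiB |
|    0      2296      G   /usr/bin/gnome-shell                          48MiB |
|    0      3226      G   /usr/lib/xorg/Xorg                           195MiB |
|    0      3358      G   /usr/bin/gnome-shell                         132MiB |
"""

print(free_mib(SMI_SAMPLE), process_mib(SMI_SAMPLE))
```

For this snapshot, taken with the desktop running, 3632 MiB of the 4038 MiB are unused and the four listed processes hold 401 MiB. The snapshot only reflects the moment nvidia-smi ran, not the peak usage while the notebook executes.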
