TensorFlow does not recognize the GPU on Ubuntu 18.04, CUDA 9.1, CuDNN 7.1, Python 3.6, cond

Posted 2024-10-04 15:27:07

I am running Python 3.6 in a conda environment on an Ubuntu 18.04 machine, but TensorFlow does not recognize my GPU.

Output of lsb_release -a:

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.1 LTS
Release:        18.04
Codename:       bionic

Output of nvidia-smi:

^{pr2}$

Output of nvcc -V:

Cuda compilation tools, release 9.1, V9.1.85

Output of nvidia-debugdump -l:

Found 1 NVIDIA devices
    Device ID:              0
    Device name:            Tesla P40
    GPU internal ID:        0322417022310

Output of lspci -nnk | grep -i nvidia:

a5dd:00:00.0 3D controller [0302]: NVIDIA Corporation GP102GL [Tesla P40] [10de:1b38] (rev a1)
    Subsystem: NVIDIA Corporation GP102GL [Tesla P40] [10de:11d9]
    Kernel driver in use: nvidia
    Kernel modules: nvidia_drm, nvidia

Output of conda --version:

conda 4.5.12

Output of echo $PATH (without the spaces):

/home/***/anaconda3/envs/tf_gpu/bin: /usr/local/cuda/bin: /home/***/.local/bin: /home/***/anaconda3/bin: /usr/local/sbin: /usr/local/bin: /usr/sbin: /usr/bin: /sbin: /bin: /usr/games: /usr/local/games: /snap/bin

Output of echo $LD_LIBRARY_PATH (without the spaces):

/usr/local/cuda/lib64: /usr/local/cuda/extras/CUPTI/lib64: /usr/local/cuda/lib64: /usr/local/cuda/extras/CUPTI/lib64:
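The LD_LIBRARY_PATH above lists each CUDA directory twice. A minimal sketch for spotting such repeats, assuming a colon-separated value; find_duplicate_entries is a hypothetical helper name, not part of any existing tool. Repeated entries are usually harmless, so this only helps tidy the variable:

```python
import os
from collections import Counter


def find_duplicate_entries(path_value):
    """Return entries that occur more than once in a colon-separated path string."""
    # Strip whitespace and drop empty entries (for example from a trailing colon).
    entries = [p.strip() for p in path_value.split(":") if p.strip()]
    counts = Counter(entries)
    # Keep the duplicated entries in first-seen order.
    duplicates = []
    for entry in entries:
        if counts[entry] > 1 and entry not in duplicates:
            duplicates.append(entry)
    return duplicates


if __name__ == "__main__":
    # Inspect the variable of the current process; treated as empty if unset.
    print(find_duplicate_entries(os.environ.get("LD_LIBRARY_PATH", "")))
```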

This is how I installed my env:

^{9}$

This installs the following packages:

_tflow_select:       2.1.0-gpu
absl-py:             0.6.1-py36_0
astor:               0.7.1-py36_0
blas:                1.0-mkl
c-ares:              1.15.0-h7b6447c_1
ca-certificates:     2018.03.07-0
certifi:             2018.11.29-py36_0
cudatoolkit:         9.2-0
cudnn:               7.2.1-cuda9.2_0
cupti:               9.2.148-0
gast:                0.2.0-py36_0
grpcio:              1.16.1-py36hf8bcb03_1
h5py:                2.8.0-py36h989c5e5_3
hdf5:                1.10.2-hba1933b_1
intel-openmp:        2019.1-144
keras-applications:  1.0.6-py36_0
keras-preprocessing: 1.0.5-py36_0
libedit:             3.1.20170329-h6b74fdf_2
libffi:              3.2.1-hd88cf55_4
libgcc-ng:           8.2.0-hdf63c60_1
libgfortran-ng:      7.3.0-hdf63c60_0
libprotobuf:         3.6.1-hd408876_0
libstdcxx-ng:        8.2.0-hdf63c60_1
markdown:            3.0.1-py36_0
mkl:                 2019.1-144
mkl_fft:             1.0.6-py36hd81dba3_0
mkl_random:          1.0.2-py36hd81dba3_0
ncurses:             6.1-he6710b0_1
numpy:               1.15.4-py36h7e9f1db_0
numpy-base:          1.15.4-py36hde5b4d6_0
openssl:             1.1.1a-h7b6447c_0
pip:                 18.1-py36_0
protobuf:            3.6.1-py36he6710b0_0
python:              3.6.7-h0371630_0
readline:            7.0-h7b6447c_5
scipy:               1.1.0-py36h7c811a0_2
setuptools:          40.6.3-py36_0
six:                 1.12.0-py36_0
sqlite:              3.26.0-h7b6447c_0
tensorboard:         1.12.0-py36hf484d3e_0
tensorflow:          1.12.0-gpu_py36he74679b_0
tensorflow-base:     1.12.0-gpu_py36had579c0_0
tensorflow-gpu:      1.12.0-h0d30ee6_0
termcolor:           1.1.0-py36_1
tk:                  8.6.8-hbc83047_0
werkzeug:            0.14.1-py36_0
wheel:               0.32.3-py36_0
xz:                  5.2.4-h14c3975_4
zlib:                1.2.11-h7b6447c_3
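Note that nvcc reports release 9.1, while the environment contains cudatoolkit 9.2 and a cudnn build for CUDA 9.2. A small sketch that extracts and compares the two versions from the lines quoted in this post; the function names are hypothetical, and a difference is only something to look into, not a confirmed cause of the problem:

```python
import re


def nvcc_release(nvcc_line):
    """Extract 'major.minor' from an nvcc -V line such as '... release 9.1, V9.1.85'."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_line)
    return match.group(1) if match else None


def conda_cudatoolkit_version(package_lines):
    """Extract 'major.minor' from a 'cudatoolkit: 9.2-0' style line of a conda package list."""
    for line in package_lines:
        match = re.match(r"\s*cudatoolkit:\s+(\d+\.\d+)", line)
        if match:
            return match.group(1)
    return None


# Lines copied from the outputs quoted above.
nvcc_line = "Cuda compilation tools, release 9.1, V9.1.85"
packages = ["cudatoolkit:         9.2-0", "cudnn:               7.2.1-cuda9.2_0"]

system_cuda = nvcc_release(nvcc_line)
env_cuda = conda_cudatoolkit_version(packages)
print("system nvcc:", system_cuda, "| conda cudatoolkit:", env_cuda)
print("versions match:", system_cuda == env_cuda)
```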

Then I check TensorFlow's available devices in the Python console:

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

This prints:

2018-12-18 10:44:12.135984: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
[name: "/device:CPU:0"
 device_type: "CPU"
 memory_limit: 268435456
 locality {
 }
 incarnation: 14921140553341499580
 , name: "/device:XLA_CPU:0"
 device_type: "XLA_CPU"
 memory_limit: 17179869184
 locality {
 }
 incarnation: 17804082860482987174
 physical_device_desc: "device: XLA_CPU device"
]
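The same check can be reduced to a list of GPU names. A minimal sketch, assuming each entry exposes name and device_type attributes as in the printout above; gpu_device_names is a hypothetical helper, and the stand-in Device tuple only mimics those two fields so that the snippet runs without TensorFlow:

```python
from collections import namedtuple


def gpu_device_names(devices):
    """Return the names of all entries whose device_type is exactly 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


# Stand-ins for the two entries printed above (CPU and XLA_CPU only).
Device = namedtuple("Device", ["name", "device_type"])
listed = [
    Device("/device:CPU:0", "CPU"),
    Device("/device:XLA_CPU:0", "XLA_CPU"),
]
print(gpu_device_names(listed))  # empty list, since no entry has type GPU

# With TensorFlow installed, the real list can be passed instead:
# from tensorflow.python.client import device_lib
# print(gpu_device_names(device_lib.list_local_devices()))
```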

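As you can see, no GPU device is listed. Interestingly, if I install PyTorch, it does recognize the GPU. I have tried various things I saw in other posts, such as removing the protobuf and tensorflow-gpu packages from conda and reinstalling them with pip, but that did not change anything.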
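Since another framework sees the GPU, one further thing to check is whether the dynamic loader can resolve the CUDA shared libraries from inside this Python process. A minimal sketch using ctypes; can_load is a hypothetical helper, and the library file names are assumptions based on the versions listed above (libcudart.so.9.2 for cudatoolkit 9.2, libcudnn.so.7 for cuDNN 7.x), so they should be adjusted to the actual installation. A failure here is not conclusive, because a package may locate its bundled libraries by other means:

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        # Raised when the library is missing or one of its dependencies cannot be resolved.
        return False


# Assumed file names; libcuda.so.1 is installed by the NVIDIA driver.
CANDIDATES = ["libcuda.so.1", "libcudart.so.9.2", "libcudnn.so.7"]

for name in CANDIDATES:
    status = "loadable" if can_load(name) else "NOT found by the loader"
    print(name, "->", status)
```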

How can I get TensorFlow to recognize the GPU? Any help is much appreciated!

Similar questions that did not help solve my problem:

For the CuDNN installation, I followed this guide (up to the Bazel instructions):


Tags: bin, gpu, device, usr, local, tensorflow, cpu, conda
