<p>我一直致力于让一个依赖TensorFlow的应用程序作为带<code>nvidia-docker</code>的docker容器工作。我已经在<code>tensorflow/tensorflow:latest-gpu-py3</code>图像上编译了我的应用程序。我使用以下命令运行docker容器:</p>
<p><code>sudo nvidia-docker run -d -p 9090:9090 -v /src/weights:/weights myname/myrepo:mylabel</code></p>
<p>当通过<code>portainer</code>查看日志时,我看到以下内容:</p>
<pre class="lang-sh prettyprint-override"><code>2017-05-16 03:41:47.715682: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-16 03:41:47.715896: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-16 03:41:47.715948: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-05-16 03:41:47.715978: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-16 03:41:47.716002: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-05-16 03:41:47.718076: E tensorflow/stream_executor/cuda/cuda_driver.cc:405] failed call to cuInit: CUDA_ERROR_UNKNOWN
2017-05-16 03:41:47.718177: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:158] retrieving CUDA diagnostic information for host: 1e22bdaf82f1
2017-05-16 03:41:47.718216: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:165] hostname: 1e22bdaf82f1
2017-05-16 03:41:47.718298: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: 367.57.0
2017-05-16 03:41:47.718398: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:369] driver version file contents: """NVRM version: NVIDIA UNIX x86_64 Kernel Module 367.57 Mon Oct 3 20:37:01 PDT 2016
GCC version: gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.3)
"""
2017-05-16 03:41:47.718455: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: 367.57.0
2017-05-16 03:41:47.718484: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:300] kernel version seems to match DSO: 367.57.0
</code></pre>
<p>容器似乎启动正常,而我的应用程序似乎正在运行。当我向它发送预测请求时,预测会被正确地返回——但是在CPU上运行推断时,我会以我所期望的缓慢速度返回,所以我认为很明显,GPU不是出于某种原因被使用的。我也试过在同一个容器中运行<code>nvidia-smi</code>,以确保它看到我的GPU,以下是结果:</p>
<pre class="lang-sh prettyprint-override"><code>+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.57 Driver Version: 367.57 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GRID K1 Off | 0000:00:07.0 Off | N/A |
| N/A 28C P8 7W / 31W | 25MiB / 4036MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
+-----------------------------------------------------------------------------+
</code></pre>
<p>我当然不是这方面的专家-但似乎从容器内部可以看到GPU。关于如何让TensorFlow发挥作用有什么想法吗?</p>