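I'm having some trouble running my code on a Google Compute Engine VM.
I'm trying to run a small Flask API that detects tables in images. Initializing the detector model works, but when I try to detect tables I get the following error: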
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 2447, in wsgi_app
response = self.full_dispatch_request()
File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1952, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1821, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/usr/local/lib/python3.5/dist-packages/flask/_compat.py", line 39, in reraise
raise value
File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1950, in full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1936, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "ElvyCascadeTabNetAPI.py", line 36, in detect_tables
result = inference_detector(model, "temp.jpg")
File "/SingleModelTest/src/mmdet/mmdet/apis/inference.py", line 86, in inference_detector
result = model(return_loss=False, rescale=True, **data)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/SingleModelTest/src/mmdet/mmdet/core/fp16/decorators.py", line 49, in new_func
return old_func(*args, **kwargs)
File "/SingleModelTest/src/mmdet/mmdet/models/detectors/base.py", line 149, in forward
return self.forward_test(img, img_metas, **kwargs)
File "/SingleModelTest/src/mmdet/mmdet/models/detectors/base.py", line 130, in forward_test
return self.simple_test(imgs[0], img_metas[0], **kwargs)
File "/SingleModelTest/src/mmdet/mmdet/models/detectors/cascade_rcnn.py", line 342, in simple_test
x[:len(bbox_roi_extractor.featmap_strides)], rois)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/SingleModelTest/src/mmdet/mmdet/core/fp16/decorators.py", line 127, in new_func
return old_func(*args, **kwargs)
File "/SingleModelTest/src/mmdet/mmdet/models/roi_extractors/single_level.py", line 105, in forward
roi_feats_t = self.roi_layers[i](feats[i], rois_)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/SingleModelTest/src/mmdet/mmdet/ops/roi_align/roi_align.py", line 144, in forward
self.sample_num, self.aligned)
File "/SingleModelTest/src/mmdet/mmdet/ops/roi_align/roi_align.py", line 36, in forward
spatial_scale, sample_num, output)
RuntimeError: cuda runtime error (48) : no kernel image is available for execution on the device at mmdet/ops/roi_align/src/roi_align_kernel.cu:139
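While searching for possible solutions I came across two Stack Overflow questions where the cause was an old, unsupported GPU, so I changed the GPU on my Google Compute Engine VM from an Nvidia Tesla K80 to an Nvidia Tesla T4. The K80 has CUDA compute capability 3.7, while the new T4 has compute capability 7.5, so I thought this would solve the problem, but it didn't.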
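For reference, my understanding is that "no kernel image is available" means the compiled extension contains no code for the GPU's architecture, so what matters is which architectures the mmdet ops were built for, not only which GPU is attached. Below is a minimal sketch of that check. The helper names `parse_arch_list` and `arch_list_covers` are hypothetical, and the sketch only handles numeric entries in the style of `TORCH_CUDA_ARCH_LIST` (such as `3.7;7.0+PTX`), not named architectures. I have not confirmed which architectures my build actually targeted.

```python
import re


def parse_arch_list(arch_list):
    """Parse a string such as "3.7;7.0+PTX" into [((3, 7), False), ((7, 0), True)]."""
    entries = []
    for token in re.split(r"[;\s]+", arch_list.strip()):
        if not token:
            continue
        has_ptx = token.endswith("+PTX")
        version = token[:-len("+PTX")] if has_ptx else token
        major, minor = version.split(".")
        entries.append(((int(major), int(minor)), has_ptx))
    return entries


def arch_list_covers(arch_list, capability):
    """Return True if code built for arch_list can run on a device of `capability`."""
    for built, has_ptx in parse_arch_list(arch_list):
        # Binary (cubin) code runs on the same major version with an equal or higher minor.
        if built[0] == capability[0] and built[1] <= capability[1]:
            return True
        # Embedded PTX can be JIT-compiled by the driver for any newer device.
        if has_ptx and built <= capability:
            return True
    return False


print(arch_list_covers("3.7", (7, 5)))      # built only for the K80, run on the T4
print(arch_list_covers("7.5", (7, 5)))      # built for the T4
print(arch_list_covers("3.7+PTX", (7, 5)))  # K80 build that also embeds PTX
```

On the VM itself, the capability tuple could come from `torch.cuda.get_device_capability(0)` instead of being hard-coded.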
Output of nvidia-smi:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   72C    P8    12W /  70W |    106MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A       918      G   /usr/lib/xorg/Xorg                 95MiB |
|    0   N/A  N/A       974      G   /usr/bin/gnome-shell                9MiB |
+-----------------------------------------------------------------------------+
Output of nvcc --version:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
torch version: 1.4.0+cu100
torchvision version: 0.5.0+cu100
I'm running the API in a Docker container. Dockerfile:
# Dockerfile
FROM nvidia/cuda:10.0-devel
RUN nvidia-smi
RUN set -xe \
    && apt-get update \
    && apt-get install python3-pip -y \
    && apt-get install git -y \
    && apt-get install libgl1-mesa-glx -y
RUN pip3 install --upgrade pip
WORKDIR /SingleModelTest
COPY requirements /SingleModelTest/requirements
RUN export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64
RUN pip3 install -r requirements/requirements1.txt
RUN pip3 install -r requirements/requirements2.txt
COPY . /SingleModelTest
ENTRYPOINT ["python3"]
CMD ["TabNetAPI.py"]
Edit:
I was confused by the output of nvidia-smi, because the CUDA version it shows is higher than the one I installed, but according to https://medium.com/@brianhourigan/if-different-cuda-versions-are-shown-by-nvcc-and-nvidia-smi-its-necessarily-not-a-problem-and-311eda26856c this is normal.
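As I understand it, the rule behind that article is that nvidia-smi reports the highest CUDA version the installed driver supports, while nvcc and the `+cu100` suffix describe the toolkit the code was built against. The two are compatible as long as the driver's version is at least the toolkit's. Here is a small sketch of that comparison using the numbers from this post. The function names are hypothetical, and the suffix parsing assumes a one-digit minor version.

```python
def parse_version(text):
    """Turn "10.1" into (10, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def cuda_tag_to_version(local_tag):
    """Turn a wheel suffix such as "cu100" into "10.0" (assumes a one-digit minor)."""
    digits = local_tag[len("cu"):]
    return "{}.{}".format(digits[:-1], digits[-1])


def driver_supports(driver_cuda, toolkit_cuda):
    """A driver can run code built with any toolkit up to the version it reports."""
    return parse_version(driver_cuda) >= parse_version(toolkit_cuda)


driver_cuda = "11.0"  # from nvidia-smi
nvcc_cuda = "10.1"    # from nvcc --version
torch_cuda = cuda_tag_to_version("1.4.0+cu100".split("+")[1])  # "10.0"

print(driver_supports(driver_cuda, nvcc_cuda))   # True
print(driver_supports(driver_cuda, torch_cuda))  # True
```

So the version mismatch on its own does not seem to explain the error.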
If anyone has a solution, I'd be very grateful. If I need to provide more information, I'm happy to do so.
Thanks in advance.