Linear relationship between batch size and inference time per batch

Posted 2024-05-21 05:36:46


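I made the following observation, which I can't explain. I am using torchvision's deeplabv3_resnet50 model and running it in eval mode with different batch sizes. The runtime, measured with torch.cuda.synchronize() included, is almost linear in the batch size. This means the rate in images per second is almost constant. I also tried different settings of cudnn.benchmark = True/False. I expected a roughly constant inference time per batch for all batch sizes, since the images in a batch run in parallel on the GPU. Am I doing something wrong?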

Batch size - time per batch:

2 - 0.02 s
4 - 0.031 s
8 - 0.05 s
16 - 0.094 s
32 - 0.178 s
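To make the "almost constant images per second" reading of these numbers concrete, the short sketch below converts each reported timing into throughput and fits a straight line t = intercept + slope * batch_size by ordinary least squares. Only the five timings come from the post; the names `throughput` and `linear_fit` are hypothetical helpers made up for this illustration.

```python
# Timings reported in the question: batch size -> seconds per batch.
timings = {2: 0.02, 4: 0.031, 8: 0.05, 16: 0.094, 32: 0.178}


def throughput(timings):
    """Return images per second for each batch size."""
    return {batch: batch / seconds for batch, seconds in timings.items()}


def linear_fit(timings):
    """Least-squares fit of seconds = intercept + slope * batch_size."""
    xs = list(timings.keys())
    ys = list(timings.values())
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # seconds added per extra image
    intercept = mean_y - slope * mean_x  # fixed seconds per batch
    return slope, intercept


rates = throughput(timings)
slope, intercept = linear_fit(timings)

for batch, rate in rates.items():
    print(f"batch {batch:>2}: {rate:6.1f} images/s")
print(f"fit: {intercept * 1e3:.1f} ms fixed + {slope * 1e3:.2f} ms per image")
```

On these numbers the throughput rises from about 100 images/s at batch size 2 to about 180 images/s at batch size 32, and the fit gives roughly 5 ms per image on top of a fixed cost of roughly 9 ms per batch. The time is therefore closer to "fixed cost plus per-image cost" than to strictly proportional, which may indicate that the GPU is already close to fully occupied at small batch sizes.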

Relevant code:

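    import time
    import torch

    with torch.no_grad():
        # create_input also moves the inputs to the device (inputs.to(device))
        inputs, side_data = self.create_input(sample)
        # Wait for pending GPU work so it is not counted in the measurement
        torch.cuda.synchronize()
        labels = sample["label"]
        start = time.time()
        outputs = self.model(inputs)
        # Wait for the forward pass to finish before reading the clock
        torch.cuda.synchronize()
        print(time.time() - start)
        preds = torch.argmax(outputs, 1)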
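A single timed call like the one above can be noisy, and the first call for a new input shape can be slower than later ones (for example when cudnn.benchmark is enabled and cuDNN tries several convolution algorithms). A minimal sketch of a more robust measurement, assuming a hypothetical helper `time_per_batch` that is not part of the original code, runs a few untimed warm-up calls and then averages over several timed calls. The stand-in workload at the bottom uses `time.sleep` so the sketch runs without a GPU.

```python
import time


def time_per_batch(run_batch, synchronize=None, warmup=3, iters=10):
    """Return the average wall-clock seconds per call of run_batch.

    run_batch:   zero-argument callable that runs one forward pass.
    synchronize: optional zero-argument callable that blocks until all
                 queued device work has finished.
    """
    sync = synchronize if synchronize is not None else (lambda: None)

    # Warm-up calls are not timed, so one-off setup costs are excluded.
    for _ in range(warmup):
        run_batch()
    sync()  # make sure warm-up work is finished before starting the clock

    start = time.perf_counter()
    for _ in range(iters):
        run_batch()
    sync()  # wait for the last timed call to finish
    return (time.perf_counter() - start) / iters


# Stand-in workload (an assumption for this demo): 1 ms of sleep per image.
for batch_size in (2, 4, 8):
    seconds = time_per_batch(lambda: time.sleep(0.001 * batch_size),
                             warmup=1, iters=3)
    print(f"batch {batch_size}: {seconds * 1e3:.1f} ms per batch")
```

With the code from the post, the call would look something like `time_per_batch(lambda: self.model(inputs), torch.cuda.synchronize)` inside the `torch.no_grad()` block.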

Thanks for your help. Markus

torch==1.4.0, CUDA version: 10.0, Tesla V100


No answers yet.
