Python multiprocessing across multiple CPUs and GPUs

    import cv2
    import numpy as np
    from tqdm import tqdm
    from torch.multiprocessing import Pool, set_start_method

    # CUDA cannot be re-initialized in forked workers, so use 'spawn'
    try:
        set_start_method('spawn', force=True)
    except RuntimeError:
        pass

    # gpu_id was not defined in the original post; '0' is a placeholder
    gpu_id = '0'
    # load_model is the poster's own helper (not shown)
    model = load_model(device='cuda:' + gpu_id)

    def pooling_func(file):
        preds = []
        cap = cv2.VideoCapture(file)
        while cap.isOpened():
            ret, frame = cap.read()
            if not ret:
                break
            frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            pred = model(frame)[0]
            preds.append(pred)
        cap.release()
        # write predictions next to the video, replacing its extension
        np.save(file[:-4] + '.npy', preds)

    def process_files():
        # all files to process on gpu_id
        files = np.load(gpu_id + '_files.npy')
        # I am hoping to use 6 cores for this gpu_id,
        # and a different 6 cores for a different GPU id
        pool = Pool(6)
        r = list(tqdm(pool.imap(pooling_func, files), total=len(files)))
        pool.close()
        pool.join()

    if __name__ == '__main__':
        import multiprocessing
        multiprocessing.freeze_support()
        process_files()

1 answer

User

#1 · Posted 2024-09-09 13:11:59

Below is my initial reply to the question you asked, which was later deleted.

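With this in mind, if you are not using the CUDA_VISIBLE_DEVICES flag, all GPUs are accessible to your PyTorch process. This means torch.cuda.device_count() will return 8 (assuming your setup is valid). You can then access each of those 8 GPUs via torch.device('cuda:0'), torch.device('cuda:1'), ..., torch.device('cuda:7').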
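As a minimal sketch of the enumeration described above: the helper names device_strings and visible_devices are made up for illustration, and the second one assumes PyTorch is installed. With N visible GPUs the valid names run from cuda:0 to cuda:N-1.

```python
def device_strings(count: int) -> list[str]:
    """Return the device names PyTorch accepts for `count` visible GPUs."""
    return [f"cuda:{i}" for i in range(count)]


def visible_devices():
    """Build torch.device objects for every GPU the process can see.

    Assumes PyTorch is installed; the import is local so the helper above
    stays usable without it.
    """
    import torch

    return [torch.device(name) for name in device_strings(torch.cuda.device_count())]


# With 8 GPUs the names run from cuda:0 through cuda:7.
print(device_strings(8))
```

On a machine without CUDA, torch.cuda.device_count() returns 0 and visible_devices() gives an empty list.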

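Now, if you only plan to use one GPU and want to restrict your process to it, CUDA_VISIBLE_DEVICES=i (where i is the device ordinal) will do that. In this case torch.cuda sees a single device, and it is only reachable as torch.device('cuda:0'), whatever its actual device ordinal is.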

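If you allow access to more than one device, say n°0, n°4, and n°2, you would use CUDA_VISIBLE_DEVICES=0,4,2. You then refer to the CUDA devices as d0 = torch.device('cuda:0'), d1 = torch.device('cuda:1'), and d2 = torch.device('cuda:2'), in the same order in which you listed them in the flag, i.e.: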

d0 -> GPU n°0, d1 -> GPU n°4, and d2 -> GPU n°2.

This lets you run the same code on different GPUs without changing the underlying code that references device ordinals.
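The remapping can be sketched in plain Python. The function logical_to_physical is a hypothetical helper, and it assumes the flag holds only integer indices (the flag also accepts other identifier forms, which this sketch ignores).

```python
def logical_to_physical(visible_devices: str) -> dict[int, int]:
    """Map the logical index used in 'cuda:<i>' to the physical GPU index.

    Assumes `visible_devices` is a comma-separated list of integers, in the
    same form as the CUDA_VISIBLE_DEVICES value.
    """
    physical = [int(tok) for tok in visible_devices.split(",") if tok.strip()]
    return dict(enumerate(physical))


mapping = logical_to_physical("0,4,2")
for logical, phys in mapping.items():
    print(f"cuda:{logical} -> physical GPU {phys}")
```

The logical index is just the position in the list, which is why code written against cuda:0 keeps working when the flag changes.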

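In summary, what matters is the number of devices your code needs. In your case, one is enough, and you refer to it as torch.device('cuda:0'). When running the code, however, you specify which physical device cuda:0 is with the following flag: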

> CUDA_VISIBLE_DEVICES=0 python inference.py
> CUDA_VISIBLE_DEVICES=1 python inference.py
  ...
> CUDA_VISIBLE_DEVICES=7 python inference.py
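If you would rather start all of these runs from one place, a small launcher can set the flag per child process. This is only a sketch: build_env and launch_per_gpu are hypothetical names, and it assumes inference.py always uses torch.device('cuda:0') internally.

```python
import os
import subprocess
import sys


def build_env(gpu_index, base_env=None):
    """Copy the environment and expose a single GPU to the child process."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


def launch_per_gpu(script, gpu_indices):
    """Start one copy of `script` per GPU and wait for all of them.

    Returns the exit codes in the same order as `gpu_indices`.
    """
    procs = [
        subprocess.Popen([sys.executable, script], env=build_env(i))
        for i in gpu_indices
    ]
    return [p.wait() for p in procs]
```

For example, launch_per_gpu("inference.py", range(8)) would mirror the eight commands above, with each child seeing its GPU as cuda:0.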

Note that 'cuda' defaults to 'cuda:0'.
