SLURM and Python multiprocessing give inconsistent results

sinfo -N -l
Mon Oct 3 08:58:12 2016
NODELIST    NODES  PARTITION  STATE  CPUS  S:C:T   MEMORY  TMP_DISK  WEIGHT  FEATURES  REASON
dlab-node1  1      dlab*      idle   64    2:16:2  257847  0         1       (null)    none
dlab-node2  1      dlab*      idle   64    2:16:2  257847  0         1       (null)    none
dlab-node3  1      dlab*      idle   64    2:16:2  257847  0         1       (null)    none
dlab-node4  1      dlab*      idle   64    2:16:2  257847  0         1       (null)    none

#!/bin/bash
#
#SBATCH -p dlab # partition (queue)
#SBATCH -N 2 # number of nodes
#SBATCH -n 64 # number of cores
#SBATCH --mem 250 # memory pool for all cores
#SBATCH -t 0-2:00 # time (D-HH:MM)
#SBATCH -o slurm.%N.%j.out # STDOUT
#SBATCH -e slurm.%N.%j.err # STDERR
python3 asd.py

1 Answer

User

#1 · Posted on 2024-09-29 00:12:39

From the sbatch man page:

SLURM_JOB_CPUS_PER_NODE
Count of processors available to the job on this node. Note the select/linear plugin allocates entire nodes to jobs, so the value indicates the total count of CPUs on the node. The select/cons_res plugin allocates individual processors to jobs, so this number indicates the number of processors on this node allocated to the job

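As the quoted passage says, the variable only returns the number of CPUs allocated on the node that is running the script. If you want a homogeneous allocation, you should specify --ntasks-per-node=32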
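To make the point about SLURM_JOB_CPUS_PER_NODE concrete, here is a minimal sketch of how a script like asd.py might size its worker pool from that variable. The helper name cpus_on_this_node is hypothetical, and the parsing assumes the value is either a plain integer or carries a repeat suffix such as "32(x2)"; check what your cluster actually exports before relying on it.

```python
import os
import re


def cpus_on_this_node(environ=os.environ):
    """Return the CPU count Slurm reports for the node running this script.

    Hypothetical helper: falls back to os.cpu_count() when the variable
    is not set, for example when the script runs outside of Slurm.
    """
    raw = environ.get("SLURM_JOB_CPUS_PER_NODE")
    if raw is None:
        return os.cpu_count()
    # Assumption: the value looks like "32" or "32(x2)"; keep the leading integer.
    match = re.match(r"\d+", raw)
    if match is None:
        raise ValueError(f"unexpected SLURM_JOB_CPUS_PER_NODE value: {raw!r}")
    return int(match.group())


if __name__ == "__main__":
    print("CPUs usable on this node:", cpus_on_this_node())
```

The returned number could then be passed as processes= to multiprocessing.Pool, so the pool matches what was allocated on the local node rather than the total requested with -n.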

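Also, keep in mind that multiprocessing does not spawn processes on multiple nodes. If you want to span multiple nodes, there is good documentation here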
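One way to see this for yourself is to have every pool worker report its hostname. The sketch below is only an illustration: the function names are hypothetical, and the "fork" start method is chosen explicitly on the assumption that the job runs on Linux, as is usual on a Slurm cluster.

```python
import multiprocessing
import socket


def worker_host(_):
    # Each worker reports the name of the machine it is running on.
    return socket.gethostname()


def hosts_used_by_pool(processes=4, tasks=16):
    """Return the set of hostnames seen by the workers of one Pool."""
    # Assumption: Linux, so the "fork" start method is available.
    ctx = multiprocessing.get_context("fork")
    with ctx.Pool(processes=processes) as pool:
        return set(pool.map(worker_host, range(tasks)))


if __name__ == "__main__":
    # Even inside a two-node allocation this set should hold a single name:
    # the node on which the batch script itself was started.
    print(hosts_used_by_pool())
```

However many nodes the job was granted, the workers are children of the Python process that created the pool, so they all run on the same machine.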
