Performance is worse when using only 2 cores

Posted 2024-06-14 03:45:58


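I wrote a program that finds the Armstrong numbers in a given range. The problem is that, when run on only two processors, the parallel version performs worse than the serial implementation.

Here is the serial version: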

import sys
import random
import time
ARRAYSIZE = int(sys.argv[1])


numbers = [i for i in range(1, ARRAYSIZE+1)]
random.shuffle(numbers)

start_time = time.time()
armstrong = []
for i in numbers:
    num = i
    result = 0
    n = len(str(i))
    while(i != 0):
        digit = i % 10
        result += digit**n
        i //= 10
    if num == result:
        armstrong.append(num)
armstrong.sort()
elapsed_time = (time.time() - start_time)
print(f"Serial time (Shuffle): {elapsed_time}")

Basically, it checks whether each number in the array is an Armstrong number.

Here is the parallel code:
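The check itself can be isolated in a small function. The following is a minimal sketch of the same digit-power test used in the loop above; the names `is_armstrong` and `armstrong_in` are illustrative and do not appear in the programs in this post.

```python
def is_armstrong(num: int) -> bool:
    """Return True if num equals the sum of its digits, each raised to the digit count."""
    n = len(str(num))          # number of decimal digits
    result = 0
    remaining = num            # work on a copy so num stays intact
    while remaining != 0:
        digit = remaining % 10
        result += digit ** n
        remaining //= 10
    return num == result


def armstrong_in(numbers):
    """Return the sorted Armstrong numbers found in an iterable of positive integers."""
    return sorted(x for x in numbers if is_armstrong(x))


if __name__ == "__main__":
    # For 1..999 this prints the single digits plus 153, 370, 371 and 407.
    print(armstrong_in(range(1, 1000)))
```

Working on a copy (`remaining`) avoids destroying the value being tested, which the loop above handles by saving `num` first.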

from mpi4py import MPI
import random
import sys
import timeit

comm = MPI.COMM_WORLD
size = comm.Get_size()
rank = comm.Get_rank()
MASTER = 0
WORKERS = size - 1
ARRAYSIZE = int(sys.argv[1])

CHUNKSIZE = ARRAYSIZE//WORKERS


if rank == MASTER:
    master_time = timeit.default_timer()
    array = []
    for i in range(ARRAYSIZE):
        array.append(i+1)
    random.shuffle(array)
    for i in range(WORKERS):
        sub_array = []
        for j in range(CHUNKSIZE):
            sub_array.append(array[j])
        comm.send(sub_array, dest=(i+1), tag=1)
        array = array[CHUNKSIZE:len(array)]
    armstrong = []
    for i in array:
        num = i
        result = 0
        n = len(str(num))
        while(i != 0):
            digit = i % 10
            result += digit**n
            i //= 10
        if num == result:
            armstrong.append(num)
    for i in range(WORKERS):
        get_armstrong_numbers = comm.recv(tag=2)
        armstrong.extend(get_armstrong_numbers)
    armstrong.sort()
    print(f'Master Time (Shuffle): {timeit.default_timer() - master_time}')


elif rank != MASTER:
    worker_time = timeit.default_timer()
    receive = comm.recv(source=0, tag=1)
    armstrong_numbers = []
    for i in range(CHUNKSIZE):
        num = receive[i]
        result = 0
        n = len(str(num))
        while(receive[i] != 0):
            digit = receive[i] % 10
            result += digit**n
            receive[i] //= 10
        if num == result:
            armstrong_numbers.append(num)
    comm.send(armstrong_numbers, dest=0, tag=2)
    print(f'Worker time (Shuffle): {timeit.default_timer() - worker_time}')

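What the parallel program does is split the initial array into chunks, distribute them to the workers, and keep one part for the master. When the workers finish, they send their results back to the master, which merges, sorts, and prints them.

Here are some example runs (copied and pasted straight from my terminal):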
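To make the split concrete, here is a small sketch that reproduces only the chunk arithmetic of the MPI program (`WORKERS = size - 1`, `CHUNKSIZE = ARRAYSIZE // WORKERS`) without needing MPI. The function name `split_sizes` is hypothetical. As far as I can tell from the code, the part left for the master is just the remainder of the division, so with two processes the single worker receives the whole array.

```python
def split_sizes(arraysize: int, size: int):
    """Mirror the chunk arithmetic of the MPI program for `size` processes.

    Returns (workers, elements per worker, elements left for the master).
    """
    if size < 2:
        # With a single process WORKERS would be 0 and the division would fail.
        raise ValueError("the scheme needs at least one worker (size >= 2)")
    workers = size - 1                      # rank 0 is the master
    chunksize = arraysize // workers        # elements sent to each worker
    master_share = arraysize - chunksize * workers  # remainder kept by the master
    return workers, chunksize, master_share


if __name__ == "__main__":
    for size in (2, 3, 4, 5, 6):
        workers, chunk, master = split_sizes(10_000_000, size)
        print(f"-n {size}: {workers} worker(s) x {chunk} elements, master keeps {master}")
```

For `-n 2` this gives one worker with 10000000 elements and nothing for the master, so the computation is not actually divided between the two processes.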

PS C:\Users\danil\Desktop\Armstrong Numbers\2 Cores - Intermediate Times> python .\serial_shuffle.py 10000000    
Serial time (Shuffle): 27.026000261306763
PS C:\Users\danil\Desktop\Armstrong Numbers\2 Cores - Intermediate Times> mpiexec -n 2 python .\Shuffle.py 10000000
Master Time (Shuffle): 39.5854112
PS C:\Users\danil\Desktop\Armstrong Numbers\2 Cores - Intermediate Times> mpiexec -n 3 python .\Shuffle.py 10000000
Worker time (Shuffle): 23.1734209
Worker time (Shuffle): 25.0732174
Master Time (Shuffle): 25.073847100000002
Worker time (Shuffle): 17.7299418
Worker time (Shuffle): 19.547413000000002
Worker time (Shuffle): 20.8562403
Master Time (Shuffle): 20.8567713
PS C:\Users\danil\Desktop\Armstrong Numbers\2 Cores - Intermediate Times> mpiexec -n 5 python .\Shuffle.py 10000000
Worker time (Shuffle): 14.8727631
Worker time (Shuffle): 16.3137905
Worker time (Shuffle): 17.071894099999998
Worker time (Shuffle): 18.0240944
Master Time (Shuffle): 18.0242909
PS C:\Users\danil\Desktop\Armstrong Numbers\2 Cores - Intermediate Times> mpiexec -n 6 python .\Shuffle.py 10000000
Worker time (Shuffle): 13.0210344
Worker time (Shuffle): 14.223855599999998
Worker time (Shuffle): 15.3327299
Worker time (Shuffle): 16.0194109
Worker time (Shuffle): 16.786164199999998
Master Time (Shuffle): 16.785847299999997

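A few points:

  1. Sorting complexity: not relevant. If I don't shuffle (and therefore don't sort) the array, I see the same behavior.
  2. As you can see, this only happens with -n 2.
  3. The behavior is the same for all input sizes, from 1000 to 250000000.
  4. I know that I/O operations (such as printing) are an obstacle to parallelization.
  5. I use the same method to check for Armstrong numbers in both versions.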
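The observation about -n 2 can be expressed as a speedup relative to the serial run. The sketch below uses the timings from the terminal output above, rounded to three decimals, and keys them by worker count (`size - 1`). Two assumptions: the -n 2 run is counted as one worker even though no worker line appears in the paste, and the run whose command line is missing is counted as three workers because it prints three worker lines. The names `SERIAL_TIME`, `MASTER_TIMES` and `speedups` are illustrative.

```python
SERIAL_TIME = 27.026  # seconds, serial run with 10000000 elements

# Master times in seconds, keyed by the assumed number of workers (size - 1).
MASTER_TIMES = {1: 39.585, 2: 25.074, 3: 20.857, 4: 18.024, 5: 16.786}


def speedups(serial: float, parallel: dict) -> dict:
    """Return serial/parallel time ratios; values below 1 mean a slowdown."""
    return {workers: serial / t for workers, t in parallel.items()}


if __name__ == "__main__":
    for workers, s in speedups(SERIAL_TIME, MASTER_TIMES).items():
        label = "slower" if s < 1 else "faster"
        print(f"{workers} worker(s): {s:.2f}x ({label} than serial)")
```

Only the one-worker case falls below 1; every other configuration is faster than the serial run, although well short of linear scaling.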
