How can I avoid a for loop in Python?

Posted 2024-10-02 20:31:08


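I am trying to generate random samples of partners, with no location repeated within a sample, over 100 iterations. The partners and their locations are in df. At the end of each iteration, I want to know the share and ratings of each randomly assigned partner, combined with the earlier data in old_df.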

# Data
import time
import pandas as pd
old_df = pd.DataFrame({'location': ['Hyderabad', 'Assam', 'Kolkata'], 
                      'partner':['x','y','z'],
                      'share':[0.1,0.4,0.2],
                      'ratings':[20,20,10]})
df = pd.DataFrame({'location': ['Bangalore', 'Bangalore', 'Mumbai','Mumbai','Mumbai','Pune','Pune','Pune','Chennai','Chennai'], 
                      'partner':['x','y','z','y','z','x','y','z','z','x'],
                      'share':[0.1,0.1,0.4,0.4,0.4,0.2,0.2,0.2,0.1,0.1],
                      'ratings':[20,10,10,20,30,20,20,20,10,20]})
# Simulation
simulations = 100
all_stats = []
start_time = time.time()
for num in range(simulations):
    random_sample = df.sample(frac = 1.0).groupby('location').head(1)
    # DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
    random_sample = pd.concat([old_df, random_sample[['location','partner','share','ratings']]])
    condn = random_sample.groupby(['partner']).sum().reset_index()
    condn = condn[['partner','share','ratings']]
    all_stats.append([num,
                      condn.share[0].round(2),
                      condn.share[1].round(2),
                      condn.share[2].round(2),
                      random_sample['ratings'].sum().round(0)])

print("--- %s seconds ---" % (round(time.time() - start_time,3)))

100 iterations take 1.4 seconds. When I scale up (more data), I want to run more iterations. Is there a faster way to do this?
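
One possible direction, as a rough sketch rather than a definitive answer: instead of shuffling and grouping the whole frame on every iteration, draw one random row per location for all iterations at once with NumPy and accumulate the totals in arrays. The function name `simulate` and the `seed` parameter are hypothetical names introduced here for illustration. The sketch assumes that picking one uniformly random row per location, independently across locations, is what the shuffle plus `head(1)` is meant to achieve, and that the per-partner shares should include the totals from old_df.

```python
import numpy as np
import pandas as pd

old_df = pd.DataFrame({'location': ['Hyderabad', 'Assam', 'Kolkata'],
                       'partner': ['x', 'y', 'z'],
                       'share': [0.1, 0.4, 0.2],
                       'ratings': [20, 20, 10]})
df = pd.DataFrame({'location': ['Bangalore', 'Bangalore', 'Mumbai', 'Mumbai', 'Mumbai',
                                'Pune', 'Pune', 'Pune', 'Chennai', 'Chennai'],
                   'partner': ['x', 'y', 'z', 'y', 'z', 'x', 'y', 'z', 'z', 'x'],
                   'share': [0.1, 0.1, 0.4, 0.4, 0.4, 0.2, 0.2, 0.2, 0.1, 0.1],
                   'ratings': [20, 10, 10, 20, 30, 20, 20, 20, 10, 20]})


def simulate(df, old_df, simulations=100, seed=None):
    """Hypothetical helper: one random row per location, for all simulations at once."""
    rng = np.random.default_rng(seed)

    # Fixed, sorted partner order so every simulation has the same columns.
    partners = sorted(set(old_df['partner']) | set(df['partner']))
    partner_code = pd.Categorical(df['partner'], categories=partners).codes
    share = df['share'].to_numpy()
    ratings = df['ratings'].to_numpy()

    # Totals contributed by old_df are the same in every simulation.
    base_share = (old_df.groupby('partner')['share'].sum()
                  .reindex(partners, fill_value=0.0).to_numpy())
    share_tot = np.tile(base_share, (simulations, 1))
    ratings_tot = np.full(simulations, old_df['ratings'].sum(), dtype=float)

    sim_idx = np.arange(simulations)
    # Loop over locations (few) instead of over simulations (many).
    for rows in df.groupby('location').indices.values():
        # rows holds the positional indices of this location's rows;
        # draw one of them per simulation.
        picked = rows[rng.integers(0, len(rows), size=simulations)]
        # sim_idx has no duplicates, so fancy-indexed += is safe here.
        share_tot[sim_idx, partner_code[picked]] += share[picked]
        ratings_tot += ratings[picked]

    out = pd.DataFrame(share_tot.round(2),
                       columns=[f'share_{p}' for p in partners])
    out['ratings'] = ratings_tot
    return out


result = simulate(df, old_df, simulations=100, seed=0)
print(result.head())
```

The remaining Python loop runs once per location, so its cost should grow with the number of locations rather than with the number of iterations. Whether this is actually faster on larger data is worth timing with the same `time.time()` approach used in the question.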


Tags: sample, share, df, partner, time, location, random, all