<p>I think I found an easier way to solve the problem:</p>
<p>Starting from an empty dictionary, I insert all the keys of df_base into it, like this:</p>
<pre><code>product_keys = {}
product_keys = df_base['product_key'].drop_duplicates().reset_index(drop=True).to_dict()
</code></pre>
<p>The resulting dictionary will look something like:</p>
<pre><code> {0: 2,
1: 1,
2: 31
}
</code></pre>
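<p>For reference, the dictionary construction above can be reproduced on toy data (the <code>product_key</code> values here are hypothetical, chosen to match the sample output):</p>
<pre><code>import pandas as pd

# Toy stand-in for df_base; the product_key values are hypothetical,
# chosen to match the sample output above.
df_base = pd.DataFrame({'product_key': [2, 1, 1, 31, 2]})

# Deduplicate (keeping first occurrences), renumber from 0, and turn
# the Series into a {position: product_key} dict.
product_keys = (
    df_base['product_key']
    .drop_duplicates()
    .reset_index(drop=True)
    .to_dict()
)
print(product_keys)  # {0: 2, 1: 1, 2: 31}
</code></pre>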
<p>After this step, using df.apply() I can iterate over each row of the dataframe and replace the product key values with the keys of the dictionary I just created:</p>
<pre><code>df_product_final['Product'] = df_base['product_key']
df_product_final = df_product_final.apply(
    self.keys_from_value,
    dict=product_keys,
    axis='columns',
    raw=False,
    result_type='broadcast',
)
</code></pre>
<p>self.keys_from_value:</p>
<pre><code>def keys_from_value(self, row, dict):
    if row is None:
        return None
    # Reverse lookup: find the dict key whose value matches the row's
    # product key (assumes the dict values are unique).
    row['Product'] = list(dict.keys())[list(dict.values()).index(row['Product'])]
    return row
</code></pre>
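<p>A note on performance: <code>list(dict.values()).index(...)</code> scans the whole list once per row. If the dictionary values are unique (which <code>drop_duplicates()</code> guarantees here), an alternative sketch is to invert the dictionary once and map the whole column in a single vectorized pass (the data below is hypothetical):</p>
<pre><code>import pandas as pd

# Hypothetical data mirroring the setup above.
product_keys = {0: 2, 1: 1, 2: 31}
df = pd.DataFrame({'Product': [2, 1, 31, 1]})

# Invert the dict once (product_key -> positional key).
inverse = {v: k for k, v in product_keys.items()}

# Series.map replaces every value in one vectorized pass,
# with no per-row list scan.
df['Product'] = df['Product'].map(inverse)
print(df['Product'].tolist())  # [0, 1, 2, 1]
</code></pre>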
<p>The last step is to compute the correct SpeedAvg and insert it into the dataframe (this is easy: the first loop assigns a group id to each row based on the rows just modified; the second loop inserts the SpeedAvg based on that group id):</p>
<pre><code>gid = 0
for i, row in df_base.iterrows():
    if row['diff'] != 0:
        gid += 1
    df_base.at[i, 'group_id'] = gid

# avg is a Pandas Series holding the SpeedAvg of every group, indexed by group id
avg = df_product_final['Speed'].groupby(df_base['group_id']).mean()

for i, row in df_product_final.iterrows():
    for row_avg in avg.index.values.tolist():
        if row.at['group_id'] == row_avg:
            df_product_final.at[i, 'SpeedAvg'] = avg[row_avg]
</code></pre>
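<p>These loops can also be expressed without <code>iterrows()</code>: a counter bumped whenever <code>diff != 0</code> is just a cumulative sum of that boolean mask, and broadcasting each group's mean back onto every row of the group is exactly what <code>groupby().transform('mean')</code> does. A minimal sketch on toy data (the numbers are hypothetical), assuming <code>Speed</code> and <code>group_id</code> live on the same dataframe:</p>
<pre><code>import pandas as pd

# Toy stand-in for df_base / df_product_final; the numbers are hypothetical.
df = pd.DataFrame({
    'diff':  [1, 0, 0, 1, 0],
    'Speed': [10.0, 20.0, 30.0, 60.0, 90.0],
})

# First loop: a counter bumped whenever diff != 0 is a cumulative
# sum of the boolean mask.
df['group_id'] = (df['diff'] != 0).cumsum()

# Second pair of loops: transform('mean') computes each group's mean
# and broadcasts it back onto every row of that group.
df['SpeedAvg'] = df.groupby('group_id')['Speed'].transform('mean')

print(df['group_id'].tolist())  # [1, 1, 1, 2, 2]
print(df['SpeedAvg'].tolist())  # [20.0, 20.0, 20.0, 75.0, 75.0]
</code></pre>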
<p>This is the resulting dataframe (df_product_final) after the steps above:</p>
<pre><code> Product Speed SpeedAvg
newindex
2020-10-20 09:00:00+00:00 0 0.000000 0.000000
2020-10-20 09:00:00+00:00 1 0.000000 104.528338
2020-10-20 10:00:00+00:00 1 0.000000 104.528338
2020-10-20 11:00:00+00:00 1 0.000000 104.528338
2020-10-20 12:00:00+00:00 1 68.375000 104.528338
2020-10-20 13:00:00+00:00 1 188.074074 104.528338
2020-10-20 14:00:00+00:00 1 172.192982 104.528338
2020-10-20 15:00:00+00:00 1 162.553571 104.528338
2020-10-20 16:00:00+00:00 1 178.867925 104.528338
2020-10-20 17:00:00+00:00 1 181.844828 104.528338
2020-10-20 18:00:00+00:00 1 93.375000 104.528338
2020-10-19 20:00:00+00:00 0 0.000000 0.000000
2020-10-19 21:00:00+00:00 0 0.000000 0.000000
2020-10-19 22:00:00+00:00 0 0.000000 0.000000
2020-10-19 23:00:00+00:00 0 0.000000 0.000000
2020-10-20 00:00:00+00:00 0 0.000000 0.000000
2020-10-20 01:00:00+00:00 0 0.000000 0.000000
2020-10-20 02:00:00+00:00 0 0.000000 0.000000
2020-10-20 03:00:00+00:00 0 0.000000 0.000000
2020-10-20 04:00:00+00:00 0 0.000000 0.000000
2020-10-20 05:00:00+00:00 0 0.000000 0.000000
2020-10-20 06:00:00+00:00 0 0.000000 0.000000
2020-10-20 07:00:00+00:00 0 0.000000 0.000000
2020-10-20 08:00:00+00:00 0 0.000000 0.000000
2020-10-20 09:00:00+00:00 2 0.000000 95.025762
2020-10-20 10:00:00+00:00 2 0.000000 95.025762
2020-10-20 11:00:00+00:00 2 0.000000 95.025762
2020-10-20 12:00:00+00:00 2 68.375000 95.025762
2020-10-20 13:00:00+00:00 2 188.074074 95.025762
2020-10-20 14:00:00+00:00 2 172.192982 95.025762
2020-10-20 15:00:00+00:00 2 162.553571 95.025762
2020-10-20 16:00:00+00:00 2 178.867925 95.025762
2020-10-20 17:00:00+00:00 2 181.844828 95.025762
2020-10-20 18:00:00+00:00 2 93.375000 95.025762
2020-10-20 19:00:00+00:00 2 0.000000 95.025762
</code></pre>