<p>I used Pandas' <code>apply</code> function together with a custom function applied to a Pandas <code>groupby</code> object.</p>
<p>(Credit for the custom groupby approach: <a href="https://medium.com/@sean.turner026/applying-custom-functions-to-groupby-objects-in-pandas-61af58955569" rel="nofollow noreferrer">https://medium.com/@sean.turner026/applying-custom-functions-to-groupby-objects-in-pandas-61af58955569</a>)</p>
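<p>As a minimal sketch of these two building blocks (not the code from this answer): a row-wise <code>apply</code> with <code>axis=1</code> builds one value per row, and a custom function passed to <code>groupby(...).apply</code> is called once per group. The toy data and the helper name <code>largest_minus_smallest</code> are assumptions made purely for illustration.</p>

```python
import pandas as pd

# Toy data, assumed for illustration only
df = pd.DataFrame({
    "leid": [101, 102, 102, 101],
    "currency": ["YEN", "IDR", "INR", "USD"],
    "amount": [345, 900, 223, 845],
})

# Row-wise apply: with axis=1 the function receives one row at a time
df["current"] = df.apply(lambda row: {row["currency"]: row["amount"]}, axis=1)


def largest_minus_smallest(amounts):
    """Hypothetical custom function; it is called once per group."""
    return amounts.max() - amounts.min()


# Group-wise apply: one call per distinct leid, one result per group
spread = df.groupby("leid")["amount"].apply(largest_minus_smallest)

print(df["current"].tolist())
print(spread.to_dict())
```

<p>The custom function here returns a single number per group; the answer's own function instead returns the whole group with an extra column.</p>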
<p>I also modified your input to show some of the possible results.</p>
<p>The code is as follows:</p>
<pre><code>import pandas as pd

# the table copied from your question
table = """
leid run_seq cp_id products currency amount
101 1 201 A YEN 345
102 1 201 A IDR 900
102 2 201 B INR 223
101 2 202 A USD 845
102 3 201 C USD 345
"""
with open("stackoverflow.csv", "w") as f:
    f.write(table)
# sep=r"\s+" splits on runs of whitespace (replaces the deprecated delim_whitespace=True)
df = pd.read_csv("stackoverflow.csv", sep=r"\s+")
df = df.sort_values(by=["leid", "run_seq"]).reset_index(drop=True)

# build each row's nested dict with a row-wise apply (axis=1)
df["current"] = df.apply(lambda x: {x["cp_id"]: {x["products"]: {x["currency"]: x["amount"]}}}, axis=1)

# shallow-merge two dictionaries; on a key clash the value from dict2 wins
def Merge(dict1, dict2):
    res = {**dict1, **dict2}
    return res

# custom cumulative function: each row's history is built from the rows before it
def cumsumdict(data):
    current_dict = [{}]
    for i in range(1, data.shape[0]):
        cp_id = list(data["current"].iloc[i-1])[0]
        product = list(data["current"].iloc[i-1][cp_id])[0]
        currency = list(data["current"].iloc[i-1][cp_id][product])[0]
        if cp_id in current_dict[-1]:
            # merge cp_id using dictionary merge if it exists in a previous trx
            cp_merger = Merge(current_dict[-1][cp_id], data["current"].iloc[i-1][cp_id])
            # appender is the same object as current_dict[-1], so the updates
            # below also change the previous history entry
            appender = current_dict[-1]
            appender[cp_id] = cp_merger
            if product in current_dict[-1][cp_id]:
                # merge products using dictionary merge if they exist in a previous trx
                product_merger = Merge(current_dict[-1][cp_id][product], data["current"].iloc[i-1][cp_id][product])
                appender = current_dict[-1]
                appender[cp_id][product] = product_merger
                if currency in current_dict[-1][cp_id][product]:
                    # sum the currency value
                    currency_merger = current_dict[-1][cp_id][product][currency] + data["current"].iloc[i-1][cp_id][product][currency]
                    appender = current_dict[-1]
                    appender[cp_id][product][currency] = currency_merger
        else:
            appender = Merge(current_dict[-1], data["current"].iloc[i-1])
        current_dict.append(appender)
    data["history"] = current_dict
    return data

# group_keys=False keeps the original row index instead of adding leid as an index level
df = df.groupby(["leid"], group_keys=False).apply(cumsumdict)
df = df[["leid", "run_seq", "current", "history"]]
print(df)
</code></pre>
<p>The code above yields a result like the following:</p>
<pre><code> leid run_seq current \
0 101 1 {201: {'A': {'YEN': 345}}}
3 101 2 {202: {'A': {'USD': 845}}}
1 102 1 {201: {'A': {'IDR': 900}}}
2 102 2 {201: {'B': {'INR': 223}}}
4 102 3 {201: {'C': {'USD': 345}}}
history
0 {}
3 {201: {'A': {'YEN': 345}}}
1 {}
2 {201: {'A': {'IDR': 900}, 'B': {'INR': 446}}}
4 {201: {'A': {'IDR': 900}, 'B': {'INR': 446}}}
</code></pre>
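<p>Note that the <code>'INR': 446</code> entry is double the single 223 INR transaction in the input, and that it already appears in the history of <code>run_seq</code> 2. This appears to happen because <code>cumsumdict</code> reuses <code>current_dict[-1]</code> as <code>appender</code>, so consecutive history entries share the same dictionary and the freshly merged amount is then added to itself. Below is a minimal sketch of one way to avoid this, assuming the goal is a per-<code>leid</code> snapshot of all earlier transactions. The helper names <code>deep_merge</code> and <code>history_before_each_row</code> are hypothetical and not part of the code above.</p>

```python
import copy

import pandas as pd


def deep_merge(total, new):
    """Return a new dict: nested dicts merged recursively, numeric leaves summed."""
    merged = copy.deepcopy(total)
    for key, value in new.items():
        if isinstance(value, dict):
            merged[key] = deep_merge(merged.get(key, {}), value)
        else:
            merged[key] = merged.get(key, 0) + value
    return merged


def history_before_each_row(current):
    """For each row, the merged total of all earlier rows in the same group."""
    history, running = [], {}
    for row_dict in current:
        history.append(running)                  # snapshot taken before this row
        running = deep_merge(running, row_dict)  # new object, old snapshots untouched
    return history


# Same sample data as the table above
df = pd.DataFrame({
    "leid": [101, 102, 102, 101, 102],
    "run_seq": [1, 1, 2, 2, 3],
    "cp_id": [201, 201, 201, 202, 201],
    "products": ["A", "A", "B", "A", "C"],
    "currency": ["YEN", "IDR", "INR", "USD", "USD"],
    "amount": [345, 900, 223, 845, 345],
})
df = df.sort_values(["leid", "run_seq"])
df["current"] = [
    {cp: {prod: {cur: amt}}}
    for cp, prod, cur, amt in zip(df["cp_id"], df["products"], df["currency"], df["amount"])
]

# Build the history group by group, then align it back on the row index
parts = [
    pd.Series(history_before_each_row(group["current"]), index=group.index)
    for _, group in df.groupby("leid")
]
df["history"] = pd.concat(parts)
print(df[["leid", "run_seq", "current", "history"]])
```

<p>With the sample table, the history for <code>leid</code> 102 becomes <code>{201: {'A': {'IDR': 900}}}</code> at <code>run_seq</code> 2 and <code>{201: {'A': {'IDR': 900}, 'B': {'INR': 223}}}</code> at <code>run_seq</code> 3. Iterating over the groups rather than calling <code>groupby(...).apply</code> is a design choice of this sketch: it avoids depending on how a given pandas version handles the grouping column inside <code>apply</code>.</p>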