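<p>Thanks for your help. I had trouble getting it to work fully, but after more searching I ended up using intersection() and difference() to split the index. The final code is below.</p>
<p>I also tried calling reset_index and set_index inside a for loop, but the change does not persist outside the loop. What should I do to fix that?</p>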
<pre><code>DF_Set_Index_List = [report_same_df, report_eod_df, report_intra_df, endofday_same_df, intraday_same_df, endofday_only_df, intraday_only_df]
for df_sil in DF_Set_Index_List:
    print(df_sil,'\n\n\n')
    df_sil = df_sil.set_index(df_default_index)
    print('INDEX: \n', df_sil.index)
print('\n\n', 'Outside Loop')
print(report_same_df)
print('\n\n')
print(report_same_df.index)
</code></pre>
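<p>For what it's worth, the likely reason the change does not stick is that <code>df_sil = df_sil.set_index(...)</code> only rebinds the loop variable to a new DataFrame; the names stored in the list still point at the originals. A minimal sketch of one way around this, assuming the frames are kept in a dict and written back by key (the two small frames and the name <code>frames</code> are made up for illustration):</p>

```python
import pandas as pd

keys = ['EvaluateDate', 'InvoiceNumber', 'InvoiceItem']

# Hypothetical stand-ins for the real DataFrames in the post
frames = {
    'report_same_df': pd.DataFrame({
        'EvaluateDate': ['08/06/2021'], 'InvoiceNumber': [123697],
        'InvoiceItem': [0], 'Cost': [-3569]}),
    'report_eod_df': pd.DataFrame({
        'EvaluateDate': ['08/06/2021'], 'InvoiceNumber': [123699],
        'InvoiceItem': [0], 'Cost': [-4273]}),
}

# set_index returns a new DataFrame, so store the result back under the same key
frames = {name: frame.set_index(keys) for name, frame in frames.items()}

print(frames['report_same_df'].index.names)
```

<p>After the loop, look the frames up through the dict (for example <code>frames['report_same_df']</code>) rather than through the old variable names.</p>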
<p>Any feedback on the final code below is welcome: cleanup, more efficient approaches, and so on.</p>
<pre><code>#!/usr/bin/python
import pandas as pd
Invoices = {'EvaluatePoint': ['EndOfDay', 'EndOfDay', 'EndOfDay', 'EndOfDay', 'EndOfDay', 'EndOfDay', 'IntraDay', 'IntraDay', 'IntraDay', 'IntraDay', 'IntraDay'],
'EvaluateDate': ['08/06/2021','08/06/2021','08/06/2021','08/06/2021','08/06/2021','08/06/2021','08/06/2021','08/06/2021','08/06/2021','08/06/2021','08/06/2021'],
'InvoiceNumber': [123697, 123697, 123697, 123698, 123699, 123699, 123696, 123697, 123697, 123697, 123698],
'InvoiceItem': [0,1,2,0,0,1,0,0,1,2,0],
'Cost': [-3569,-3745,-3921,-4097,-4273,-4449,-4625,-3569,-3745,-4678,-5329],
'Proceeds': [7000,7569,8138,8707,9276,9845,10414,7000,7569,8138,12690],
'NetAmount': [3431,3824,4217,4610,5003,5396,5789,3431,3824,3460,7361]
}
df = pd.DataFrame(Invoices, columns = ['EvaluatePoint', 'EvaluateDate','InvoiceNumber','InvoiceItem','Cost','Proceeds','NetAmount'])
endofday_df = df[df.EvaluatePoint == 'EndOfDay']
intraday_df = df[df.EvaluatePoint == 'IntraDay']
# Set Default Index to compare DataFrames
df_default_index = ['EvaluateDate','InvoiceNumber','InvoiceItem']
endofday_df = endofday_df.set_index(df_default_index)
intraday_df = intraday_df.set_index(df_default_index)
# Create var where index exists in both EndOfDay and IntraDay
idx_same = endofday_df.index.intersection(intraday_df.index)
# Create var where index exists only in EndOfDay or IntraDay
idx_eod_only = endofday_df.index.difference(intraday_df.index)
idx_intra_only = intraday_df.index.difference(endofday_df.index)
#create DataFrames where index is the same between EndOfDay and IntraDay
endofday_same_df = endofday_df.loc[idx_same]
intraday_same_df = intraday_df.loc[idx_same]
#create DataFrames where index is the only in EndOfDay or IntraDay
endofday_only_df = endofday_df.loc[idx_eod_only]
intraday_only_df = intraday_df.loc[idx_intra_only]
# Reset Index so we have only raw data
endofday_same_df = endofday_same_df.reset_index()
intraday_same_df = intraday_same_df.reset_index()
endofday_only_df = endofday_only_df.reset_index()
intraday_only_df = intraday_only_df.reset_index()
# Create Base Report
report_same_df = endofday_same_df[['EvaluateDate','InvoiceNumber','InvoiceItem']]
report_eod_df = endofday_only_df[['EvaluateDate','InvoiceNumber','InvoiceItem']]
report_intra_df = intraday_only_df[['EvaluateDate','InvoiceNumber','InvoiceItem']]
report_same_df = report_same_df.set_index(df_default_index)
report_eod_df = report_eod_df.set_index(df_default_index)
report_intra_df = report_intra_df.set_index(df_default_index)
endofday_same_df = endofday_same_df.set_index(df_default_index)
intraday_same_df = intraday_same_df.set_index(df_default_index)
endofday_only_df = endofday_only_df.set_index(df_default_index)
intraday_only_df = intraday_only_df.set_index(df_default_index)
DiffColumns = ['Cost','Proceeds','NetAmount']
for col in DiffColumns:
    report_same_df['DiffOf' + str(col)] = endofday_same_df[col] - intraday_same_df[col]
    report_eod_df['DiffOf' + str(col)] = endofday_only_df[col]
    report_intra_df['DiffOf' + str(col)] = -intraday_only_df[col]
report_df = pd.concat([report_same_df, report_eod_df, report_intra_df])
print(report_df)
report_df.to_csv("DiffReport.csv", index=True, header=True)
</code></pre>
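<p>As one possible simplification, the three cases (key in both, only in EndOfDay, only in IntraDay) might collapse into a single aligned subtraction with <code>DataFrame.sub(..., fill_value=0)</code>, which treats a row missing on one side as zero. This is only a sketch on a reduced, made-up sample with the same column layout. It assumes each key combination appears at most once per EvaluatePoint, and its output differs from the code above in two ways: rows come back sorted by index rather than grouped by case, and the values are floats.</p>

```python
import pandas as pd

# Small hypothetical sample with the same column layout as the post
df = pd.DataFrame({
    'EvaluatePoint': ['EndOfDay', 'EndOfDay', 'IntraDay', 'IntraDay'],
    'EvaluateDate': ['08/06/2021'] * 4,
    'InvoiceNumber': [123697, 123699, 123696, 123697],
    'InvoiceItem': [2, 0, 0, 2],
    'Cost': [-3921, -4273, -4625, -4678],
    'Proceeds': [8138, 9276, 10414, 8138],
    'NetAmount': [4217, 5003, 5789, 3460],
})

keys = ['EvaluateDate', 'InvoiceNumber', 'InvoiceItem']
diff_columns = ['Cost', 'Proceeds', 'NetAmount']

# Index both halves on the comparison keys and keep only the numeric columns
endofday = df[df.EvaluatePoint == 'EndOfDay'].set_index(keys)[diff_columns]
intraday = df[df.EvaluatePoint == 'IntraDay'].set_index(keys)[diff_columns]

# Rows are aligned on the union of both indexes; a side with no matching row counts as 0
report_df = endofday.sub(intraday, fill_value=0).add_prefix('DiffOf')
print(report_df)
```

<p>Rows present only in IntraDay come out negated, and rows present only in EndOfDay come out unchanged, which matches the sign convention in the loop above.</p>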