Comparing two different pandas DataFrames and dropping rows in Python

Posted 2024-10-01 11:22:27


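I am struggling with the following problem. I have two DataFrames, df1 and df2, and I want to compare them on the column transportation, then look up the country in df1 and drop the holiday dates defined for each country, as shown in the code below. When I run this, I get the following error message: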

ValueError: Can only compare identically-labeled Series objects

The code looks like this:

from pandas.tseries.holiday import (
AbstractHolidayCalendar, EasterMonday,
GoodFriday, Holiday, next_monday,
Easter, nearest_workday, Day, USMartinLutherKingJr,
USPresidentsDay, USMemorialDay, USLaborDay,
USThanksgivingDay)

class GermanHoliday(AbstractHolidayCalendar):
    rules = [
             Holiday('New Years Day', month=1, day=1, observance=next_monday),
             GoodFriday,
             EasterMonday,
             Holiday('Reformation Day', year=2017, month=10, day=31, observance=nearest_workday),
             Holiday('Labour Day', month=5, day=1, observance=nearest_workday),
             Holiday('Whit Monday', month=1, day=1, offset=[Easter(), Day(50)]),
             Holiday('Day of German Unity', month=10, day=3, observance=nearest_workday),
             Holiday('Christmas Day', month=12, day=25, observance=nearest_workday),
             Holiday('Boxing Day',month=12, day=26, observance=nearest_workday) 
    ]

class USHolidays(AbstractHolidayCalendar):
    rules = [
             Holiday('NewYearsDay', month=1, day=1, observance=nearest_workday),
             USMartinLutherKingJr,
             USPresidentsDay,
             GoodFriday,
             USMemorialDay,
             Holiday('USIndependenceDay', month=7, day=4, observance=nearest_workday),
             USLaborDay,
             USThanksgivingDay,
             Holiday('Christmas Day', month=12, day=25, observance=nearest_workday)
             ]

calendarGermany = GermanHoliday()
calendarUS = USHolidays()

holidaysGermany = calendarGermany.holidays().to_pydatetime()
holidaysUS = calendarUS.holidays().to_pydatetime()

qry = "Transportation in @df1.ticker and Date not in @holidaysGermany "

df2 = df2.query(qry)

The DataFrames df1 and df2 are structured as follows:

df1:

^{pr2}$

df2:

   Date         transportation price
0  2015-12-21   ICE            81.9924
1  2015-12-22   ICE            81.5173
2  2015-12-23   ICE            83.5015
3  2015-12-24   ICE            83.5015
4  2015-12-25   ICE            83.5015
5  2015-12-28   ICE            83.0357
6  2015-12-29   ICE            84.6286
7  2015-12-30   ICE            83.7250
8  2015-12-31   ICE            83.7250
9  2016-01-01   ICE            83.7250
10 2015-12-21   National       127.3900
11 2015-12-22   National       129.0000
12 2015-12-23   National       131.8800
13 2015-12-24   National       131.8800
14 2015-12-25   National       131.8800
15 2015-12-28   National       130.0300
16 2015-12-29   National       132.1700
...

The final result should look like this:

df2:

   Date         transportation price
0  2015-12-21   ICE            81.9924
1  2015-12-22   ICE            81.5173
2  2015-12-23   ICE            83.5015
3  2015-12-24   ICE            83.5015
4  2015-12-28   ICE            83.0357
5  2015-12-29   ICE            84.6286
6  2015-12-30   ICE            83.7250
7  2015-12-31   ICE            83.7250
8  2016-01-01   ICE            83.7250
9  2015-12-21   National       127.3900
10 2015-12-22   National       129.0000
11 2015-12-23   National       131.8800
12 2015-12-26   National       131.8800
13 2015-12-28   National       130.0300
14 2015-12-29   National       132.1700
...

1 answer
User
Answer 1 · posted 2024-10-01 11:22:27

You can do it like this:

In [197]: qry = "transportation in @df1.transportation and \
     ...:        Date not in ['2015-12-24','2015-12-25']"

In [198]: df2.query(qry)
Out[198]:
         Date transportation     price
0  2015-12-21            ICE   81.9924
1  2015-12-22            ICE   81.5173
2  2015-12-23            ICE   83.5015
5  2015-12-28            ICE   83.0357
6  2015-12-29            ICE   84.6286
7  2015-12-30            ICE   83.7250
8  2015-12-31            ICE   83.7250
9  2016-01-01            ICE   83.7250
10 2015-12-21       National  127.3900
11 2015-12-22       National  129.0000
12 2015-12-23       National  131.8800
15 2015-12-28       National  130.0300
16 2015-12-29       National  132.1700
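
The query above hard-codes the dates to drop. If the dates should come from the per-country holiday calendars in the question instead, one possible approach is a merge followed by a boolean mask. The following is only a minimal sketch under stated assumptions: the contents of df1 are not shown in the question, so the `country` column and its values are hypothetical, the two calendars are cut down to a few rules to keep the example short, and `Date` is assumed to be (or to be converted to) a datetime column.

```python
import pandas as pd
from pandas.tseries.holiday import AbstractHolidayCalendar, Holiday

# Deliberately tiny calendars; the full GermanHoliday / USHolidays
# classes from the question could be used in their place.
class GermanHoliday(AbstractHolidayCalendar):
    rules = [
        Holiday('Christmas Day', month=12, day=25),
        Holiday('Boxing Day', month=12, day=26),
    ]

class USHolidays(AbstractHolidayCalendar):
    rules = [Holiday('Christmas Day', month=12, day=25)]

# ASSUMPTION: df1 maps each transportation to a country.
# The real df1 is not shown in the question.
df1 = pd.DataFrame({'transportation': ['ICE', 'National'],
                    'country': ['Germany', 'US']})

# A small excerpt shaped like df2, with Date parsed as datetime.
df2 = pd.DataFrame({
    'Date': pd.to_datetime(['2015-12-24', '2015-12-25', '2015-12-28'] * 2),
    'transportation': ['ICE'] * 3 + ['National'] * 3,
    'price': [83.5015, 83.5015, 83.0357, 131.88, 131.88, 130.03],
})

calendars = {'Germany': GermanHoliday(), 'US': USHolidays()}

# Holiday dates per country, limited to the date range present in df2.
start, end = df2['Date'].min(), df2['Date'].max()
holidays = {country: cal.holidays(start=start, end=end)
            for country, cal in calendars.items()}

# Attach the country to every df2 row; the inner join also drops
# transportation values that do not occur in df1.
merged = df2.merge(df1, on='transportation', how='inner')

# Mark rows whose date is a holiday in that row's own country.
is_holiday = pd.Series(False, index=merged.index)
for country, dates in holidays.items():
    is_holiday |= (merged['country'] == country) & merged['Date'].isin(dates)

# Keep the non-holiday rows and restore df2's original columns.
result = merged.loc[~is_holiday, df2.columns].reset_index(drop=True)
print(result)
```

Using `isin` on a datetime column avoids comparing Series elementwise, and `reset_index(drop=True)` renumbers the rows from 0 as in the desired output shown in the question.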
