"SyntaxError: invalid syntax" when adding a column to a DataFrame with an if condition

Posted 2024-10-05 18:46:18


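I want to add a new column, dfout['EXCHANGE_RATIO'], whose values are taken from another DataFrame (dfc['EXCHANGE_RATIO']) only for rows where dfout['CURRENCY'] != 'EUR'. For those rows, I look up the currency in dfc['CURRENCY_SOURCE'] and take the dfc['EXCHANGE_RATIO'] value from the same row.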

dfout looks like this:

                DATE_PROCESS  BOOKING_ID DEP_AIRPORT ARR_AIRPORT       DEPARTURE_DATE         ARRIVAL_DATE    PRICE CURRENCY
0    2013-04-19 16:04:13 UTC    76969972         AEL         DEL  2013-04-18 00:00:00                  NaN   409.04      EUR
1    2014-04-17 02:26:46 UTC    76888867         ARP         ZAL  2014-04-19 00:00:00                  NaN   280.70      EUR

dfc looks like this:

    CURRENCY_SOURCE CURRENCY_TARGET  EXCHANGE_RATIO
0               TRL             EUR    9.900000e-08
1               VES             EUR    3.220000e-07

I have tried the following two approaches, but both raise "SyntaxError: invalid syntax". Why?

dfout['EXCHANGE_RATIO'] = dfout['CURRENCY'].apply(lambda x: dfc.query('CURRENCY_SOURCE'==x)['EXCHANGE_RATIO'] if x != 'EUR')

dfout['EXCHANGE_RATIO'] = dfout['CURRENCY'].apply(lambda x: dfc.loc[dfc['CURRENCY_SOURCE'] == x, 'EXCHANGE_RATIO'].iloc[-1] if x != 'EUR')

2 answers

You can use the Series.map() method:

dfout['EXCHANGE_RATIO'] = dfout['CURRENCY'] \
    .map(dict(zip(dfc['CURRENCY_SOURCE'], dfc['EXCHANGE_RATIO'])))

For example, with a dfout like this:

              DATE_PROCESS  BOOKING_ID DEP_AIRPORT ARR_AIRPORT       DEPARTURE_DATE  ARRIVAL_DATE   PRICE CURRENCY
0  2013-04-19 16:04:13 UTC    76969972         AEL         DEL  2013-04-18 00:00:00           NaN  409.04      EUR
1  2014-04-17 02:26:46 UTC    76888867         ARP         ZAL  2014-04-19 00:00:00           NaN  280.70      EUR
2  2014-04-17 02:26:46 UTC    76888867         ARP         ZAL  2014-04-19 00:00:00           NaN  280.70      TRL
3  2014-04-17 02:26:46 UTC    76888867         ARP         ZAL  2014-04-19 00:00:00           NaN  280.70      VES

you get the following output:

              DATE_PROCESS  BOOKING_ID DEP_AIRPORT ARR_AIRPORT       DEPARTURE_DATE  ARRIVAL_DATE   PRICE CURRENCY  EXCHANGE_RATIO
0  2013-04-19 16:04:13 UTC    76969972         AEL         DEL  2013-04-18 00:00:00           NaN  409.04      EUR             NaN
1  2014-04-17 02:26:46 UTC    76888867         ARP         ZAL  2014-04-19 00:00:00           NaN  280.70      EUR             NaN
2  2014-04-17 02:26:46 UTC    76888867         ARP         ZAL  2014-04-19 00:00:00           NaN  280.70      TRL    9.900000e-08
3  2014-04-17 02:26:46 UTC    76888867         ARP         ZAL  2014-04-19 00:00:00           NaN  280.70      VES    3.220000e-07

If you want to replace those NaN values, you can use fillna():

dfout['EXCHANGE_RATIO'] = dfout['CURRENCY'] \
    .map(dict(zip(dfc['CURRENCY_SOURCE'], dfc['EXCHANGE_RATIO']))) \
    .fillna(1)  # or whatever you want there
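The mapping approach above can be reproduced end to end with a small sketch. The frames below are cut-down stand-ins for dfout and dfc built from the sample rows in this thread (only the PRICE and CURRENCY columns are kept, which is an assumption made for brevity), and the variable name rates is illustrative.

```python
import pandas as pd

# Cut-down stand-ins for the frames shown in the thread (assumed subset of columns).
dfc = pd.DataFrame({
    "CURRENCY_SOURCE": ["TRL", "VES"],
    "CURRENCY_TARGET": ["EUR", "EUR"],
    "EXCHANGE_RATIO": [9.9e-08, 3.22e-07],
})
dfout = pd.DataFrame({
    "PRICE": [409.04, 280.70, 280.70, 280.70],
    "CURRENCY": ["EUR", "EUR", "TRL", "VES"],
})

# Build a currency -> ratio lookup table from dfc.
rates = dict(zip(dfc["CURRENCY_SOURCE"], dfc["EXCHANGE_RATIO"]))

# map() returns NaN for currencies missing from the lookup (here: EUR);
# fillna(1) then replaces those NaN values with 1.
dfout["EXCHANGE_RATIO"] = dfout["CURRENCY"].map(rates).fillna(1)

print(dfout)
```

Note that fillna(1) fills every unmatched row, so a non-EUR currency that is absent from dfc would also receive 1. If that matters, leave the NaN values in place and handle EUR rows explicitly.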

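This corrects the query syntax and adds an else so that the code works. A conditional expression (a if cond else b) requires the else branch, which is why both of your attempts raise a SyntaxError. Inside query(), the lambda argument x has to be referenced as @x. The first match is taken by position with .iloc[0], because the rows returned by query() keep their original index labels.

import numpy as np
dfout['EXCHANGE_RATIO'] = dfout['CURRENCY'].apply(lambda x: dfc.query('CURRENCY_SOURCE==@x')['EXCHANGE_RATIO'].iloc[0] if x != 'EUR' else np.nan)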
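As a rough illustration of the cause of the error, the sketch below first checks that a lambda whose conditional expression lacks an else does not compile, then applies a lookup that returns NaN for EUR. The helper lookup_ratio and the cut-down frames are hypothetical, written for this example only; the helper also returns NaN for a currency missing from dfc, which is an assumption about the desired behaviour rather than something stated in the question.

```python
import numpy as np
import pandas as pd

# Cut-down stand-ins for the frames shown in the thread.
dfc = pd.DataFrame({
    "CURRENCY_SOURCE": ["TRL", "VES"],
    "CURRENCY_TARGET": ["EUR", "EUR"],
    "EXCHANGE_RATIO": [9.9e-08, 3.22e-07],
})
dfout = pd.DataFrame({"CURRENCY": ["EUR", "EUR", "TRL", "VES"]})

# A conditional expression needs both branches: "a if cond" alone is not valid Python.
try:
    compile("f = lambda x: x if x != 'EUR'", "<example>", "exec")
    missing_else_compiles = True
except SyntaxError:
    missing_else_compiles = False


def lookup_ratio(currency):
    """Hypothetical helper: ratio for a non-EUR currency, NaN otherwise."""
    if currency == "EUR":
        return np.nan
    matches = dfc.loc[dfc["CURRENCY_SOURCE"] == currency, "EXCHANGE_RATIO"]
    # Assumption: a currency missing from dfc also yields NaN instead of an error.
    return matches.iloc[0] if not matches.empty else np.nan


dfout["EXCHANGE_RATIO"] = dfout["CURRENCY"].apply(lookup_ratio)
print(missing_else_compiles)
print(dfout)
```

A named function is used here instead of a lambda only to make room for the comments and the missing-currency check; the same logic fits in a lambda as long as the else branch is present.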

Input dfout:

              DATE_PROCESS  BOOKING_ID DEP_AIRPORT ARR_AIRPORT       DEPARTURE_DATE  ARRIVAL_DATE   PRICE CURRENCY
0  2013-04-19 16:04:13 UTC    76969972         AEL         DEL  2013-04-18 00:00:00           NaN  409.04      EUR
1  2014-04-17 02:26:46 UTC    76888867         ARP         ZAL  2014-04-19 00:00:00           NaN  280.70      EUR
2  2014-04-17 02:26:46 UTC    76888867         ARP         ZAL  2014-04-19 00:00:00           NaN  280.70      TRL

dfc

  CURRENCY_SOURCE CURRENCY_TARGET  EXCHANGE_RATIO
0             TRL             EUR    9.900000e-08
1             VES             EUR    3.220000e-07

Output:

              DATE_PROCESS  BOOKING_ID DEP_AIRPORT ARR_AIRPORT       DEPARTURE_DATE  ARRIVAL_DATE   PRICE CURRENCY  EXCHANGE_RATIO
0  2013-04-19 16:04:13 UTC    76969972         AEL         DEL  2013-04-18 00:00:00           NaN  409.04      EUR             NaN
1  2014-04-17 02:26:46 UTC    76888867         ARP         ZAL  2014-04-19 00:00:00           NaN  280.70      EUR             NaN
2  2014-04-17 02:26:46 UTC    76888867         ARP         ZAL  2014-04-19 00:00:00           NaN  280.70      TRL    9.900000e-08
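One detail worth checking when taking the first match from a query() result: the returned rows keep their original index labels, so indexing the resulting Series with [0] is a label lookup and only succeeds if the matching row happens to carry label 0. The sketch below, using a hypothetical helper first_match and the dfc sample from this thread, contrasts that with positional access via .iloc[0].

```python
import pandas as pd

dfc = pd.DataFrame({
    "CURRENCY_SOURCE": ["TRL", "VES"],
    "CURRENCY_TARGET": ["EUR", "EUR"],
    "EXCHANGE_RATIO": [9.9e-08, 3.22e-07],
})


def first_match(currency):
    """Hypothetical helper: Series of ratios for one currency, selected via query()."""
    # @currency refers to the local Python variable, not to a column name.
    return dfc.query("CURRENCY_SOURCE == @currency")["EXCHANGE_RATIO"]


ves = first_match("VES")

# The VES row keeps its original label (1), so position 0 and label 0 differ.
ves_labels = ves.index.tolist()
ves_by_position = ves.iloc[0]

try:
    ves[0]  # label-based lookup on an integer index: there is no label 0 here
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

print(ves_labels, ves_by_position, label_lookup_failed)
```

With the sample data in this thread the difference only shows up for VES, since TRL sits at label 0.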
