Why do I get NaN when I join two DataFrames (MultiIndex) that contain no NaN?

Posted 2024-10-06 12:23:12

I have two DataFrames, r1 and r2, that I want to index with a MultiIndex. r1 is built like this:

import numpy as np
import pandas as pd

a1 = ['iso3_o', 'iso3_d', 'year', 'ExportFoodAndLiveAnimals']
a=np.array([['CAN', 'USA', '1995.0', '5918210.506'],
       ['CAN', 'USA', '1996.0', '6988508.727'],
       ['CAN', 'USA', '1997.0', '7792977.258'],
       ['CAN', 'USA', '1998.0', '8177456.631'],
       ['CAN', 'USA', '1999.0', '8773990.755'],
       ['CAN', 'USA', '2000.0', '9650783.071'],
       ['CAN', 'USA', '2001.0', '10800432.88'],
       ['CAN', 'USA', '2002.0', '11348837.38'],
       ['CAN', 'USA', '2003.0', '11313334.46'],
       ['CAN', 'USA', '2004.0', '12337588.35'],
       ['CAN', 'USA', '2005.0', '13227226.96'],
       ['CAN', 'USA', '2006.0', '14236699.34'],
       ['CAN', 'USA', '2007.0', '15638919.3'],
       ['CAN', 'USA', '2008.0', '17449901.08'],
       ['CAN', 'USA', '2009.0', '14813089.89'],
       ['CAN', 'USA', '2010.0', '16399733.82']])
r1 = pd.DataFrame(a, columns=a1)
r1

r2 is defined as

^{pr2}$

Then I decided to join them on their MultiIndex levels.

So what I did was set the key columns as the index:

multi_r2 = r2.set_index(['iso3_o', 'iso3_d', 'year'])
multi_r1 = r1.set_index(['iso3_o', 'iso3_d', 'year'])
df = multi_r2.join(multi_r1)

When I join on 'iso3_o', 'iso3_d' and 'year', the resulting DataFrame df contains NaN.

Why does this happen?

Thanks in advance.


2 answers

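My question turned out to be simple, but I thought I would share the solution anyway. As EdChum pointed out, I had to change the data type of year, which I did in a series of steps. There may be a simpler way that I am not aware of; if you know one, please share it.

Extract the values and store them in a NumPy array: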

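import numpy as np

a = r1.values
# Drop column 2 (year). Older SciPy re-exported NumPy's delete as scipy.delete;
# calling NumPy directly avoids depending on that alias.
C = np.delete(a, 2, 1)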
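The column-removal step above can be sketched on its own as follows. The two-row array is a made-up stand-in for r1.values, with values copied from the question; this is only an illustration of np.delete along axis 1, not the author's exact data.

```python
import numpy as np

# Hypothetical stand-in for r1.values (two rows taken from the question).
a = np.array([["CAN", "USA", "1995.0", "5918210.506"],
              ["CAN", "USA", "1996.0", "6988508.727"]])

# Remove the column at position 2 (year); axis=1 means "columns".
C = np.delete(a, 2, axis=1)

print(C.shape)
print(C[0])
```

Note that every element of the array is a string, including year and the export value, because a NumPy array holds a single dtype.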

Create a numeric array for the year variable and concatenate it with the new array:

^{pr2}$

Take the column names of the original DataFrame r1 and reorder them so that year comes last:

cols=list(r1)
cols
cols.insert(len(cols)-1, cols.pop(cols.index('year')))
cols

Recreate the DataFrame r1 as

r1=pd.DataFrame(C1,columns= cols)
r1

Then repeat the steps from before:

multi_r2 = r2.set_index(['iso3_o', 'iso3_d','year'])
multi_r1 = r1.set_index(['iso3_o', 'iso3_d','year'])
df = multi_r2.join(multi_r1)

This worked for me.
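A possibly shorter route, sketched under the assumption that year is stored as strings such as '1995.0' (as in r1 above), is to convert the column in place with astype instead of rebuilding the array. The two-row frame is a hypothetical stand-in for r1.

```python
import pandas as pd

# Hypothetical two-row stand-in for r1; values copied from the question.
r1 = pd.DataFrame(
    [["CAN", "USA", "1995.0", "5918210.506"],
     ["CAN", "USA", "1996.0", "6988508.727"]],
    columns=["iso3_o", "iso3_d", "year", "ExportFoodAndLiveAnimals"],
)

# '1995.0' cannot be parsed as an integer directly, so go through float first.
r1["year"] = r1["year"].astype(float).astype(int)
# The export values are strings as well; convert them to numbers too.
r1["ExportFoodAndLiveAnimals"] = r1["ExportFoodAndLiveAnimals"].astype(float)

multi_r1 = r1.set_index(["iso3_o", "iso3_d", "year"])
print(multi_r1.index.get_level_values("year").tolist())
```

This keeps the column order unchanged, so the reordering step is not needed.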

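The year columns in r1 and r2 are both str, but the strings are not identical (r1 holds values such as '1995.0'), so the index keys never match and the join fills in NaN. Converting both columns to int fixes it: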

r1['year'] = [int(float(i)) for i in r1['year']]
r2['year'] = [int(i) for i in r2['year']]
multi_r1 = r1.set_index(['iso3_o', 'iso3_d','year'])
multi_r2 = r2.set_index(['iso3_o', 'iso3_d','year'])
df = multi_r2.join(multi_r1)
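To see the mismatch concretely, here is a minimal reproduction. The r2 frame and its other_value column are invented for illustration, since the original r2 is not shown; the only assumption carried over is that its year strings lack the '.0' suffix.

```python
import pandas as pd

keys = ["iso3_o", "iso3_d", "year"]

# r1 mirrors the question: year is the string '1995.0'.
r1 = pd.DataFrame([["CAN", "USA", "1995.0", "5918210.506"]],
                  columns=keys + ["ExportFoodAndLiveAnimals"])
# Hypothetical r2: same country pair, but year written as '1995'.
r2 = pd.DataFrame([["CAN", "USA", "1995", "1.0"]],
                  columns=keys + ["other_value"])

# Joining as-is: '1995' != '1995.0', so no key matches and r1's column is NaN.
bad = r2.set_index(keys).join(r1.set_index(keys))
print(bad["ExportFoodAndLiveAnimals"].isna().all())

# After normalising both year columns to int, the keys line up.
r1["year"] = r1["year"].astype(float).astype(int)
r2["year"] = r2["year"].astype(int)
good = r2.set_index(keys).join(r1.set_index(keys))
print(good["ExportFoodAndLiveAnimals"].notna().all())
```

Checking r1.dtypes and r2.dtypes before joining is a quick way to spot this kind of key mismatch.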
