尝试在dataframe中使用explode不会导致数据更改

2024-06-24 12:16:38 发布

您现在位置:Python中文网/ 问答频道 /正文

我试图通过以下示例使用explode:

#creating a dataframe for example:

d = [{'A':3,'B':[{'id':'001'},{'id':'002'}]},
    {'A':4,'B':[{'id':'003'},{'id':'004'}]},
    {'A':5,'B':[{'id':'005'},{'id':'006'}]},
    {'A':6,'B':[{'id':'007'},{'id':'008'}]}]
df = pd.DataFrame(d)
df
    A   B
0   3   [{'id': '001'}, {'id': '002'}]
1   4   [{'id': '003'}, {'id': '004'}]
2   5   [{'id': '005'}, {'id': '006'}]
3   6   [{'id': '007'}, {'id': '008'}]
#apply an explode to the column B and reset index

df1 = df.explode('B')
df1.reset_index(drop = True, inplace = True)
df1

# now it looks like this
    A    B
0   3   {'id': '001'}
1   3   {'id': '002'}
2   4   {'id': '003'}
3   4   {'id': '004'}
4   5   {'id': '005'}
5   5   {'id': '006'}
6   6   {'id': '007'}
7   6   {'id': '008'}

我的数据看起来很相似:

msaid   tracts
0   159 [{"geoid":"02020000101"},{"geoid":"02020000204...
1   160 [{"geoid":"26091060100"},{"geoid":"26125138100...
2   161 [{"geoid":"01115040300"},{"geoid":"01015001700...
3   163 [{"geoid":"72054580100"},{"geoid":"72054580200...
4   162 [{"geoid":"55135100200"},{"geoid":"55135101200...

问题是,当我应用df.explode('tracts')时,数据帧中没有变化,我不知道为什么。如有任何建议,我们将不胜感激

以下是我对上述后者的代码:

df = pd.read_excel('parse this.xlsx')
df.head()
    msaid   tracts
0   159 [{"geoid":"02020000101"},{"geoid":"02020000204...
1   160 [{"geoid":"26091060100"},{"geoid":"26125138100...
2   161 [{"geoid":"01115040300"},{"geoid":"01015001700...
3   163 [{"geoid":"72054580100"},{"geoid":"72054580200...
4   162 [{"geoid":"55135100200"},{"geoid":"55135101200...

然后

df = df.explode('tracts')
df.head()
    msaid   tracts
0   159 [{"geoid":"02020000101"},{"geoid":"02020000204...
1   160 [{"geoid":"26091060100"},{"geoid":"26125138100...
2   161 [{"geoid":"01115040300"},{"geoid":"01015001700...
3   163 [{"geoid":"72054580100"},{"geoid":"72054580200...
4   162 [{"geoid":"55135100200"},{"geoid":"55135101200...









print(df.head(2).to_dict())
{'msaid': {0: 159, 1: 160}, 'tracts': {0: '[{"geoid":"02020000101"},{"geoid":"02020000204"},{"geoid":"02020000300"},{"geoid":"02020000400"},{"geoid":"02020000500"},{"geoid":"02020000600"},{"geoid":"02020000802"},{"geoid":"02020000901"},{"geoid":"02020000902"},{"geoid":"02020001000"},{"geoid":"02020001500"},{"geoid":"02020001601"},{"geoid":"02020001602"},{"geoid":"02020001701"},{"geoid":"02020001802"},{"geoid":"02020001900"},{"geoid":"02020002000"},{"geoid":"02020002100"},{"geoid":"02020002201"},{"geoid":"02020002400"},{"geoid":"02020002501"},{"geoid":"02020002502"},{"geoid":"02020002601"},{"geoid":"02020002712"},{"geoid":"02020002811"},{"geoid":"02020002812"},{"geoid":"02020002813"},{"geoid":"02122000100"},{"geoid":"02122000300"},{"geoid":"02170001300"},{"geoid":"02170000300"},{"geoid":"02170001100"},{"geoid":"02170000800"},{"geoid":"02261000300"},{"geoid":"02290000400"},{"geoid":"02240000400"},{"geoid":"02170000102"},{"geoid":"02170000402"},{"geoid":"02170000101"},{"geoid":"02170001201"},{"geoid":"02170001001"},{"geoid":"02170000706"},{"geoid":"02170001202"},{"geoid":"02170001004"},{"geoid":"02170000705"},{"geoid":"02170000603"},{"geoid":"02020000102"},{"geoid":"02020000201"},{"geoid":"02020000202"},{"geoid":"02020000203"},{"geoid":"02020000701"},{"geoid":"02020000702"},{"geoid":"02020000703"},{"geoid":"02020000801"},{"geoid":"02020001100"},{"geoid":"02020001200"},{"geoid":"02020001300"},{"geoid":"02020001400"},{"geoid":"02020001702"},{"geoid":"02020001731"},{"geoid":"02020001732"},{"geoid":"02020001801"},{"geoid":"02020002202"},{"geoid":"02020002301"},{"geoid":"02020002302"},{"geoid":"02020002303"},{"geoid":"02020002602"},{"geoid":"02020002603"},{"geoid":"02020002702"},{"geoid":"02020002711"},{"geoid":"02020002821"},{"geoid":"02020002822"},{"geoid":"02020002823"},{"geoid":"02020002900"},{"geoid":"02068000100"},{"geoid":"02170000200"},{"geoid":"02170000900"},{"geoid":"02261000100"},{"geoid":"02170000401"},{"geoid":"02170000502"},{"geoid":"02170000501"},{"geoid":"02170000604"},{"geoid":"02170000601"},{"geoid":"02170001003"},{"geoid":"02170000703"},{"geoid":"02170000701"}]', 1: '[{"geoid":"26091060100"},{"geoid":"26125138100"},{"geoid":"26163588300"},{"geoid":"26163588100"},{"geoid":"26163561900"},{"geoid":"26163589400"},{"geoid":"26115830600"},{"geoid":"26093744800"},{"geoid":"26093743800"},{"geoid":"26093732100"},{"geoid":"26093743700"},{"geoid":"26161400300"},{"geoid":"26161400400"},{"geoid":"26161400500"},{"geoid":"26161400600"},{"geoid":"26161402200"},{"geoid":"26161402300"},{"geoid":"26161403600"},{"geoid":"26161402500"},{"geoid":"26161403100"},{"geoid":"26161403200"},{"geoid":"26161455000"},{"geoid":"26161403300"},{"geoid":"26161404300"},{"geoid":"26161404400"},{"geoid":"26161404500"},{"geoid":"26161404600"},{"geoid":"26161405500"},{"geoid":"26161406000"},{"geoid":"26161414200"},{"geoid":"26161416000"},{"geoid":"26161432000"},{"geoid":"26161445000"},{"geoid":"26161453000"},{"geoid":"26161448000"},{"geoid":"26161456000"},{"geoid":"26161405600"},{"geoid":"26161414000"},{"geoid":"26161414500"},{"geoid":"26161461000"},{"geoid":"26161407600"},{"geoid":"26161454000"},{"geoid":"26161410200"},{"geoid":"26161410800"},{"geoid":"26161410900"},{"geoid":"26161411000"},{"geoid":"26161411100"},{"geoid":"26161416200"},{"geoid":"26161410500"},{"geoid":"26161412100"},{"geoid":"26161412300"},{"geoid":"26161411700"},{"geoid":"26161414700"},{"geoid":"26161423400"},{"geoid":"26161415800"},{"geoid":"26161421100"},{"geoid":"26093733601"},{"geoid":"26161984000"},{"geoid":"26161412000"},{"geoid":"26161411900"},{"geoid":"26161446200"},{"geoid":"26163564504"},{"geoid":"26091060301"},{"geoid":"26091060302"},{"geoid":"26163561700"},{"geoid":"26163588200"},{"geoid":"26115830700"},{"geoid":"26075006801"},{"geoid":"26093744900"},{"geoid":"26093743900"},{"geoid":"26093744600"},{"geoid":"26161400100"},{"geoid":"26161400200"},{"geoid":"26161444000"},{"geoid":"26161400700"},{"geoid":"26161400800"},{"geoid":"26161402100"},{"geoid":"26161402600"},{"geoid":"26161402700"},{"geoid":"26161407400"},{"geoid":"26161466000"},{"geoid":"26161403800"},{"geoid":"26161403400"},{"geoid":"26161403500"},{"geoid":"26161404100"},{"geoid":"26161404200"},{"geoid":"26161405100"},{"geoid":"26161405200"},{"geoid":"26161405300"},{"geoid":"26161405400"},{"geoid":"26161415200"},{"geoid":"26161422200"},{"geoid":"26161407000"},{"geoid":"26161420000"},{"geoid":"26161420200"},{"geoid":"26161426000"},{"geoid":"26161423600"},{"geoid":"26161431000"},{"geoid":"26161421900"},{"geoid":"26161414300"},{"geoid":"26161465000"},{"geoid":"26161422900"},{"geoid":"26161464000"},{"geoid":"26161410300"},{"geoid":"26161410400"},{"geoid":"26161410600"},{"geoid":"26161410700"},{"geoid":"26161415400"},{"geoid":"26161411200"},{"geoid":"26161412700"},{"geoid":"26161413200"},{"geoid":"26161410100"},{"geoid":"26161414900"},{"geoid":"26161413000"},{"geoid":"26161412600"},{"geoid":"26161415600"},{"geoid":"26161425000"},{"geoid":"26161447000"},{"geoid":"26161413403"},{"geoid":"26161446400"},{"geoid":"26161413401"},{"geoid":"26161413402"},{"geoid":"26075006804"},{"geoid":"26075006803"},{"geoid":"26163564402"},{"geoid":"26163564501"},{"geoid":"26163561200"},{"geoid":"26163564401"},{"geoid":"26091062400"}]'}}

print(type(df['tracts'][0])) 
<class 'str'>

您可以下载原始数据here


Tags: to数据idtruedfindexthishead
2条回答

使用ast模块将字符串转换为列表对象,然后使用explode

Ex:

import ast

data = [{'A':3,'B':"[{'id':'001'},{'id':'002'}]"},
    {'A':4,'B':"[{'id':'003'},{'id':'004'}]"},
    {'A':5,'B':"[{'id':'005'},{'id':'006'}]"},
    {'A':6,'B':"[{'id':'007'},{'id':'008'}]"}]

df = pd.DataFrame(data)
df["B"] = df['B'].apply(ast.literal_eval)
df1 = df.explode('B')
df1.reset_index(drop = True, inplace = True)
print(df1)

输出:

   A              B
0  3  {'id': '001'}
1  3  {'id': '002'}
2  4  {'id': '003'}
3  4  {'id': '004'}
4  5  {'id': '005'}
5  5  {'id': '006'}
6  6  {'id': '007'}
7  6  {'id': '008'}

您需要将类型更改为list,然后可以使用explode

df=df.assign(**df['tracts'].apply(eval)).explode('tracts')

相关问题 更多 >