DataFrame: How do I split a list of dictionaries in each row into separate columns?

Posted 2024-09-29 18:32:34


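The DataFrame below has two columns. One of them holds a JSON string: a list of dictionaries, each of which contains a nested list of dictionaries under the expressions key. I want to split that column into several columns.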

import pandas as pd

USERNAME = ['root', 'user1', 'user2','user3']
test_data = '[{"conjunction":"and","expressions":[{"_actualOperator":"contains","_actualValue":"LBD","attr":"displayName","op":"contains","value":"LBD"}],"name":"test_Event","editable":true}]'
test_data2 = '[{"conjunction":"and","expressions":[{"_actualOperator":"not_contains","_actualValue":"AAA","attr":"Event","op":"contains","value":"LBD"}],"name":"test_Event","editable":true}]'
test_data3 = '[{"conjunction":"and","expressions":[{"_actualOperator":"exclude","_actualValue":"BBB","attr":"Event","op":"contains","value":"LBD"}],"name":"test_Event","editable":true}]'
test_data4 = '[{"conjunction":"and","expressions":[{"_actualOperator":"adding","_actualValue":"CASA","attr":"displayName","op":"contains","value":"LBD"}],"name":"test_Event","editable":true}]'
VALUE_STRING = [test_data, test_data2, test_data3, test_data4]

data = {'USERNAME': ['root', 'user1', 'user2','user3'], 'VALUE_STRING' : VALUE_STRING}
df = pd.DataFrame(data)
df

USERNAME    VALUE_STRING
root        [{"conjunction":"and","expressions":[{"_actual...
user1       [{"conjunction":"and","expressions":[{"_actual...
user2       [{"conjunction":"and","expressions":[{"_actual...
user3       [{"conjunction":"and","expressions":[{"_actual...

I expect a result like this:

df_expected = pd.DataFrame({'USERNAME': ['root', 'user1', 'user2','user3'], 
                            '_actualOperator':['contains','not_contains','exclude','adding'],
                           '_actualValue':['LBD','AAA','BBB','CASA'],
                           'attr':['displayName','Event','Event','displayName']})
df_expected

USERNAME    _actualOperator     _actualValue    attr
root        contains            LBD             displayName
user1       not_contains        AAA             Event
user2       exclude             BBB             Event
user3       adding              CASA            displayName

1 answer
User
#1 · Posted 2024-09-29 18:32:34

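It looks like the VALUE_STRING column contains JSON data. In that case we can parse each value with the loads method of the json module, extract from each row the dictionary stored under the expressions key, build a new DataFrame from those dictionaries, and join it with the USERNAME column: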

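import json

# Parse each JSON string, then take the first outer dict,
# its 'expressions' list, and the first dict in that list
s = df['VALUE_STRING'].map(json.loads)\
     .str[0].str['expressions'].str[0]

# Expand the dicts into columns; reusing s.index lets join() align the rows
exp = pd.DataFrame([*s], index=s.index)
df_out = df[['USERNAME']].join(exp).drop(['op', 'value'], axis=1)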
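
For readers who find the chained .str accessors hard to follow, the same steps can be sketched with an explicit helper function. This is a minimal sketch, not the answer's own code: the two-row `sample` frame and the `first_expression` helper are hypothetical names introduced for illustration, and it assumes every row has at least one outer dictionary with at least one expression.

```python
import json
import pandas as pd

# Hypothetical two-row sample in the same shape as VALUE_STRING.
raw = [
    '[{"conjunction":"and","expressions":[{"_actualOperator":"contains","_actualValue":"LBD","attr":"displayName","op":"contains","value":"LBD"}],"name":"test_Event","editable":true}]',
    '[{"conjunction":"and","expressions":[{"_actualOperator":"exclude","_actualValue":"BBB","attr":"Event","op":"contains","value":"LBD"}],"name":"test_Event","editable":true}]',
]
sample = pd.DataFrame({"USERNAME": ["root", "user2"], "VALUE_STRING": raw})


def first_expression(text):
    """Parse one JSON string and return its first expression dict (hypothetical helper)."""
    parsed = json.loads(text)            # a list holding the outer dict
    return parsed[0]["expressions"][0]   # first dict of the nested list


# One dict per row, then one column per dict key.
records = sample["VALUE_STRING"].map(first_expression)
expanded = pd.DataFrame(records.tolist(), index=sample.index)

# Keep only the wanted columns instead of dropping 'op' and 'value'.
wanted = ["_actualOperator", "_actualValue", "attr"]
result = sample[["USERNAME"]].join(expanded[wanted])
print(result)
```

Selecting the wanted columns by name, rather than dropping the unwanted ones, is a design choice that keeps the output stable if further keys show up in the JSON.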

An alternative using the pandas json_normalize method:

s = df['VALUE_STRING'].map(json.loads).str[0]
exp = pd.json_normalize(s, 'expressions')
df_out = df[['USERNAME']].join(exp).drop(['op', 'value'], axis=1)

print(df_out)

  USERNAME _actualOperator _actualValue         attr
0     root        contains          LBD  displayName
1    user1    not_contains          AAA        Event
2    user2         exclude          BBB        Event
3    user3          adding         CASA  displayName
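
Both approaches rely on each row holding exactly one expression, so that the extracted rows line up with the index of df. If a row could hold several expressions (an assumption that goes beyond the data in the question), one possible sketch is to explode the expression lists first, so that the repeated index labels carry USERNAME along in the join. The `sample` frame below is hypothetical.

```python
import json
import pandas as pd

# Hypothetical data: the first row carries two expressions, the second carries one.
raw = [
    '[{"expressions":[{"_actualOperator":"contains","_actualValue":"LBD","attr":"displayName"},'
    '{"_actualOperator":"exclude","_actualValue":"BBB","attr":"Event"}]}]',
    '[{"expressions":[{"_actualOperator":"adding","_actualValue":"CASA","attr":"displayName"}]}]',
]
sample = pd.DataFrame({"USERNAME": ["root", "user1"], "VALUE_STRING": raw})

# One list of expression dicts per row.
expr_lists = sample["VALUE_STRING"].map(lambda text: json.loads(text)[0]["expressions"])

# explode() emits one row per expression and repeats the original index label.
exploded = expr_lists.explode()
expanded = pd.DataFrame(exploded.tolist(), index=exploded.index)

# Joining on the repeated index duplicates USERNAME once per expression.
result = sample[["USERNAME"]].join(expanded).reset_index(drop=True)
print(result)
```

With one expression per row, as in the question, this gives the same rows as the answers above.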
