How do I convert a JSON SList to a DataFrame?

Posted 2024-09-28 22:23:23


a = ['{"type": "book",', 
     '"title": "sometitle",', 
     '"author": [{"name": "somename"}],', 
     '"year": "2000",', 
     '"identifier": [{"type": "ISBN", "id": "1234567890"}],', 
     '"publisher": "somepublisher"}', '',
     '{"type": "book",',
     '"title": "sometitle2",', 
     '"author": [{"name": "somename2"}],', 
     '"year": "2001",', 
     '"identifier": [{"type": "ISBN", "id": "1234567890"}],', 
     '"publisher": "somepublisher"}', '']

I have this complicated SList, and I would eventually like to get it into a tidy DataFrame.

I have tried a number of approaches, for example:

i = iter(a)
b = dict(zip(i, i))

Unfortunately, this produces a dictionary that looks even worse:

{'{"type": "book",':
...

Where I had a list before, I now have a dictionary.

I also tried

pd.json_normalize(a)

but this raises the error AttributeError: 'str' object has no attribute 'values'

I also tried

r = json.dumps(a.l)
loaded_r = json.loads(r)
print(loaded_r)

but this just gives back a list of strings:

['{"type": "book",',
...

In the end, I would like a pandas DataFrame like this:

type   title        author      year   ...
book   sometitle    somename    2000   ...
book   sometitle2   somename2   2001

Clearly I have not yet got the data into a shape that I can pass to a function. Every time I try, the function screams at me.


1 answer
User
#1 · Posted 2024-09-28 22:23:23
a = ['{"type": "book",', 
     '"title": "sometitle",', 
     '"author": [{"name": "somename"}],', 
     '"year": "2000",', 
     '"identifier": [{"type": "ISBN", "id": "1234567890"}],', 
     '"publisher": "somepublisher"}', '',
     '{"type": "book",', 
     '"title": "sometitle2",', 
     '"author": [{"name": "somename2"}],', 
     '"year": "2001",', 
     '"identifier": [{"type": "ISBN", "id": "1234567890"}],', 
     '"publisher": "somepublisher"}', '']

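import json
import pandas as pd

# Turn each empty-string separator into a comma, join the fragments,
# strip the trailing comma, and wrap the result in [] to get one JSON array.
b = "[%s]" % ''.join([',' if i == '' else i for i in a ]).strip(',')
data = json.loads(b)  # list of dicts, one per book
df = pd.DataFrame(data)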

print(df)

   type       title                   author  year  \
0  book   sometitle   [{'name': 'somename'}]  2000   
1  book  sometitle2  [{'name': 'somename2'}]  2001   

                               identifier      publisher  
0  [{'type': 'ISBN', 'id': '1234567890'}]  somepublisher  
1  [{'type': 'ISBN', 'id': '1234567890'}]  somepublisher
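
The output above still holds lists of dicts in the author and identifier columns, whereas the question asks for plain values such as somename. Below is a minimal sketch of one way to flatten them. It assumes every record has exactly one author and one identifier, which is true of the sample data but not guaranteed in general. The column name isbn and the variable names records, rows, and flat are made up for illustration.

```python
import json
import pandas as pd

# Same shape as the SList in the question: JSON text split across lines,
# with an empty string after each object.
a = ['{"type": "book",',
     '"title": "sometitle",',
     '"author": [{"name": "somename"}],',
     '"year": "2000",',
     '"identifier": [{"type": "ISBN", "id": "1234567890"}],',
     '"publisher": "somepublisher"}', '',
     '{"type": "book",',
     '"title": "sometitle2",',
     '"author": [{"name": "somename2"}],',
     '"year": "2001",',
     '"identifier": [{"type": "ISBN", "id": "1234567890"}],',
     '"publisher": "somepublisher"}', '']

# Rebuild a single JSON array, as in the answer above.
b = "[%s]" % "".join("," if line == "" else line for line in a).strip(",")
records = json.loads(b)

# Assumption: one author and one identifier per record, so taking
# element [0] loses nothing. "isbn" is a hypothetical column name.
rows = [
    {
        "type": r["type"],
        "title": r["title"],
        "author": r["author"][0]["name"],
        "year": r["year"],
        "isbn": r["identifier"][0]["id"],
        "publisher": r["publisher"],
    }
    for r in records
]

flat = pd.DataFrame(rows)
print(flat)
```

If a record can have several authors or identifiers, indexing with [0] silently drops the rest. In that case, joining the names into one string or writing one row per author would be safer.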
