How to convert JSON to a DataFrame when the values are entirely in string format

Posted 2024-10-03 17:19:55


I am trying to convert data from JSON to a DataFrame. My JSON:

{"data":"key=IAfpK, age=58, key=WNVdi, age=64, key=jp9zt, age=47, key=0Sr4C, age=68, key=CGEqo, age=76,
key=IxKVQ, age=79, key=eD221, age=29, key=XZbHV, age=32, key=k1SN5, age=88, key=4SCsU, age=65, key=q3kG6,
age=33, key=MGQpf, age=13, key=Kj6xW, age=14, key=tg2VM, age=30, key=WSnCU, age=24, key=f1Vvz, age=46, }

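I want to create a DataFrame with key and age as columns. I parsed the string, extracted each key and value, built a dict, and converted that to a DataFrame. I know pandas has several built-in functions that make life easier. Is there such a method, or a simpler way, to create the DataFrame?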

import requests
import pandas as pd

r = requests.get('https://coderbyte.com/api/challenges/json/age-counting')
input_str = (r.json()['data'])

input_str_split = input_str.split(',')
age_dict = {}
i = 0
while i < len(input_str_split) - 2:
    key = input_str_split[i].split('=')[1]
    value = input_str_split[i+1].split('=')[1]
    age_dict[key] = value
    i += 2

data = pd.DataFrame(age_dict.items(),columns = ['Item','Age'])

3 answers

Unfortunately, your output is wrong.
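The answer does not say what is wrong. One plausible cause, sketched here on a shortened sample string (an assumption; the real payload is much longer), is the loop bound `len(input_str_split) - 2`, which stops before the final key/age pair. Storing the pairs in a dict would also collapse any repeated keys.

```python
# Shortened sample in the same "key=..., age=..." layout (assumed, not the real payload).
sample = "key=IAfpK, age=58, key=WNVdi, age=64, key=jp9zt, age=47"
parts = sample.split(',')

# Loop bound copied from the question: the last pair is never visited.
age_dict = {}
i = 0
while i < len(parts) - 2:
    key = parts[i].split('=')[1]
    value = parts[i + 1].split('=')[1]
    age_dict[key] = value
    i += 2

print(age_dict)  # jp9zt / 47 is missing

# Stepping over every even index up to the last pair keeps all three rows.
rows = []
for i in range(0, len(parts) - 1, 2):
    rows.append((parts[i].split('=')[1], parts[i + 1].split('=')[1]))

print(rows)
```

A list of tuples is used for the corrected version so that duplicate keys, if any exist in the real data, would each keep their own row.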

Here is one answer:

import re

import pandas as pd
import requests

r = requests.get('https://coderbyte.com/api/challenges/json/age-counting')
input_str = (r.json()['data'])
# Split on ", " so the elements carry no leading space
input_str_split = input_str.split(', ')

# Raw strings avoid the invalid "\=" escape; "=" needs no escaping in a regex
key_pattern = re.compile(r"key=.*")
age_pattern = re.compile(r"age=.*")

# x[4:] drops the 4-character "key=" / "age=" prefix
key_list = [x[4:] for x in input_str_split if key_pattern.match(x)]
age_list = [x[4:] for x in input_str_split if age_pattern.match(x)]

data = pd.DataFrame({'Item': key_list, 'Age': age_list})

The output is:

      Item Age
0    IAfpK  58
1    WNVdi  64
2    jp9zt  47
3    0Sr4C  68
4    CGEqo  76
..     ...  ..
295  lRf1j  13
296  0iJGV  50
297  cFCfU   5
298  J8an1  48
299  dkSlj   5
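As a variation on the regex idea above, a single pattern can capture each key together with the age that follows it. This is only a sketch: the sample string is shortened, and the pattern assumes keys are alphanumeric and ages are plain digits, which matches the visible data but is not guaranteed for the full payload.

```python
import re

# Shortened sample (assumed); the trailing ", " is left in on purpose.
sample = "key=IAfpK, age=58, key=WNVdi, age=64, key=jp9zt, age=47, "

# Each match captures one key and the age right after it.
pair_pattern = re.compile(r"key=(\w+),\s*age=(\d+)")

# findall returns (key, age) tuples; ages are converted to int here.
records = [{"Item": k, "Age": int(a)} for k, a in pair_pattern.findall(sample)]

print(records)
```

Passing `records` to `pd.DataFrame(records)` should then give the Item and Age columns directly.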

Here is a solution you can try.

zip(split_[::2], split_[1::2]) yields

key=IAfpK age=58, key=WNVdi age=64, and so on.

import pandas as pd

# `data` here is the raw string from the JSON "data" field, not a DataFrame
split_ = data.split(",")

df = pd.DataFrame(
    {"Item": i.split("=")[-1], "Age": j.split("=")[-1]}
    for i, j in zip(split_[::2], split_[1::2])
)

     Item Age
0   IAfpK  58
1   WNVdi  64
2   jp9zt  47
3   0Sr4C  68
    ...
    ...
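To see what the slicing and zip produce, here is a small sketch on a two-pair sample string (the sample is an assumption standing in for the real payload). Splitting on "," leaves a leading space on most elements, which is harmless because only the text after "=" is kept.

```python
# Two-pair sample standing in for the real string (assumed).
sample = "key=IAfpK, age=58, key=WNVdi, age=64"

split_ = sample.split(",")

# Even positions hold the key=... elements, odd positions the age=... elements.
pairs = list(zip(split_[::2], split_[1::2]))
print(pairs)

# Keep only the value after "=" from each half of the pair.
rows = [{"Item": i.split("=")[-1], "Age": j.split("=")[-1]} for i, j in pairs]
print(rows)
```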

You can use a list comprehension, then select every second element with data[::2]:

data = [x.split("=")[1] for x in input_str.split(", ")]
df = pd.DataFrame({"age": data[1::2], "key": data[::2]})

print(df)
#     age    key
# 0    58  IAfpK
# 1    64  WNVdi
# 2    47  jp9zt
# 3    68  0Sr4C
# 4    76  CGEqo
# ..   ..    ...
# 295  13  lRf1j
# 296  50  0iJGV
# 297   5  cFCfU
# 298  48  J8an1
# 299   5  dkSlj

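Explanation

  1. Split the data with input_str.split(", ") to isolate each element
  2. Split each element and keep the value after the =: [x.split("=")[1] for x in input_str.split(", ")]
  3. Build the DataFrame by taking every second element: df = pd.DataFrame({"age": data[1::2], "key": data[::2]})

Full illustration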

import requests
import pandas as pd

r = requests.get('https://coderbyte.com/api/challenges/json/age-counting')
input_str = r.json().get('data')

print(input_str.split(", "))
# ['key=IAfpK', 'age=58', 'key=WNVdi', 'age=64', ... 'key=dkSlj', 'age=5']

print([x.split("=") for x in input_str.split(", ")])
# [['key', 'IAfpK'], ['age', '58'], ['key', 'WNVdi'], ['age', '64'],  ... , ['key', 'dkSlj'], ['age', '5']]

print([x.split("=")[1] for x in input_str.split(", ")])
# ['IAfpK', '58', 'WNVdi', '64', ..., 'dkSlj', '5']

data = [x.split("=")[1] for x in input_str.split(", ")]

print(data[1::2])
# ['58', '64', ... , '5']
df = pd.DataFrame({"age": data[1::2], "key": data[::2]})
print(df)
#     age    key
# 0    58  IAfpK
# 1    64  WNVdi
# 2    47  jp9zt
# 3    68  0Sr4C
# 4    76  CGEqo
# ..   ..    ...
# 295  13  lRf1j
# 296  50  0iJGV
# 297   5  cFCfU
# 298  48  J8an1
# 299   5  dkSlj

# [300 rows x 2 columns]
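Note that every value produced by the split is still a string, including the ages. If numeric ages are needed, they can be converted before building the DataFrame. A minimal sketch, using a shortened sample string as a stand-in for the real payload (an assumption):

```python
# Shortened sample standing in for input_str (assumed).
sample = "key=IAfpK, age=58, key=WNVdi, age=64, key=jp9zt, age=47"

data = [x.split("=")[1] for x in sample.split(", ")]

keys = data[::2]
# Convert the age strings to integers so they can be compared or summed.
ages = [int(a) for a in data[1::2]]

print(keys)
print(ages)
```

With a DataFrame already built, `df["age"] = df["age"].astype(int)` should do the same conversion on the column.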
