将Python JSON字符串转换为表格格式

2024-09-28 18:51:36 发布

您现在位置:Python中文网/ 问答频道 /正文

文章底部的脚本帮助我从JSON获取数据并将其重新定向到如下所示的信息列表中

我需要将其重新构造为表格格式,以便进行CSV转换,或者更好的做法是在将其重新构造为X/Y轴后进行MySQL插入,但是当前的JSON输出有重复出现的标题,这些标题在转换为表格时是不需要的,所以我实际上只是试图去掉“first_name”和“last_name”的响应值,并将其作为列标题,而值以行表示

对于Python还不是非常精通,但我觉得这是处理处理过程最明智的方法,任何帮助都将不胜感激

id , 9999999
first_name , Joe
last_name , Schmo
group_id , 99999
active , True
employee_number , 0
salaried , False
exempt , False
username , jschmo
email , abc123@gmail.com
email_verified , False
payroll_id ,
mobile_number , 9999999999
hire_date , 0000-00-00
term_date , 0000-00-00
last_modified , 2021-02-08T01:59:42+00:00
last_active , 2021-02-10T22:45:34+00:00
created , 2021-02-04T20:07:38+00:00
client_url , abcefg
company_name , Some Company Name
profile_image_url , https://www.gravatar.com/avatar/000000000000000000000000000
display_name ,
pronouns ,
pto_balances,
1400406 , 0
submitted_to , 2021-02-08
approved_to , 2021-02-08
manager_of_group_ids , []
require_password_change , False
pay_rate , 0
pay_interval , hour
permissions,
admin , False
mobile , True
status_box , False
reports , False
manage_timesheets , False
manage_authorization , False
manage_users , False
manage_my_timesheets , False
manage_jobcodes , False
pin_login , True
approve_timesheets , False
manage_schedules , False
external_access , False
manage_my_schedule , False
manage_company_schedules , False
view_company_schedules , False
view_group_schedules , False
manage_no_schedules , False
view_my_schedules , False
view_projects , False
manage_projects , False
time_tracking , True
customfields ,
id , 1111111
first_name , Jane
last_name , Doe
group_id , 11111
active , True
employee_number , 0
salaried , False
exempt , False
username , jdoe
email , xyz789@gmail.com
email_verified , False
payroll_id ,
mobile_number , 111111111
hire_date , 0000-00-00
term_date , 0000-00-00
last_modified , 2021-02-08T01:59:42+00:00
last_active , 2021-02-10T22:45:34+00:00
created , 2021-02-04T20:07:38+00:00
client_url , abcefg
company_name , Some Company Name
profile_image_url , https://www.gravatar.com/avatar/000000000000000000000000000
display_name ,
pronouns ,
pto_balances,
1400406 , 0
submitted_to , 2021-02-08
approved_to , 2021-02-08
manager_of_group_ids , []
require_password_change , False
pay_rate , 0
pay_interval , hour
permissions,
admin , False
mobile , True
status_box , False
reports , False
manage_timesheets , False
manage_authorization , False
manage_users , False
manage_my_timesheets , False
manage_jobcodes , False
pin_login , True
approve_timesheets , False
manage_schedules , False
external_access , False
manage_my_schedule , False
manage_company_schedules , False
view_company_schedules , False
view_group_schedules , False
manage_no_schedules , False
view_my_schedules , False
view_projects , False
manage_projects , False
time_tracking , True
customfields ,
import requests
import json
import csv



url = "https://rest.apiofsomesort.com/api/v1/users"

payload = ""
headers = {
   'Authorization': "Bearer                ",
  }

response = requests.request("GET", url, data=payload, headers=headers)

#print(response.text)

# Saves response to JSON file
emp_data = response.json()
with open('emp_data.json', 'w') as f:
    json.dump(emp_data, f)

# Loads from JSON file while iterating recursively through nested keys,values
with open('emp_data.json','r') as string:
    my_dict=json.load(string)
string.close()
def iterate_emp_data(my_dict):
    for k,v in my_dict.items():
        if(isinstance(v,dict)):
            print(k+",")
            iterate_emp_data(v)
            continue
        print(k+" , "+str(v))
iterate_emp_data(my_dict)

Tags: nameviewidfalsetrueurldatamanage
3条回答

这是问题的一部分

import requests
import json
import time




url = "https://rest.apiofsomesort.com/api/v1/users"

payload = ""
headers = {
   'Authorization': "Bearer ",
  }

response = requests.request("GET", url, data=payload, headers=headers)

#print(response.text)

# Saves response to JSON file
emp_data = response.json()
with open('emp_data.json', 'w') as f:
    json.dump(emp_data, f)

time.sleep(2)
data = json.load(open('emp_data.json'))
jtopy=json.dumps(data) #json.dumps take a dictionary as input and returns a string as output.
dict_json=json.loads(jtopy) # json.loads take a string as input and returns a dictionary as output.
print(dict_json["results"]["users"]["9999999"])

上面的代码简化了从JSON API响应中仅提取一名员工的结果

因为JSON结构的员工布局如下>

{"results": {"users": {"9999999": {"id": 9999999, "first_name": "Joe"} "9999998": {"id": 9999998, "first_name": "Destiny"}

问题是,没有为“名字”、“姓氏”、“用户名”等已知的“键”定义字典。。。但是,由于用户id值也被用作每个员工记录的唯一键,因此会出现如下错误

KeyError: 'first_name'

如果我试着做这样的事情

print(dict_json["results"]["users"]["first_name"])

因为JSON的结构是这样的

{"results": {"users": {"9999999": {"id":9999999, "first_name": "Joe"} "9999998": {"id": 9999998, "first_name": 

我确实成功地解析了单个员工名字的结果

print(my_dict["results"]["users"]["9999999"]["first_name"])

或者我可以从API响应中提取所有信息(所有员工、所有细节等) 使用

print(my_dict["results"]["users"]

我曾尝试迭代API响应,希望重新定向数据能够更好地提取每个唯一员工id键{"9999999":后的键值,但没有任何机会将大量不易转换为表格格式的CSV或MySQL插入信息浇灌掉

为了进一步了解JSON的API/示例,下面是API响应的输出示例

{
  "results": {
    "users": {
      "933849": {
        "id": 933849,
        "first_name": "Mary",
        "last_name": "Samsonite",
        "group_id": 0,
        "active": true,
        "employee_number": 0,
        "salaried": false,
        "exempt": false,
        "username": "admin",
        "email": "admin@example.com",
        "email_verified": false,
        "payroll_id": "",
        "mobile_number": "2087231456",
        "hire_date": "0000-00-00",
        "term_date": "0000-00-00",
        "last_modified": "2018-03-28T17:24:20+00:00",
        "last_active": "",
        "created": "2018-03-27T16:13:34+00:00",
        "client_url": "api_sample_output",
        "company_name": "API Sample Output Company",
        "profile_image_url": "https:\/\/www.gravatar.com\/avatar\/e64c7d89f26bd1972efa854d13d7dd61",
        "display_name": "",
        "pronouns": "",
        "pto_balances": {
          "2624351": 0,
          "2624353": 0,
          "2624355": 0
        },
        "submitted_to": "2000-01-01",
        "approved_to": "2000-01-01",
        "manager_of_group_ids": [ ],
        "require_password_change": false,
        "pay_rate": 0,
        "pay_interval": "hour",
        "permissions": {
          "admin": true,
          "mobile": true,
          "status_box": false,
          "reports": false,
          "manage_timesheets": false,
          "manage_authorization": false,
          "manage_users": false,
          "manage_my_timesheets": false,
          "manage_jobcodes": false,
          "pin_login": false,
          "approve_timesheets": false,
          "manage_schedules": false,
          "external_access": false,
          "manage_my_schedule": false,
          "manage_company_schedules": false,
          "view_company_schedules": false,
          "view_group_schedules": false,
          "manage_no_schedules": false,
          "view_my_schedules": false,
          "time_tracking": false
        },
        "customfields": ""
      },
      "933845": {
        "id": 933845,
        "first_name": "Bob",
        "last_name": "Smith",
        "group_id": 64965,
        "active": true,
        "employee_number": 0,
        "salaried": false,
        "exempt": false,
        "username": "bobsmith",
        "email": "",
        "email_verified": false,
        "payroll_id": "",
        "hire_date": "0000-00-00",
        "term_date": "0000-00-00",
        "last_modified": "2018-03-27T16:13:33+00:00",
        "last_active": "2018-03-28T20:16:39+00:00",
        "created": "2018-03-27T16:13:33+00:00",
        "client_url": "api_sample_output",
        "company_name": "API Sample Output Company",
        "profile_image_url": "",
        "display_name": "",
        "pronouns": "",
        "mobile_number": "",
        "pto_balances": {
          "2624351": 0,
          "2624353": 0,
          "2624355": 0
        },
        "submitted_to": "2000-01-01",
        "approved_to": "2000-01-01",
        "manager_of_group_ids": [ ],
        "require_password_change": false,
        "pay_rate": 0,
        "pay_interval": "hour",
        "permissions": {
          "admin": false,
          "mobile": true,
          "status_box": false,
          "reports": false,
          "manage_timesheets": false,
          "manage_authorization": false,
          "manage_users": false,
          "manage_my_timesheets": false,
          "manage_jobcodes": false,
          "pin_login": false,
          "approve_timesheets": false,
          "manage_schedules": false,
          "external_access": false,
          "manage_my_schedule": false,
          "manage_company_schedules": false,
          "view_company_schedules": false,
          "view_group_schedules": false,
          "manage_no_schedules": false,
          "view_my_schedules": false,
          "time_tracking": false
        },
        "customfields": ""
      }
    }
  },
  "more": false,
  "supplemental_data": {
    "jobcodes": {
      "2624351": {
        "id": 2624351,
        "parent_id": 0,
        "assigned_to_all": true,
        "billable": false,
        "active": true,
        "type": "pto",
        "has_children": false,
        "billable_rate": 0,
        "short_code": "",
        "name": "Sick",
        "last_modified": "2018-03-27T16:13:28+00:00",
        "created": "2018-03-27T16:13:28+00:00",
        "filtered_customfielditems": "",
        "required_customfields": [ ],
        "locations": [ ]
      },
      "2624353": {
        "id": 2624353,
        "parent_id": 0,
        "assigned_to_all": true,
        "billable": false,
        "active": true,
        "type": "pto",
        "has_children": false,
        "billable_rate": 0,
        "short_code": "",
        "name": "Vacation",
        "last_modified": "2018-03-27T16:13:28+00:00",
        "created": "2018-03-27T16:13:28+00:00",
        "filtered_customfielditems": "",
        "required_customfields": [ ],
        "locations": [ ]
      },
      "2624355": {
        "id": 2624355,
        "parent_id": 0,
        "assigned_to_all": true,
        "billable": false,
        "active": true,
        "type": "pto",
        "has_children": false,
        "billable_rate": 0,
        "short_code": "",
        "name": "Holiday",
        "last_modified": "2018-03-27T16:13:28+00:00",
        "created": "2018-03-27T16:13:28+00:00",
        "filtered_customfielditems": "",
        "required_customfields": [ ],
        "locations": [ ]
      }
    },
    "groups": {
      "64965": {
        "id": 64965,
        "active": true,
        "name": "Construction",
        "last_modified": "2018-03-27T16:13:30+00:00",
        "created": "2018-03-27T16:13:29+00:00",
        "manager_ids": [
          "933833"
        ]
      }
    }
  }
}

我建议将字典放入熊猫数据框中

或者,如果您希望汇集多行,则不使用当前函数: 数据=dict()

def iterate_emp_data(my_dict):
    for k,v in my_dict.items():
        if(isinstance(v,dict)):
            Data[k.strip()].append(v)

然后把它放在一个数据框中

相关问题 更多 >