Normalizing a Python dictionary to fit a database table structure

import pandas as pd
# On pandas 1.0 and later the same function is also available as pd.json_normalize.
from pandas.io.json import json_normalize

pd.set_option('display.max_columns', 7)

survey_resp = {'responses': [
    {'values': {'QID1_1': 4,
                'QID1_2': 13,
                'QID2': 1,
                '_recordId': 'R_3L6WGfdsfsdNBR',
                'status': 1,
                'userLanguage': 'EN'}},
    {'values': {'QID2': 4,
                'QID3': 1,
                '_recordId': 'R_Zm0ZvwqeqwetSBBT',
                'status': 0,
                'userLanguage': 'EN'}},
    {'values': {'QID4_TEXT': 'test comment a',
                'QID5_TEXT': 'test comment b',
                'QID6_TEXT': 'test comment c',
                'QID7_TEXT': 'test comment d',
                '_recordId': 'R_1d01j456hgfgh26oVe',
                'status': 0,
                'userLanguage': 'EN'}}]}

   values.QID1_1  values.QID1_2  values.QID2  ...      values._recordId  values.status  values.userLanguage
0            4.0           13.0          1.0  ...      R_3L6WGfdsfsdNBR              1                   EN
1            NaN            NaN          4.0  ...    R_Zm0ZvwqeqwetSBBT              0                   EN
2            NaN            NaN          NaN  ...  R_1d01j456hgfgh26oVe              0                   EN

   values.question  values.question_answer  ...      values._recordId  values.status  values.userLanguage
0           QID1_1                       4  ...      R_3L6WGfdsfsdNBR              1                   EN
1           QID1_2                      13  ...      R_3L6WGfdsfsdNBR              1                   EN
2             QID2                       1  ...      R_3L6WGfdsfsdNBR              1                   EN
3             QID2                       4  ...    R_Zm0ZvwqeqwetSBBT              0                   EN
4             QID3                       1  ...    R_Zm0ZvwqeqwetSBBT              0                   EN
5        QID4_TEXT        'test comment a'  ...  R_1d01j456hgfgh26oVe              0                   EN
6        QID5_TEXT        'test comment b'  ...  R_1d01j456hgfgh26oVe              0                   EN
7        QID6_TEXT        'test comment c'  ...  R_1d01j456hgfgh26oVe              0                   EN
8        QID7_TEXT        'test comment d'  ...  R_1d01j456hgfgh26oVe              0                   EN

1 answer

User
#1 · Posted on 2024-09-28 21:11:40

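Assuming your dictionary structure is static with respect to the keys 'responses', 'values', '_recordId', 'status', and 'userLanguage', those keys can be hardcoded.
Extract the 'values' value from each item under the 'responses' key; pop the common items from each of those results; append the common item values to each of the remaining items; store the rows in a container that can be fed to pandas.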
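import operator

d = survey_resp
values = operator.itemgetter('values')
r = d['responses']
rows = []
for thing in map(values, r):
    # Pop the fields shared by every row of this response (note: this mutates survey_resp).
    (r_id, status, userLang) = thing.pop('_recordId'), thing.pop('status'), thing.pop('userLanguage')
    # Each remaining (question, answer) pair becomes one row.
    for item in thing.items():
        row = item + (r_id, status, userLang)
        # print(row)
        rows.append(row)
    # print('****************')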
In [10]: rows
Out[10]:
[('QID1_1', 4, 'R_3L6WGfdsfsdNBR', 1, 'EN'),
 ('QID1_2', 13, 'R_3L6WGfdsfsdNBR', 1, 'EN'),
 ('QID2', 1, 'R_3L6WGfdsfsdNBR', 1, 'EN'),
 ('QID2', 4, 'R_Zm0ZvwqeqwetSBBT', 0, 'EN'),
 ('QID3', 1, 'R_Zm0ZvwqeqwetSBBT', 0, 'EN'),
 ('QID4_TEXT', 'test comment a', 'R_1d01j456hgfgh26oVe', 0, 'EN'),
 ('QID5_TEXT', 'test comment b', 'R_1d01j456hgfgh26oVe', 0, 'EN'),
 ('QID6_TEXT', 'test comment c', 'R_1d01j456hgfgh26oVe', 0, 'EN'),
 ('QID7_TEXT', 'test comment d', 'R_1d01j456hgfgh26oVe', 0, 'EN')]

In [17]: columns = ['question','question_answer','recordId','status','userLanguage']

In [18]: df = pd.DataFrame(rows, columns=columns)

In [19]: print(df)
    question  question_answer              recordId  status  userLanguage
0     QID1_1                4      R_3L6WGfdsfsdNBR       1            EN
1     QID1_2               13      R_3L6WGfdsfsdNBR       1            EN
2       QID2                1      R_3L6WGfdsfsdNBR       1            EN
3       QID2                4    R_Zm0ZvwqeqwetSBBT       0            EN
4       QID3                1    R_Zm0ZvwqeqwetSBBT       0            EN
5  QID4_TEXT   test comment a  R_1d01j456hgfgh26oVe       0            EN
6  QID5_TEXT   test comment b  R_1d01j456hgfgh26oVe       0            EN
7  QID6_TEXT   test comment c  R_1d01j456hgfgh26oVe       0            EN
8  QID7_TEXT   test comment d  R_1d01j456hgfgh26oVe       0            EN
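One thing to watch with the loop above: pop() removes the common keys from survey_resp itself, so running it a second time on the same dictionary raises a KeyError. If that matters, the same steps can be sketched without mutating the input. This is only an illustration under the same static-keys assumption; the function name flatten_responses and its common_keys parameter are made up for this example.

```python
def flatten_responses(survey, common_keys=('_recordId', 'status', 'userLanguage')):
    """Return one (question, answer, *common values) tuple per answered question."""
    rows = []
    for response in survey['responses']:
        values = response['values']
        # Read the shared fields by key instead of popping them,
        # so the caller's dictionary is left untouched.
        common = tuple(values[key] for key in common_keys)
        for question, answer in values.items():
            if question in common_keys:
                continue  # shared fields are appended to each row, not emitted as rows
            rows.append((question, answer) + common)
    return rows


survey_resp = {'responses': [
    {'values': {'QID1_1': 4, 'QID1_2': 13, 'QID2': 1,
                '_recordId': 'R_3L6WGfdsfsdNBR', 'status': 1, 'userLanguage': 'EN'}},
    {'values': {'QID2': 4, 'QID3': 1,
                '_recordId': 'R_Zm0ZvwqeqwetSBBT', 'status': 0, 'userLanguage': 'EN'}},
    {'values': {'QID4_TEXT': 'test comment a', 'QID5_TEXT': 'test comment b',
                'QID6_TEXT': 'test comment c', 'QID7_TEXT': 'test comment d',
                '_recordId': 'R_1d01j456hgfgh26oVe', 'status': 0, 'userLanguage': 'EN'}}]}

rows = flatten_responses(survey_resp)
print(rows[0])    # ('QID1_1', 4, 'R_3L6WGfdsfsdNBR', 1, 'EN')
print(len(rows))  # 9
```

The resulting rows list can be passed to pd.DataFrame(rows, columns=columns) exactly as in the session above, and the function can be called repeatedly on the same input.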
As for efficiency, it is more efficient than something that doesn't work.
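If you would rather stay inside pandas, a different route (not the one used in this answer) is to normalize to the wide frame first and then unpivot it with DataFrame.melt. The sketch below assumes pandas 1.0 or later for pd.json_normalize; the names id_cols, wide, and long_df are only illustrative.

```python
import pandas as pd

survey_resp = {'responses': [
    {'values': {'QID1_1': 4, 'QID1_2': 13, 'QID2': 1,
                '_recordId': 'R_3L6WGfdsfsdNBR', 'status': 1, 'userLanguage': 'EN'}},
    {'values': {'QID2': 4, 'QID3': 1,
                '_recordId': 'R_Zm0ZvwqeqwetSBBT', 'status': 0, 'userLanguage': 'EN'}},
    {'values': {'QID4_TEXT': 'test comment a', 'QID5_TEXT': 'test comment b',
                'QID6_TEXT': 'test comment c', 'QID7_TEXT': 'test comment d',
                '_recordId': 'R_1d01j456hgfgh26oVe', 'status': 0, 'userLanguage': 'EN'}}]}

# Columns that identify a response and should be repeated on every row.
id_cols = ['_recordId', 'status', 'userLanguage']

# One wide row per response; strip the 'values.' prefix that json_normalize adds.
wide = pd.json_normalize(survey_resp['responses'])
wide = wide.rename(columns=lambda name: name.split('.', 1)[-1])

# Unpivot the question columns into (question, question_answer) pairs,
# then drop the questions a respondent did not answer.
long_df = (
    wide.melt(id_vars=id_cols, var_name='question', value_name='question_answer')
        .dropna(subset=['question_answer'])
        .reset_index(drop=True)
)
print(long_df)
```

One trade-off to be aware of: because unanswered questions are NaN in the wide frame, the numeric answers come back as floats (4.0 rather than 4), whereas the plain-Python loop keeps the original types.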
