Nested JSON list to a DataFrame

Posted 2024-05-18 17:51:52


I have a JSON file that looks like this:

    "Aveiro": {
        "Albergaria-a-Velha": {
            "candidates": [
                {
                    "effectiveCandidates": [
                        "JOSÉ OLIVEIRA SANTOS"
                    ],
                    "party": "B.E.",
                    "votes": {
                        "absoluteMajority": 0,
                        "acronym": "B.E.",
                        "constituenctyCounter": 1,
                        "mandates": 0,
                        "percentage": 1.34,
                        "presidents": 0,
                        "validVotesPercentage": 1.4,
                        "votes": 179
                    }
                },
                {
                    "effectiveCandidates": [
                        "ANTÓNIO AUGUSTO AMARAL LOUREIRO E SANTOS"
                    ],
                    "party": "CDS-PP",
                    "votes": {
                        "absoluteMajority": 1,
                        "acronym": "CDS-PP",
                        "constituenctyCounter": 1,
                        "mandates": 5,
                        "percentage": 59.7,
                        "presidents": 1,
                        "validVotesPercentage": 62.5,
                        "votes": 7970
                    }
                },
                {
                    "effectiveCandidates": [
                        "CARLOS MANUEL DA COSTA SERVEIRA VASQUES"
                    ],
                    "party": "CH",
                    "votes": {
                        "absoluteMajority": 0,
                        "acronym": "CH",
                        "constituenctyCounter": 1,
                        "mandates": 0,
                        "percentage": 1.87,
                        "presidents": 0,
                        "validVotesPercentage": 1.95,
                        "votes": 249
                    }
                },
                {
                    "effectiveCandidates": [
                        "RODRIGO MANUEL PEREIRA MARQUES LOURENÇO"
                    ],
                    "party": "PCP-PEV",
                    "votes": {
                        "absoluteMajority": 0,
                        "acronym": "PCP-PEV",
                        "constituenctyCounter": 1,
                        "mandates": 0,
                        "percentage": 1.57,
                        "presidents": 0,
                        "validVotesPercentage": 1.65,
                        "votes": 210
                    }
                },
                {
                    "effectiveCandidates": [
                        "DELFINA LISBOA MARTINS DA CUNHA"
                    ],
                    "party": "PPD/PSD",
                    "votes": {
                        "absoluteMajority": 0,
                        "acronym": "PPD/PSD",
                        "constituenctyCounter": 1,
                        "mandates": 2,
                        "percentage": 24.23,
                        "presidents": 0,
                        "validVotesPercentage": 25.37,
                        "votes": 3235
                    }
                },
                {
                    "effectiveCandidates": [
                        "JESUS MANUEL VIDINHA TOMÁS"
                    ],
                    "party": "PS",
                    "votes": {
                        "absoluteMajority": 0,
                        "acronym": "PS",
                        "constituenctyCounter": 1,
                        "mandates": 0,
                        "percentage": 6.82,
                        "presidents": 0,
                        "validVotesPercentage": 7.14,
                        "votes": 910
                    }
                }
            ],
            "parentTerritoryName": "Aveiro",
            "territoryKey": "LOCAL-010200",
            "territoryName": "Albergaria-a-Velha",
            "total_votes": {
                "availableMandates": 0,
                "blankVotes": 377,
                "blankVotesPercentage": 2.82,
                "displayMessage": null,
                "hasNoVoting": false,
                "nullVotes": 221,
                "nullVotesPercentage": 1.66,
                "numberParishes": 6,
                "numberVoters": 13351,
                "percentageVoters": 59.48
            }
        },

The complete file is here for reference.

I thought this code would work:

import pandas as pd 
from pandas import json_normalize
import json


with open('autarquicas_2021.json') as f:
    data = json.load(f)

df = pd.json_normalize(data)

However, it returns the following:

df.head()
Aveiro.Albergaria-a-Velha.candidates  ... Évora.Évora.total_votes.percentageVoters
0  [{'effectiveCandidates': ['JOSÉ OLIVEIRA SANTO...  ...                                    49.84

[1 rows x 4312 columns]

df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1 entries, 0 to 0
Columns: 4312 entries, Aveiro.Albergaria-a-Velha.candidates to Évora.Évora.total_votes.percentageVoters
dtypes: bool(308), float64(924), int64(1540), object(1540)
memory usage: 31.7+ KB
None

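For some reason the code does not produce the table I expected: everything ends up in a single row with 4312 columns. My research has not turned up a solution either, because every JSON file seems to have a mind of its own.

Any help would be greatly appreciated. Thanks in advance.

Disclaimer: this is an open-source project that aims to improve the transparency of local elections in Portugal. It will not be used for commercial or for-profit purposes.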


2 Answers

Recursive approach:

I usually use this function (a recursive approach) for this kind of thing:

# Flatten a nested JSON object into a single-level dict.
def flatten_json(y):
    out = {}

    def flatten(x, name=''):
        # Dict: recurse into each value, appending the key to the name.
        if isinstance(x, dict):
            for key, value in x.items():
                flatten(value, name + key + '_')
        # List: recurse into each item, appending its position to the name.
        elif isinstance(x, list):
            for i, item in enumerate(x):
                flatten(item, name + str(i) + '_')
        # Scalar: store it under the accumulated name, minus the trailing '_'.
        else:
            out[name[:-1]] = x

    flatten(y)
    return out

You can call flatten_json to flatten the nested JSON:

# Driver code
print(flatten_json(data))
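Note that calling flatten_json on the whole file gives one flat dict, which is still a single row. A minimal sketch of one way around that, assuming you want one row per municipality: apply the function to each inner entry and build the DataFrame from the resulting list. The sample below is trimmed from the question's data, and the column names `parent` and `territory` are my own choice, not something from the file.

```python
import pandas as pd


def flatten_json(y):
    """Flatten nested dicts/lists into one dict with '_'-joined keys."""
    out = {}

    def flatten(x, name=""):
        if isinstance(x, dict):
            for key, value in x.items():
                flatten(value, name + key + "_")
        elif isinstance(x, list):
            for i, item in enumerate(x):
                flatten(item, name + str(i) + "_")
        else:
            out[name[:-1]] = x

    flatten(y)
    return out


# Trimmed sample with the same shape as the question's file.
sample = {
    "Aveiro": {
        "Albergaria-a-Velha": {
            "candidates": [
                {"party": "B.E.", "votes": {"votes": 179}},
                {"party": "CDS-PP", "votes": {"votes": 7970}},
            ],
            "territoryKey": "LOCAL-010200",
        }
    }
}

# One flattened dict per municipality, so one row per municipality.
rows = [
    {"parent": parent, "territory": territory, **flatten_json(info)}
    for parent, territories in sample.items()
    for territory, info in territories.items()
]
df = pd.DataFrame(rows)
print(df.columns.tolist())
```

Each candidate ends up in its own numbered group of columns (candidates_0_party, candidates_1_party, and so on), so this layout is wide. If you would rather have one row per candidate, a record-based approach with json_normalize is probably a better fit.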

Library-based approach:

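# Third-party package: pip install flatten_json
from flatten_json import flatten

# Note: a literal such as 0123456 (leading zero) is a SyntaxError in Python 3.
unflat_json = {'user':
               {'foo':
                {'UserID': 123456,
                 'Email': 'foo@mail.com',
                 'friends': ['Johnny', 'Mark', 'Tom']
                }
               }
              }

flat_json = flatten(unflat_json)

print(flat_json)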

You can use json_normalize if you first transform the original JSON a little:
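To see why a transformation is needed at all, here is a small sketch. json_normalize treats a single dict as one record, so a dict of dicts becomes one row with dotted column names, whereas a list of dicts becomes one row per item. The `Town-B` entry and its value are made-up placeholders, not data from the file.

```python
import pandas as pd

nested = {
    "Aveiro": {
        "Albergaria-a-Velha": {"total_votes": {"blankVotes": 377}},
        "Town-B": {"total_votes": {"blankVotes": 1}},  # made-up placeholder
    }
}

# A single dict is one record: every nested key becomes a column.
wide = pd.json_normalize(nested)
print(wide.shape)
print(wide.columns.tolist())

# A list of dicts is a list of records: one row per municipality.
records = [
    {"city": city, "district": district, **info}
    for city, districts in nested.items()
    for district, info in districts.items()
]
tall = pd.json_normalize(records)
print(tall.shape)
print(tall.columns.tolist())
```

This is the same effect as in the question (1 row x 4312 columns), just on a much smaller scale.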

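  1. Convert the JSON to a list of records. I am assuming that "Aveiro" is a city and "Albergaria-a-Velha" is a district. Sorry, I am not familiar with the region, so rename the keys if that is wrong.
res = [{**info, 'city': city, 'district': district} for city, districts in data.items() for district, info in districts.items()]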

This turns the original key-value style JSON into a list of objects:

[{
    "city": "Aveiro",
    "district": "Albergaria-a-Velha",
    "candidates": [{
        ...
}]
  2. Then use json_normalize, with one row per candidate:
df = pd.json_normalize(res, record_path=['candidates'], meta=['total_votes', 'city', 'district'])
  3. Expand the nested total_votes object further:
df = pd.concat([df, pd.json_normalize(df['total_votes'])], axis=1)
>>> df.iloc[0]
effectiveCandidates                                      [JOSÉ OLIVEIRA SANTOS]
party                                                                      B.E.
votes.absoluteMajority                                                        0
votes.acronym                                                              B.E.
votes.constituenctyCounter                                                    1
votes.mandates                                                                0
votes.percentage                                                           1.34
votes.presidents                                                              0
votes.validVotesPercentage                                                  1.4
votes.votes                                                                 179
total_votes                   {'availableMandates': 0, 'blankVotes': 377, 'b...
city                                                                     Aveiro
district                                                     Albergaria-a-Velha
availableMandates                                                             0
blankVotes                                                                  377
blankVotesPercentage                                                       2.82
displayMessage                                                             None
hasNoVoting                                                               False
nullVotes                                                                   221
nullVotesPercentage                                                        1.66
numberParishes                                                                6
numberVoters                                                              13351
percentageVoters                                                          59.48
Name: 0, dtype: object
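Putting the three steps together, here is a self-contained sketch on a trimmed copy of the question's data (only a few fields are kept to stay short). Two details are my own additions rather than part of the steps above: the expanded totals get a `total_votes.` prefix so they cannot be confused with the per-candidate `votes.*` columns, and the original dict-valued `total_votes` column is dropped once it has been expanded.

```python
import pandas as pd

# Trimmed sample with the same shape as the question's file.
data = {
    "Aveiro": {
        "Albergaria-a-Velha": {
            "candidates": [
                {"effectiveCandidates": ["JOSÉ OLIVEIRA SANTOS"],
                 "party": "B.E.",
                 "votes": {"mandates": 0, "votes": 179}},
                {"effectiveCandidates": ["ANTÓNIO AUGUSTO AMARAL LOUREIRO E SANTOS"],
                 "party": "CDS-PP",
                 "votes": {"mandates": 5, "votes": 7970}},
            ],
            "total_votes": {"blankVotes": 377, "nullVotes": 221},
        }
    }
}

# Step 1: dict of dicts -> list of records, keeping the two key levels.
res = [
    {**info, "city": city, "district": district}
    for city, districts in data.items()
    for district, info in districts.items()
]

# Step 2: one row per candidate, carrying the municipality-level fields along.
df = pd.json_normalize(
    res, record_path=["candidates"], meta=["total_votes", "city", "district"]
)

# Step 3: expand the total_votes dicts, prefix them, and drop the raw column.
totals = pd.json_normalize(df["total_votes"].tolist()).add_prefix("total_votes.")
df = pd.concat([df.drop(columns="total_votes"), totals], axis=1)

print(df[["city", "district", "party", "votes.votes", "total_votes.blankVotes"]])
```

Passing `.tolist()` to json_normalize is a cautious choice so that it always receives a plain list of dicts; the municipality totals are repeated on every candidate row, which is usually what you want for per-candidate analysis.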
