How to unpack list elements that are dictionaries into a dataframe (using the first value as a prefix)

Posted 2024-09-25 02:37:55

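I'm currently learning Python (with pandas) for data analysis. I took some courses on DataCamp and am trying to apply what I learned to a real problem: I want to monitor COVID-19 cases in Canada.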

To do this, I fetch data from the Apify API, which returns JSON, and then create a dataframe from it. The dataframe structure looks like this:

<class 'pandas.core.frame.DataFrame'>
Int64Index: 57 entries, 0 to 56
Data columns (total 9 columns):
infected              57 non-null float64
deceased              57 non-null float64
infectedByRegion      57 non-null object
measureDate           57 non-null object
measureTime           57 non-null object

The infected and deceased columns contain the totals for Canada.

In the infectedByRegion column, each row holds a list of dictionaries, like this:

   [{'region': 'Canada', 'infectedCount': '6258', 'deceasedCount': '61'},
 {'region': 'Newfoundland and Labrador',
  'infectedCount': '135',
  'deceasedCount': '0'},
 {'region': 'Prince Edward Island',
  'infectedCount': '11',
  'deceasedCount': '0'},
 {'region': 'Nova Scotia', 'infectedCount': '122', 'deceasedCount': '0'},
 {'region': 'New Brunswick', 'infectedCount': '66', 'deceasedCount': '0'},
 {'region': 'Quebec', 'infectedCount': '2840', 'deceasedCount': '22'},
 {'region': 'Ontario', 'infectedCount': '1355', 'deceasedCount': '19'},
 {'region': 'Manitoba', 'infectedCount': '72', 'deceasedCount': '1'},
 {'region': 'Saskatchewan', 'infectedCount': '134', 'deceasedCount': '0'},
 {'region': 'Alberta', 'infectedCount': '621', 'deceasedCount': '2'},
 {'region': 'British Columbia', 'infectedCount': '884', 'deceasedCount': '17'},
 {'region': 'Yukon', 'infectedCount': '4', 'deceasedCount': '0'},
 {'region': 'Northwest Territories',
  'infectedCount': '1',
  'deceasedCount': '0'},
 {'region': 'Nunavut', 'infectedCount': '0', 'deceasedCount': '0'},
 {'region': 'Repatriated travellers',
  'infectedCount': '13',
  'deceasedCount': '0'}]

I'm trying to create columns at the end of the dataframe for each region's infected and deceased counts. For example:

... measureTime   Quebec_infectedCount   Quebec_deceasedCount   Ontario_infectedCount  ...
... 22:30:15      2840                   22                     1355                   ...

I tried using the json_normalize function, but it raised an error:

AttributeError: 'list' object has no attribute 'values'

I then searched Stack Overflow and found the following link:

Python: json_normalize a pandas series gives TypeError

That didn't work for me, because it only created a single column named region at the end of the dataframe, containing just "Canada" in every row:

... measureDate     measureTime     region
... 2020-03-29      22:30:15        Canada
... 2020-03-30      22:30:15        Canada

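Can anyone help me, or point me to a suitable post that would help me solve this? I'm still a beginner and I searched for several hours, but I think I don't even know how to frame my problem precisely. I really do want to learn how to handle this kind of situation.

Thanks in advance.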


Tags: columns, data, api, json, pandas, object, null, region
2 answers
  • Given the following dataframe, where one column (infectedByRegion) is a list of dicts

List of dicts for infectedByRegion

data =  [{'region': 'Canada', 'infectedCount': '6258', 'deceasedCount': '61'},
         {'region': 'Newfoundland and Labrador', 'infectedCount': '135', 'deceasedCount': '0'},
         {'region': 'Prince Edward Island', 'infectedCount': '11', 'deceasedCount': '0'},
         {'region': 'Nova Scotia', 'infectedCount': '122', 'deceasedCount': '0'},
         {'region': 'New Brunswick', 'infectedCount': '66', 'deceasedCount': '0'},
         {'region': 'Quebec', 'infectedCount': '2840', 'deceasedCount': '22'},
         {'region': 'Ontario', 'infectedCount': '1355', 'deceasedCount': '19'},
         {'region': 'Manitoba', 'infectedCount': '72', 'deceasedCount': '1'},
         {'region': 'Saskatchewan', 'infectedCount': '134', 'deceasedCount': '0'},
         {'region': 'Alberta', 'infectedCount': '621', 'deceasedCount': '2'},
         {'region': 'British Columbia', 'infectedCount': '884', 'deceasedCount': '17'},
         {'region': 'Yukon', 'infectedCount': '4', 'deceasedCount': '0'},
         {'region': 'Northwest Territories', 'infectedCount': '1', 'deceasedCount': '0'},
         {'region': 'Nunavut', 'infectedCount': '0', 'deceasedCount': '0'},
         {'region': 'Repatriated travellers', 'infectedCount': '13', 'deceasedCount': '0'}]

Representative dataframe

import pandas as pd
from ast import literal_eval

df = pd.DataFrame({'measureDate': ['2020-03-29', '2020-03-30', '2020-03-31'], 'measureTime': ['22:30:15', '21:30:16', '20:56:29'],
                   'infectedByRegion': [data, data, data], 'infected': [12516, 13000, 14000], 'deceased': [122, 133, 143]})


  measureDate measureTime  infected  deceased                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           infectedByRegion
0  2020-03-29    22:30:15     12516       122  [{'region': 'Canada', 'infectedCount': '6258', 'deceasedCount': '61'}, {'region': 'Newfoundland and Labrador', 'infectedCount': '135', 'deceasedCount': '0'}, {'region': 'Prince Edward Island', 'infectedCount': '11', 'deceasedCount': '0'}, {'region': 'Nova Scotia', 'infectedCount': '122', 'deceasedCount': '0'}, {'region': 'New Brunswick', 'infectedCount': '66', 'deceasedCount': '0'}, {'region': 'Quebec', 'infectedCount': '2840', 'deceasedCount': '22'}, {'region': 'Ontario', 'infectedCount': '1355', 'deceasedCount': '19'}, {'region': 'Manitoba', 'infectedCount': '72', 'deceasedCount': '1'}, {'region': 'Saskatchewan', 'infectedCount': '134', 'deceasedCount': '0'}, {'region': 'Alberta', 'infectedCount': '621', 'deceasedCount': '2'}, {'region': 'British Columbia', 'infectedCount': '884', 'deceasedCount': '17'}, {'region': 'Yukon', 'infectedCount': '4', 'deceasedCount': '0'}, {'region': 'Northwest Territories', 'infectedCount': '1', 'deceasedCount': '0'}, {'region': 'Nunavut', 'infectedCount': '0', 'deceasedCount': '0'}, {'region': 'Repatriated travellers', 'infectedCount': '13', 'deceasedCount': '0'}]
1  2020-03-30    21:30:16     13000       133  [{'region': 'Canada', 'infectedCount': '6258', 'deceasedCount': '61'}, {'region': 'Newfoundland and Labrador', 'infectedCount': '135', 'deceasedCount': '0'}, {'region': 'Prince Edward Island', 'infectedCount': '11', 'deceasedCount': '0'}, {'region': 'Nova Scotia', 'infectedCount': '122', 'deceasedCount': '0'}, {'region': 'New Brunswick', 'infectedCount': '66', 'deceasedCount': '0'}, {'region': 'Quebec', 'infectedCount': '2840', 'deceasedCount': '22'}, {'region': 'Ontario', 'infectedCount': '1355', 'deceasedCount': '19'}, {'region': 'Manitoba', 'infectedCount': '72', 'deceasedCount': '1'}, {'region': 'Saskatchewan', 'infectedCount': '134', 'deceasedCount': '0'}, {'region': 'Alberta', 'infectedCount': '621', 'deceasedCount': '2'}, {'region': 'British Columbia', 'infectedCount': '884', 'deceasedCount': '17'}, {'region': 'Yukon', 'infectedCount': '4', 'deceasedCount': '0'}, {'region': 'Northwest Territories', 'infectedCount': '1', 'deceasedCount': '0'}, {'region': 'Nunavut', 'infectedCount': '0', 'deceasedCount': '0'}, {'region': 'Repatriated travellers', 'infectedCount': '13', 'deceasedCount': '0'}]
2  2020-03-31    20:56:29     14000       143  [{'region': 'Canada', 'infectedCount': '6258', 'deceasedCount': '61'}, {'region': 'Newfoundland and Labrador', 'infectedCount': '135', 'deceasedCount': '0'}, {'region': 'Prince Edward Island', 'infectedCount': '11', 'deceasedCount': '0'}, {'region': 'Nova Scotia', 'infectedCount': '122', 'deceasedCount': '0'}, {'region': 'New Brunswick', 'infectedCount': '66', 'deceasedCount': '0'}, {'region': 'Quebec', 'infectedCount': '2840', 'deceasedCount': '22'}, {'region': 'Ontario', 'infectedCount': '1355', 'deceasedCount': '19'}, {'region': 'Manitoba', 'infectedCount': '72', 'deceasedCount': '1'}, {'region': 'Saskatchewan', 'infectedCount': '134', 'deceasedCount': '0'}, {'region': 'Alberta', 'infectedCount': '621', 'deceasedCount': '2'}, {'region': 'British Columbia', 'infectedCount': '884', 'deceasedCount': '17'}, {'region': 'Yukon', 'infectedCount': '4', 'deceasedCount': '0'}, {'region': 'Northwest Territories', 'infectedCount': '1', 'deceasedCount': '0'}, {'region': 'Nunavut', 'infectedCount': '0', 'deceasedCount': '0'}, {'region': 'Repatriated travellers', 'infectedCount': '13', 'deceasedCount': '0'}]

explode splits the list of dicts into separate rows

  • It isn't clear whether the infectedByRegion column holds list or str values in the dataframe, so it may need to be converted first
# convert str to list; cells that are already lists are left untouched
df.infectedByRegion = df.infectedByRegion.apply(lambda x: literal_eval(x) if isinstance(x, str) else x)

# combine the columns into a datetime, then drop them
df['DateTime'] = pd.to_datetime(df.measureDate + ' ' + df.measureTime)
df.drop(columns=['measureDate', 'measureTime'], inplace=True)

# explode infectedByRegion; requires pandas >= 0.25
df = df.explode('infectedByRegion')

|    | infectedByRegion                                                                      |   infected |   deceased | DateTime            |
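|---:|:--------------------------------------------------------------------------------------|-----------:|-----------:|:--------------------|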
|  0 | {'region': 'Canada', 'infectedCount': '6258', 'deceasedCount': '61'}                  |      12516 |        122 | 2020-03-29 22:30:15 |
|  0 | {'region': 'Newfoundland and Labrador', 'infectedCount': '135', 'deceasedCount': '0'} |      12516 |        122 | 2020-03-29 22:30:15 |
|  0 | {'region': 'Prince Edward Island', 'infectedCount': '11', 'deceasedCount': '0'}       |      12516 |        122 | 2020-03-29 22:30:15 |
|  0 | {'region': 'Nova Scotia', 'infectedCount': '122', 'deceasedCount': '0'}               |      12516 |        122 | 2020-03-29 22:30:15 |
|  0 | {'region': 'New Brunswick', 'infectedCount': '66', 'deceasedCount': '0'}              |      12516 |        122 | 2020-03-29 22:30:15 |
|  0 | {'region': 'Quebec', 'infectedCount': '2840', 'deceasedCount': '22'}                  |      12516 |        122 | 2020-03-29 22:30:15 |
|  0 | {'region': 'Ontario', 'infectedCount': '1355', 'deceasedCount': '19'}                 |      12516 |        122 | 2020-03-29 22:30:15 |
|  0 | {'region': 'Manitoba', 'infectedCount': '72', 'deceasedCount': '1'}                   |      12516 |        122 | 2020-03-29 22:30:15 |
|  0 | {'region': 'Saskatchewan', 'infectedCount': '134', 'deceasedCount': '0'}              |      12516 |        122 | 2020-03-29 22:30:15 |
|  0 | {'region': 'Alberta', 'infectedCount': '621', 'deceasedCount': '2'}                   |      12516 |        122 | 2020-03-29 22:30:15 |
|  0 | {'region': 'British Columbia', 'infectedCount': '884', 'deceasedCount': '17'}         |      12516 |        122 | 2020-03-29 22:30:15 |
|  0 | {'region': 'Yukon', 'infectedCount': '4', 'deceasedCount': '0'}                       |      12516 |        122 | 2020-03-29 22:30:15 |
|  0 | {'region': 'Northwest Territories', 'infectedCount': '1', 'deceasedCount': '0'}       |      12516 |        122 | 2020-03-29 22:30:15 |
|  0 | {'region': 'Nunavut', 'infectedCount': '0', 'deceasedCount': '0'}                     |      12516 |        122 | 2020-03-29 22:30:15 |
|  0 | {'region': 'Repatriated travellers', 'infectedCount': '13', 'deceasedCount': '0'}     |      12516 |        122 | 2020-03-29 22:30:15 |
|  1 | {'region': 'Canada', 'infectedCount': '6258', 'deceasedCount': '61'}                  |      13000 |        133 | 2020-03-30 21:30:16 |
|  1 | {'region': 'Newfoundland and Labrador', 'infectedCount': '135', 'deceasedCount': '0'} |      13000 |        133 | 2020-03-30 21:30:16 |
|  1 | {'region': 'Prince Edward Island', 'infectedCount': '11', 'deceasedCount': '0'}       |      13000 |        133 | 2020-03-30 21:30:16 |
|  1 | {'region': 'Nova Scotia', 'infectedCount': '122', 'deceasedCount': '0'}               |      13000 |        133 | 2020-03-30 21:30:16 |
|  1 | {'region': 'New Brunswick', 'infectedCount': '66', 'deceasedCount': '0'}              |      13000 |        133 | 2020-03-30 21:30:16 |
|  1 | {'region': 'Quebec', 'infectedCount': '2840', 'deceasedCount': '22'}                  |      13000 |        133 | 2020-03-30 21:30:16 |
|  1 | {'region': 'Ontario', 'infectedCount': '1355', 'deceasedCount': '19'}                 |      13000 |        133 | 2020-03-30 21:30:16 |
|  1 | {'region': 'Manitoba', 'infectedCount': '72', 'deceasedCount': '1'}                   |      13000 |        133 | 2020-03-30 21:30:16 |
|  1 | {'region': 'Saskatchewan', 'infectedCount': '134', 'deceasedCount': '0'}              |      13000 |        133 | 2020-03-30 21:30:16 |
|  1 | {'region': 'Alberta', 'infectedCount': '621', 'deceasedCount': '2'}                   |      13000 |        133 | 2020-03-30 21:30:16 |
|  1 | {'region': 'British Columbia', 'infectedCount': '884', 'deceasedCount': '17'}         |      13000 |        133 | 2020-03-30 21:30:16 |
|  1 | {'region': 'Yukon', 'infectedCount': '4', 'deceasedCount': '0'}                       |      13000 |        133 | 2020-03-30 21:30:16 |
|  1 | {'region': 'Northwest Territories', 'infectedCount': '1', 'deceasedCount': '0'}       |      13000 |        133 | 2020-03-30 21:30:16 |
|  1 | {'region': 'Nunavut', 'infectedCount': '0', 'deceasedCount': '0'}                     |      13000 |        133 | 2020-03-30 21:30:16 |
|  1 | {'region': 'Repatriated travellers', 'infectedCount': '13', 'deceasedCount': '0'}     |      13000 |        133 | 2020-03-30 21:30:16 |
|  2 | {'region': 'Canada', 'infectedCount': '6258', 'deceasedCount': '61'}                  |      14000 |        143 | 2020-03-31 20:56:29 |
|  2 | {'region': 'Newfoundland and Labrador', 'infectedCount': '135', 'deceasedCount': '0'} |      14000 |        143 | 2020-03-31 20:56:29 |
|  2 | {'region': 'Prince Edward Island', 'infectedCount': '11', 'deceasedCount': '0'}       |      14000 |        143 | 2020-03-31 20:56:29 |
|  2 | {'region': 'Nova Scotia', 'infectedCount': '122', 'deceasedCount': '0'}               |      14000 |        143 | 2020-03-31 20:56:29 |
|  2 | {'region': 'New Brunswick', 'infectedCount': '66', 'deceasedCount': '0'}              |      14000 |        143 | 2020-03-31 20:56:29 |
|  2 | {'region': 'Quebec', 'infectedCount': '2840', 'deceasedCount': '22'}                  |      14000 |        143 | 2020-03-31 20:56:29 |
|  2 | {'region': 'Ontario', 'infectedCount': '1355', 'deceasedCount': '19'}                 |      14000 |        143 | 2020-03-31 20:56:29 |
|  2 | {'region': 'Manitoba', 'infectedCount': '72', 'deceasedCount': '1'}                   |      14000 |        143 | 2020-03-31 20:56:29 |
|  2 | {'region': 'Saskatchewan', 'infectedCount': '134', 'deceasedCount': '0'}              |      14000 |        143 | 2020-03-31 20:56:29 |
|  2 | {'region': 'Alberta', 'infectedCount': '621', 'deceasedCount': '2'}                   |      14000 |        143 | 2020-03-31 20:56:29 |
|  2 | {'region': 'British Columbia', 'infectedCount': '884', 'deceasedCount': '17'}         |      14000 |        143 | 2020-03-31 20:56:29 |
|  2 | {'region': 'Yukon', 'infectedCount': '4', 'deceasedCount': '0'}                       |      14000 |        143 | 2020-03-31 20:56:29 |
|  2 | {'region': 'Northwest Territories', 'infectedCount': '1', 'deceasedCount': '0'}       |      14000 |        143 | 2020-03-31 20:56:29 |
|  2 | {'region': 'Nunavut', 'infectedCount': '0', 'deceasedCount': '0'}                     |      14000 |        143 | 2020-03-31 20:56:29 |
|  2 | {'region': 'Repatriated travellers', 'infectedCount': '13', 'deceasedCount': '0'}     |      14000 |        143 | 2020-03-31 20:56:29 |
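The explode step above can be sketched on a much smaller frame. This is only an illustration under stated assumptions: `df_small`, `to_list`, and the string-encoded second row are hypothetical, chosen to show why a type check before `literal_eval` can be useful when the column mixes real lists and their string representations.

```python
import pandas as pd
from ast import literal_eval

# Hypothetical two-row frame; the second row stores its list as a string,
# as can happen after a round trip through CSV.
df_small = pd.DataFrame({
    "measureDate": ["2020-03-29", "2020-03-30"],
    "infectedByRegion": [
        [{"region": "Quebec", "infectedCount": "2840"},
         {"region": "Ontario", "infectedCount": "1355"}],
        "[{'region': 'Quebec', 'infectedCount': '2840'}]",
    ],
})


def to_list(value):
    """Parse string cells with literal_eval; return real lists unchanged."""
    return literal_eval(value) if isinstance(value, str) else value


df_small["infectedByRegion"] = df_small["infectedByRegion"].apply(to_list)

# One output row per list element. The other columns are repeated and the
# original index label is kept, so index labels repeat as well.
exploded = df_small.explode("infectedByRegion")
print(exploded)
```

Because the index labels repeat after explode, a later step that aligns on the index may need `reset_index(drop=True)` first.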

Convert the dict keys into columns

df_concat = pd.concat([df, df.infectedByRegion.apply(pd.Series)], axis=1).drop('infectedByRegion', axis=1)

|    |   infected |   deceased | DateTime            | region                    |   infectedCount |   deceasedCount |
|---:|-----------:|-----------:|:--------------------|:--------------------------|----------------:|----------------:|
|  0 |      12516 |        122 | 2020-03-29 22:30:15 | Canada                    |            6258 |              61 |
|  0 |      12516 |        122 | 2020-03-29 22:30:15 | Newfoundland and Labrador |             135 |               0 |
|  0 |      12516 |        122 | 2020-03-29 22:30:15 | Prince Edward Island      |              11 |               0 |
|  0 |      12516 |        122 | 2020-03-29 22:30:15 | Nova Scotia               |             122 |               0 |
|  0 |      12516 |        122 | 2020-03-29 22:30:15 | New Brunswick             |              66 |               0 |
|  0 |      12516 |        122 | 2020-03-29 22:30:15 | Quebec                    |            2840 |              22 |
|  0 |      12516 |        122 | 2020-03-29 22:30:15 | Ontario                   |            1355 |              19 |
|  0 |      12516 |        122 | 2020-03-29 22:30:15 | Manitoba                  |              72 |               1 |
|  0 |      12516 |        122 | 2020-03-29 22:30:15 | Saskatchewan              |             134 |               0 |
|  0 |      12516 |        122 | 2020-03-29 22:30:15 | Alberta                   |             621 |               2 |
|  0 |      12516 |        122 | 2020-03-29 22:30:15 | British Columbia          |             884 |              17 |
|  0 |      12516 |        122 | 2020-03-29 22:30:15 | Yukon                     |               4 |               0 |
|  0 |      12516 |        122 | 2020-03-29 22:30:15 | Northwest Territories     |               1 |               0 |
|  0 |      12516 |        122 | 2020-03-29 22:30:15 | Nunavut                   |               0 |               0 |
|  0 |      12516 |        122 | 2020-03-29 22:30:15 | Repatriated travellers    |              13 |               0 |
|  1 |      13000 |        133 | 2020-03-30 21:30:16 | Canada                    |            6258 |              61 |
|  1 |      13000 |        133 | 2020-03-30 21:30:16 | Newfoundland and Labrador |             135 |               0 |
|  1 |      13000 |        133 | 2020-03-30 21:30:16 | Prince Edward Island      |              11 |               0 |
|  1 |      13000 |        133 | 2020-03-30 21:30:16 | Nova Scotia               |             122 |               0 |
|  1 |      13000 |        133 | 2020-03-30 21:30:16 | New Brunswick             |              66 |               0 |
|  1 |      13000 |        133 | 2020-03-30 21:30:16 | Quebec                    |            2840 |              22 |
|  1 |      13000 |        133 | 2020-03-30 21:30:16 | Ontario                   |            1355 |              19 |
|  1 |      13000 |        133 | 2020-03-30 21:30:16 | Manitoba                  |              72 |               1 |
|  1 |      13000 |        133 | 2020-03-30 21:30:16 | Saskatchewan              |             134 |               0 |
|  1 |      13000 |        133 | 2020-03-30 21:30:16 | Alberta                   |             621 |               2 |
|  1 |      13000 |        133 | 2020-03-30 21:30:16 | British Columbia          |             884 |              17 |
|  1 |      13000 |        133 | 2020-03-30 21:30:16 | Yukon                     |               4 |               0 |
|  1 |      13000 |        133 | 2020-03-30 21:30:16 | Northwest Territories     |               1 |               0 |
|  1 |      13000 |        133 | 2020-03-30 21:30:16 | Nunavut                   |               0 |               0 |
|  1 |      13000 |        133 | 2020-03-30 21:30:16 | Repatriated travellers    |              13 |               0 |
|  2 |      14000 |        143 | 2020-03-31 20:56:29 | Canada                    |            6258 |              61 |
|  2 |      14000 |        143 | 2020-03-31 20:56:29 | Newfoundland and Labrador |             135 |               0 |
|  2 |      14000 |        143 | 2020-03-31 20:56:29 | Prince Edward Island      |              11 |               0 |
|  2 |      14000 |        143 | 2020-03-31 20:56:29 | Nova Scotia               |             122 |               0 |
|  2 |      14000 |        143 | 2020-03-31 20:56:29 | New Brunswick             |              66 |               0 |
|  2 |      14000 |        143 | 2020-03-31 20:56:29 | Quebec                    |            2840 |              22 |
|  2 |      14000 |        143 | 2020-03-31 20:56:29 | Ontario                   |            1355 |              19 |
|  2 |      14000 |        143 | 2020-03-31 20:56:29 | Manitoba                  |              72 |               1 |
|  2 |      14000 |        143 | 2020-03-31 20:56:29 | Saskatchewan              |             134 |               0 |
|  2 |      14000 |        143 | 2020-03-31 20:56:29 | Alberta                   |             621 |               2 |
|  2 |      14000 |        143 | 2020-03-31 20:56:29 | British Columbia          |             884 |              17 |
|  2 |      14000 |        143 | 2020-03-31 20:56:29 | Yukon                     |               4 |               0 |
|  2 |      14000 |        143 | 2020-03-31 20:56:29 | Northwest Territories     |               1 |               0 |
|  2 |      14000 |        143 | 2020-03-31 20:56:29 | Nunavut                   |               0 |               0 |
|  2 |      14000 |        143 | 2020-03-31 20:56:29 | Repatriated travellers    |              13 |               0 |
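A possible variation on the step above, offered as a sketch rather than as the answer's method: building the new columns with `pd.DataFrame(list_of_dicts)` instead of `apply(pd.Series)`, and converting the counts to numbers, since the API returns them as strings. The frame `exploded` and its two rows are hypothetical stand-ins for the exploded dataframe.

```python
import pandas as pd

# Hypothetical exploded frame: one dict per row, repeated index labels.
exploded = pd.DataFrame(
    {
        "DateTime": pd.to_datetime(["2020-03-29 22:30:15"] * 2),
        "infectedByRegion": [
            {"region": "Quebec", "infectedCount": "2840", "deceasedCount": "22"},
            {"region": "Ontario", "infectedCount": "1355", "deceasedCount": "19"},
        ],
    },
    index=[0, 0],
)

# Give every row a unique label so the column-wise concat aligns unambiguously.
exploded = exploded.reset_index(drop=True)

# Build all dict keys as columns in one call.
expanded = pd.DataFrame(exploded["infectedByRegion"].tolist())

# The counts arrive as strings; convert them before doing arithmetic.
for col in ["infectedCount", "deceasedCount"]:
    expanded[col] = pd.to_numeric(expanded[col])

df_concat = pd.concat([exploded.drop(columns="infectedByRegion"), expanded], axis=1)
print(df_concat.dtypes)
```

Without the numeric conversion, operations such as `sum()` would concatenate the strings instead of adding the counts.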

Pivot to the desired format

df_pivot = df_concat.pivot(index='DateTime', columns='region', values=['infectedCount', 'deceasedCount'])

# rename multi-index column names
df_pivot.columns = [f'{col[1]}_{col[0]}' for col in df_pivot.columns.values]

# output form
                    Alberta_infectedCount British Columbia_infectedCount Canada_infectedCount Manitoba_infectedCount New Brunswick_infectedCount Newfoundland and Labrador_infectedCount Northwest Territories_infectedCount Nova Scotia_infectedCount Nunavut_infectedCount Ontario_infectedCount Prince Edward Island_infectedCount Quebec_infectedCount Repatriated travellers_infectedCount Saskatchewan_infectedCount Yukon_infectedCount Alberta_deceasedCount British Columbia_deceasedCount Canada_deceasedCount Manitoba_deceasedCount New Brunswick_deceasedCount Newfoundland and Labrador_deceasedCount Northwest Territories_deceasedCount Nova Scotia_deceasedCount Nunavut_deceasedCount Ontario_deceasedCount Prince Edward Island_deceasedCount Quebec_deceasedCount Repatriated travellers_deceasedCount Saskatchewan_deceasedCount Yukon_deceasedCount
DateTime                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
2020-03-29 22:30:15                   621                            884                 6258                     72                          66                                     135                                   1                       122                     0                  1355                                 11                 2840                                   13                        134                   4                     2                             17                   61                      1                           0                                       0                                   0                         0                     0                    19                                  0                   22                                    0                          0                   0
2020-03-30 21:30:16                   621                            884                 6258                     72                          66                                     135                                   1                       122                     0                  1355                                 11                 2840                                   13                        134                   4                     2                             17                   61                      1                           0                                       0                                   0                         0                     0                    19                                  0                   22                                    0                          0                   0
2020-03-31 20:56:29                   621                            884                 6258                     72                          66                                     135                                   1                       122                     0                  1355                                 11                 2840                                   13                        134                   4                     2                             17                   61                      1                           0                                       0                                   0                         0                     0                    19                                  0                   22                                    0                          0                   0
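The pivot-and-rename step above drops the national totals, while the question asks for the regional columns to sit at the end of the original frame. One way to get there is sketched below. It assumes a hypothetical long-format frame `long_df` with a reduced set of regions, and joins the flattened pivot back onto one row of totals per timestamp.

```python
import pandas as pd

# Hypothetical long-format frame: one row per timestamp and region,
# with the same regional counts on both dates as in the example above.
long_df = pd.DataFrame({
    "DateTime": pd.to_datetime(
        ["2020-03-29 22:30:15"] * 2 + ["2020-03-30 21:30:16"] * 2
    ),
    "infected": [12516, 12516, 13000, 13000],
    "region": ["Quebec", "Ontario", "Quebec", "Ontario"],
    "infectedCount": [2840, 1355, 2840, 1355],
    "deceasedCount": [22, 19, 22, 19],
})

wide = long_df.pivot(index="DateTime", columns="region",
                     values=["infectedCount", "deceasedCount"])

# pivot yields (metric, region) column tuples; put the region first.
wide.columns = [f"{region}_{metric}" for metric, region in wide.columns]

# The national total is repeated for every region, so keep one row per
# timestamp and append the regional columns to it.
totals = long_df[["DateTime", "infected"]].drop_duplicates().set_index("DateTime")
result = totals.join(wide)
print(result)
```

Joining on the `DateTime` index keeps each timestamp on a single row, which matches the layout requested in the question.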

  • You can create the dataframe from the list you already have

df = pd.DataFrame(
    [
        {"region": "Canada", "infectedCount": "6258", "deceasedCount": "61"},
        {
            "region": "Newfoundland and Labrador",
            "infectedCount": "135",
            "deceasedCount": "0",
        },
        {"region": "Prince Edward Island", "infectedCount": "11", "deceasedCount": "0"},
        {"region": "Nova Scotia", "infectedCount": "122", "deceasedCount": "0"},
        {"region": "New Brunswick", "infectedCount": "66", "deceasedCount": "0"},
        {"region": "Quebec", "infectedCount": "2840", "deceasedCount": "22"},
        {"region": "Ontario", "infectedCount": "1355", "deceasedCount": "19"},
        {"region": "Manitoba", "infectedCount": "72", "deceasedCount": "1"},
        {"region": "Saskatchewan", "infectedCount": "134", "deceasedCount": "0"},
        {"region": "Alberta", "infectedCount": "621", "deceasedCount": "2"},
        {"region": "British Columbia", "infectedCount": "884", "deceasedCount": "17"},
        {"region": "Yukon", "infectedCount": "4", "deceasedCount": "0"},
        {"region": "Northwest Territories", "infectedCount": "1", "deceasedCount": "0"},
        {"region": "Nunavut", "infectedCount": "0", "deceasedCount": "0"},
        {
            "region": "Repatriated travellers",
            "infectedCount": "13",
            "deceasedCount": "0",
        },
    ]
)
print(df)
                      region infectedCount deceasedCount
0                      Canada          6258            61
1   Newfoundland and Labrador           135             0
2        Prince Edward Island            11             0
3                 Nova Scotia           122             0
4               New Brunswick            66             0
5                      Quebec          2840            22
6                     Ontario          1355            19
7                    Manitoba            72             1
8                Saskatchewan           134             0
9                     Alberta           621             2
10           British Columbia           884            17
11                      Yukon             4             0
12      Northwest Territories             1             0
13                    Nunavut             0             0
14     Repatriated travellers            13             0

Let's add the date and time, and set the date, time, and region as the index

df["measureDate"] = "2020-03-29"
df["measureTime"] = "22:30:15"

df = df.set_index(["measureDate", "measureTime", "region"])
print(df)
                                                  infectedCount deceasedCount
measureDate measureTime region
2020-03-29  22:30:15    Canada                             6258            61
                        Newfoundland and Labrador           135             0
                        Prince Edward Island                 11             0
                        Nova Scotia                         122             0
                        New Brunswick                        66             0
                        Quebec                             2840            22
                        Ontario                            1355            19
                        Manitoba                             72             1
                        Saskatchewan                        134             0
                        Alberta                             621             2
                        British Columbia                    884            17
                        Yukon                                 4             0
                        Northwest Territories                 1             0
                        Nunavut                               0             0
                        Repatriated travellers               13             0

Next, we unstack the region level (level=2) of the index into columns, swap the column levels, and sort the columns

df = df.unstack(level=2)
df.swaplevel(axis=1).sort_index(axis=1)

It doesn't print well here:

region                        Alberta               British Columbia
                        deceasedCount infectedCount    deceasedCount infectedCount
measureDate measureTime
2020-03-29  22:30:15                2           621               17           884
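The unstacked result above still has two-level column labels, whereas the question asks for flat names with the region as a prefix. A minimal sketch of that last step follows. The two-region frame is a hypothetical reduction of the answer's data, and `wide` is an illustrative name.

```python
import pandas as pd

# Hypothetical single-measurement frame, indexed like the answer above.
df = pd.DataFrame({
    "measureDate": ["2020-03-29"] * 2,
    "measureTime": ["22:30:15"] * 2,
    "region": ["Alberta", "British Columbia"],
    "infectedCount": ["621", "884"],
    "deceasedCount": ["2", "17"],
}).set_index(["measureDate", "measureTime", "region"])

wide = df.unstack(level="region").swaplevel(axis=1).sort_index(axis=1)

# After swaplevel each column label is a (region, metric) tuple; joining
# the parts with "_" gives flat names such as Alberta_infectedCount.
wide.columns = ["_".join(label) for label in wide.columns]
print(wide.columns.tolist())
```

Calling `wide.reset_index()` afterwards would turn measureDate and measureTime back into ordinary columns if that layout is preferred.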
