DataFrame from a list of lists of different lengths

Posted 2024-10-03 02:39:39


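How can I convert a nested list like the one below, whose inner lists have different lengths, into a DataFrame with 5 columns?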

[[['30/09/2015', 'C', 'ETERNITON NM H', '1,73', '400']],
 [['05/08/2019', 'C', 'CIELOON NM', '7,75', '500'],
  ['05/08/2019', 'C', 'M.DIASBRANCOON NM', '39,40', '100'],
  ['05/08/2019', 'C', 'M.DIASBRANCOON NM', '39,40', '100'],
  ['05/08/2019', 'C', 'M.DIASBRANCOON NM', '39,40', '100']],
 [['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
  ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '9'],
  ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
  ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
  ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
  ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
  ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
  ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
  ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
  ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
  ['25/03/2015', 'C', 'WEGON EJ NM H', '30,88', '99']],
 [['16/12/2019', 'C', 'IRBBRASIL REON NM', '36,72', '100'],
  ['16/12/2019', 'C', 'ITAUUNIBANCOON EJ N1', '31,45', '200']]]

3 answers

Use pandas explode to flatten the records into one row per element, then create the DataFrame:

import pandas as pd
lst = [[['30/09/2015', 'C', 'ETERNITON NM H', '1,73', '400']],
 [['05/08/2019', 'C', 'CIELOON NM', '7,75', '500'],
  ['05/08/2019', 'C', 'M.DIASBRANCOON NM', '39,40', '100'],
  ['05/08/2019', 'C', 'M.DIASBRANCOON NM', '39,40', '100'],
  ['05/08/2019', 'C', 'M.DIASBRANCOON NM', '39,40', '100']],
 [['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
  ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '9'],
  ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
  ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
  ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
  ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
  ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
  ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
  ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
  ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
  ['25/03/2015', 'C', 'WEGON EJ NM H', '30,88', '99']],
 [['16/12/2019', 'C', 'IRBBRASIL REON NM', '36,72', '100'],
  ['16/12/2019', 'C', 'ITAUUNIBANCOON EJ N1', '31,45', '200']]]
df = pd.DataFrame(list(pd.Series(lst).explode()))
print(df)
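
To make the intermediate steps visible, here is a minimal sketch on a shortened sample with the same nesting as the question's data. The names sample, groups, rows and sample_df, and the column labels A to E, are assumptions for illustration and not part of the original data.

```python
import pandas as pd

# Shortened sample with the same nesting as the question's data
sample = [
    [['30/09/2015', 'C', 'ETERNITON NM H', '1,73', '400']],
    [['05/08/2019', 'C', 'CIELOON NM', '7,75', '500'],
     ['05/08/2019', 'C', 'M.DIASBRANCOON NM', '39,40', '100']],
]

# One Series element per outer group (each element is a list of rows)
groups = pd.Series(sample)

# explode() gives every row of every group its own Series element;
# the index repeats the position of the group each row came from
rows = groups.explode()

# Column names are placeholders, not part of the original data
sample_df = pd.DataFrame(rows.tolist(), columns=['A', 'B', 'C', 'D', 'E'])
print(sample_df)
```

Because explode only unpacks one level of nesting, each exploded element is still a 5-item list, which is what lets the DataFrame constructor turn it into one row.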

Just flatten the list by one level to get the rows, then convert them to a DataFrame:

import pandas as pd

# lst is the nested list from the question
flat = [row for item in lst for row in item]
df = pd.DataFrame(flat, columns=['A','B','C','D','E'])
print(df)
             A  B                     C      D    E
0   30/09/2015  C        ETERNITON NM H   1,73  400
1   05/08/2019  C            CIELOON NM   7,75  500
2   05/08/2019  C     M.DIASBRANCOON NM  39,40  100
3   05/08/2019  C     M.DIASBRANCOON NM  39,40  100
4   05/08/2019  C     M.DIASBRANCOON NM  39,40  100
5   25/03/2015  C          CETIPON NM H  31,17   10
6   25/03/2015  C          CETIPON NM H  31,17    9
7   25/03/2015  C          CETIPON NM H  31,17   10
8   25/03/2015  C          CETIPON NM H  31,17   10
9   25/03/2015  C          CETIPON NM H  31,17   10
10  25/03/2015  C          CETIPON NM H  31,17   10
11  25/03/2015  C          CETIPON NM H  31,17   10
12  25/03/2015  C          CETIPON NM H  31,17   10
13  25/03/2015  C          CETIPON NM H  31,17   10
14  25/03/2015  C          CETIPON NM H  31,17   10
15  25/03/2015  C         WEGON EJ NM H  30,88   99
16  16/12/2019  C     IRBBRASIL REON NM  36,72  100
17  16/12/2019  C  ITAUUNIBANCOON EJ N1  31,45  200
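
The same one-level flattening can also be written with itertools.chain.from_iterable from the standard library. The sketch below uses a shortened sample; the names nested, flat_rows and chain_df, and the column labels, are illustrative assumptions.

```python
from itertools import chain

import pandas as pd

# Shortened sample with the same nesting as the question's data
nested = [
    [['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
     ['25/03/2015', 'C', 'WEGON EJ NM H', '30,88', '99']],
    [['16/12/2019', 'C', 'IRBBRASIL REON NM', '36,72', '100']],
]

# chain.from_iterable concatenates the outer groups into one stream of rows
flat_rows = list(chain.from_iterable(nested))

# Column names are placeholders, not part of the original data
chain_df = pd.DataFrame(flat_rows, columns=['A', 'B', 'C', 'D', 'E'])
print(chain_df)
```

It produces the same rows as the list comprehension, so the choice between the two is mostly a matter of taste.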

Flatten the raw data into a plain list of rows, then create the DataFrame:

import pandas as pd

data = [[['30/09/2015', 'C', 'ETERNITON NM H', '1,73', '400']],
        [['05/08/2019', 'C', 'CIELOON NM', '7,75', '500'],
         ['05/08/2019', 'C', 'M.DIASBRANCOON NM', '39,40', '100'],
         ['05/08/2019', 'C', 'M.DIASBRANCOON NM', '39,40', '100'],
         ['05/08/2019', 'C', 'M.DIASBRANCOON NM', '39,40', '100']],
        [['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
         ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '9'],
         ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
         ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
         ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
         ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
         ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
         ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
         ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
         ['25/03/2015', 'C', 'CETIPON NM H', '31,17', '10'],
         ['25/03/2015', 'C', 'WEGON EJ NM H', '30,88', '99']],
        [['16/12/2019', 'C', 'IRBBRASIL REON NM', '36,72', '100'],
         ['16/12/2019', 'C', 'ITAUUNIBANCOON EJ N1', '31,45', '200']]]
lst = []
for entry in data:
    for sub in entry:
        lst.append(sub)
df = pd.DataFrame(data=lst, columns=['A', 'B', 'C', 'D', 'E'])
print(df)

Output

             A  B                     C      D    E
0   30/09/2015  C        ETERNITON NM H   1,73  400
1   05/08/2019  C            CIELOON NM   7,75  500
2   05/08/2019  C     M.DIASBRANCOON NM  39,40  100
3   05/08/2019  C     M.DIASBRANCOON NM  39,40  100
4   05/08/2019  C     M.DIASBRANCOON NM  39,40  100
5   25/03/2015  C          CETIPON NM H  31,17   10
6   25/03/2015  C          CETIPON NM H  31,17    9
7   25/03/2015  C          CETIPON NM H  31,17   10
8   25/03/2015  C          CETIPON NM H  31,17   10
9   25/03/2015  C          CETIPON NM H  31,17   10
10  25/03/2015  C          CETIPON NM H  31,17   10
11  25/03/2015  C          CETIPON NM H  31,17   10
12  25/03/2015  C          CETIPON NM H  31,17   10
13  25/03/2015  C          CETIPON NM H  31,17   10
14  25/03/2015  C          CETIPON NM H  31,17   10
15  25/03/2015  C         WEGON EJ NM H  30,88   99
16  16/12/2019  C     IRBBRASIL REON NM  36,72  100
17  16/12/2019  C  ITAUUNIBANCOON EJ N1  31,45  200
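
If it matters which outer list a row came from, the same nested loop can record that position as well. This is only a sketch on a shortened sample; the group column and the names records and grouped_df are hypothetical additions, not something the question asked for.

```python
import pandas as pd

# Shortened sample with the same nesting as the question's data
data = [
    [['30/09/2015', 'C', 'ETERNITON NM H', '1,73', '400']],
    [['16/12/2019', 'C', 'IRBBRASIL REON NM', '36,72', '100'],
     ['16/12/2019', 'C', 'ITAUUNIBANCOON EJ N1', '31,45', '200']],
]

records = []
for group_id, entry in enumerate(data):
    for sub in entry:
        # Prepend the position of the outer list to each 5-field row
        records.append([group_id] + sub)

# 'group' and 'A'..'E' are placeholder column names
grouped_df = pd.DataFrame(records, columns=['group', 'A', 'B', 'C', 'D', 'E'])
print(grouped_df)
```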
