如何将嵌套的json转换为datafrmae?

2024-10-02 12:38:39 发布

您现在位置:Python中文网/ 问答频道 /正文

我从我的数据库中导出了原始数据集,它是一个嵌套的json结构。因此,基于此,我想将这种数据集转换成熊猫数据帧,但是,我不知道如何将这种结构转换成我想要的数据帧。你知道吗

我已经练习了一些关于使用json\u normalize处理kaggle中的嵌套json数据结构的教程,但是在我的例子中,我无法将json结构实现为我想要的dataframe。你知道吗

我想要实现的数据帧是这样的

下面是我导出的嵌套json结构

[
    {
        "tripId" : "19062521016",
        "driveDisstance" : 0,
        "devicetime" : "2019-06-26 06:02:30",
        "speeds" : "0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,2,1,2,4,11,13,13,14,14,14,8,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0",
        "locations" : "[{\"latitude\":0,\"longitude\":0},{\"latitude\":0,\"longitude\":0},{\"latitude\":0,\"longitude\":0},{\"latitude\":0,\"longitude\":0},{\"latitude\":0,\"longitude\":0},{\"latitude\":0,\"longitude\":0}]",
        "dangerousValues" : "[{\"quickTick\":0},{\"quicRatio\":1023},{\"quicCount\":0},{\"brakeTick\":0},{\"brakeRatio\":1023},{\"brakeCount\":0},{\"overSpeedRatio\":1023},{\"overSpeedCount\":0}]"
    },
    {
        "tripId" : "19062521016",
        "driveDisstance" : 0,
        "devicetime" : "2019-06-26 06:03:30",
        "speeds" : "0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0",
        "locations" : "[{\"latitude\":0,\"longitude\":0},{\"latitude\":0,\"longitude\":0},{\"latitude\":0,\"longitude\":0},{\"latitude\":0,\"longitude\":0},{\"latitude\":0,\"longitude\":0},{\"latitude\":0,\"longitude\":0}]",
    ...,
]

我希望输出如下。你知道吗

tripId | driveDisstance | devicetime | speeds | qucikTick | quickRatio | quickCount | brakeTick | brakeRatio | brakeCount | overSpeedRatio | overSpeedCount

作为列

一个tripId有60行,包括60个速度值,6个位置值和危险值。你知道吗


Tags: 数据json结构latitudelocationslongitudespeedstripid
2条回答
  1. 在最初的帖子中有一些拼写错误和由此产生的不一致,所以在这个回复中,为了说明,我使用了数据定义的拼写。

  2. 您可以使用将编码的字符串“解包”为JSON,例如使用以下过滤器:

map( (.locations |= fromjson)
     | if has("dangerousValues") then .dangerousValues |= fromjson else . end)
  1. 您可以进一步生成整个表,例如使用制表符分隔的值,如下所示:
.[]
| .locations |= fromjson
| .dangerousValues |= if . == null then [] else fromjson end
| [.tripId,.driveDisstance,.devicetime,.speeds]
+ (.dangerousValues
   | add
   | [.quickTick, .quicRatio, .quicCount, .brakeTick, .brakeRatio, .brakeCount, .overSpeedRatio, .overSpeedCount] )
| @tsv

对发布的数据使用-r命令行选项,上面的jq过滤器将生成:

19062521016 0   2019-06-26 06:02:30 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,2,1,2,4,11,13,13,14,14,14,8,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0   0   1023    0   0   1023    0   1023    0
19062521016 0   2019-06-26 06:03:30 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0                             
  1. 如果.dangerousValues数组中项目的顺序与输出中所需的顺序相同,则可以将上面(3)中的过滤器简化为:
.[]
| .locations |= fromjson
| [.tripId,.driveDisstance,.devicetime,.speeds]
  + [.dangerousValues|fromjson[][]]?
| @tsv

数据帧可以从数组的数组中构建,每个内部数组都是一行。您的数据超出了json_normalize的容量,因此我将手动处理它:

演示:

data = [
    {
        "tripId" : "19062521016",
        "driveDisstance" : 0,
        "devicetime" : "2019-06-26 06:02:30",
        "speeds" : "0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,2,1,2,4,11,13,13,14,14,14,8,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0",
        "locations" : "[{\"latitude\":0,\"longitude\":0},{\"latitude\":0,\"longitude\":0},{\"latitude\":0,\"longitude\":0},{\"latitude\":0,\"longitude\":0},{\"latitude\":0,\"longitude\":0},{\"latitude\":0,\"longitude\":0}]",
        "dangerousValues" : "[{\"quickTick\":0},{\"quicRatio\":1023},{\"quicCount\":0},{\"brakeTick\":0},{\"brakeRatio\":1023},{\"brakeCount\":0},{\"overSpeedRatio\":1023},{\"overSpeedCount\":0}]"
    },
    {
        "tripId" : "19062521016",
        "driveDisstance" : 0,
        "devicetime" : "2019-06-26 06:03:30",
        "speeds" : "0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0",
        "locations" : "[{\"latitude\":0,\"longitude\":0},{\"latitude\":0,\"longitude\":0},{\"latitude\":0,\"longitude\":0},{\"latitude\":0,\"longitude\":0},{\"latitude\":0,\"longitude\":0},{\"latitude\":0,\"longitude\":0}]",
        "dangerousValues" : "[{\"quickTick\":0},{\"quicRatio\":1023},{\"quicCount\":0},{\"brakeTick\":0},{\"brakeRatio\":1023},{\"brakeCount\":0},{\"overSpeedRatio\":1023},{\"overSpeedCount\":0}]"
    }
]

# pre processing
for elt in data:
    elt['dangerousValues'] = dict(((k,v) for item in
                                       json.loads(elt['dangerousValues'])
                                       for k,v in item.items()))
    elt['speeds'] = elt['speeds'].split(',')

# build dataframe data
dfdata = [[elt["tripId"], elt["driveDisstance"],elt["devicetime"], speed,
           elt["dangerousValues"]["quickTick"],
           elt["dangerousValues"]["quicRatio"],
           elt["dangerousValues"]["quicCount"],
           elt["dangerousValues"]["brakeTick"],
           elt["dangerousValues"]["brakeRatio"],
           elt["dangerousValues"]["brakeCount"],
           elt["dangerousValues"]["overSpeedRatio"],
           elt["dangerousValues"]["overSpeedCount"]]
          for elt in data for speed in elt["speeds"]]

# build dataframe
df = pd.DataFrame(dfdata, columns=["tripId","driveDisstance",
                                  "devicetime","speed",
                                  "quickTick","quicRatio","quicCount",
                                  "brakeTick","brakeRatio","brakeCount",
                                  "overSpeedRatio","overSpeedCount"])

我给出:

          tripId  driveDisstance           devicetime speed  quickTick  quicRatio  quicCount  brakeTick  brakeRatio  brakeCount  overSpeedRatio  overSpeedCount
0    19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
1    19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
2    19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
3    19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
4    19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
5    19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
6    19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
7    19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
8    19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
9    19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
10   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
11   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
12   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
13   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
14   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
15   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
16   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
17   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
18   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
19   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
20   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
21   19062521016               0  2019-06-26 06:02:30     1          0       1023          0          0        1023           0            1023               0
22   19062521016               0  2019-06-26 06:02:30     2          0       1023          0          0        1023           0            1023               0
23   19062521016               0  2019-06-26 06:02:30     1          0       1023          0          0        1023           0            1023               0
24   19062521016               0  2019-06-26 06:02:30     2          0       1023          0          0        1023           0            1023               0
25   19062521016               0  2019-06-26 06:02:30     4          0       1023          0          0        1023           0            1023               0
26   19062521016               0  2019-06-26 06:02:30    11          0       1023          0          0        1023           0            1023               0
27   19062521016               0  2019-06-26 06:02:30    13          0       1023          0          0        1023           0            1023               0
28   19062521016               0  2019-06-26 06:02:30    13          0       1023          0          0        1023           0            1023               0
29   19062521016               0  2019-06-26 06:02:30    14          0       1023          0          0        1023           0            1023               0
30   19062521016               0  2019-06-26 06:02:30    14          0       1023          0          0        1023           0            1023               0
31   19062521016               0  2019-06-26 06:02:30    14          0       1023          0          0        1023           0            1023               0
32   19062521016               0  2019-06-26 06:02:30     8          0       1023          0          0        1023           0            1023               0
33   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
34   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
35   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
36   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
37   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
38   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
39   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
40   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
41   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
42   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
43   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
44   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
45   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
46   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
47   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
48   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
49   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
50   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
51   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
52   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
53   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
54   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
55   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
56   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
57   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
58   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
59   19062521016               0  2019-06-26 06:02:30     0          0       1023          0          0        1023           0            1023               0
60   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
61   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
62   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
63   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
64   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
65   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
66   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
67   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
68   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
69   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
70   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
71   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
72   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
73   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
74   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
75   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
76   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
77   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
78   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
79   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
80   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
81   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
82   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
83   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
84   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
85   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
86   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
87   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
88   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
89   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
90   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
91   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
92   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
93   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
94   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
95   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
96   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
97   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
98   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
99   19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
100  19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
101  19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
102  19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
103  19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
104  19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
105  19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
106  19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
107  19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
108  19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
109  19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
110  19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
111  19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
112  19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
113  19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
114  19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
115  19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
116  19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
117  19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
118  19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0
119  19062521016               0  2019-06-26 06:03:30     0          0       1023          0          0        1023           0            1023               0

相关问题 更多 >

    热门问题