Optimizing Python DataFrame code

Posted 2024-05-20 00:01:27


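Basically, I want to optimize my Python code. I query my ES (Elasticsearch) instance and get a JSON response; I then iterate over the response, store the values in lists, and attach those lists to a DataFrame as columns.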

import pandas as pd

unmtchd_ESdata = {...}  # JSON response from Elasticsearch

list6, list7 = [], []
for i in range(len(unmtchd_ESdata['aggregations']['filtered']['POSCode']['buckets'])):
    list6.append(unmtchd_ESdata['avg'])  # key path abbreviated here
    list7.append(unmtchd_ESdata['key'])
    ...

mkt_df = pd.DataFrame()
mkt_df["market_avg_total_sales_count"] = list6
mkt_df["pos_code"] = list7
...

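In the end, the result is the mkt_df DataFrame, with every column filled in the order the values were appended to its list. For example, if a list such as list6 holds values like [01200000129, 00980030003], they appear in the DataFrame in the form below, and the same applies to the other lists.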

   market_avg_total_sales_count     pos_code 
0                        329.75  01200000129 
1                         15.00  00980030003 

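My problem is that I am reading too many variables that I want as DataFrame values, and keeping N separate lists clearly makes my program inefficient, since all of these operations happen in memory. Any suggestions on how to reproduce this scenario with lower space and time complexity?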

Edit: adding my JSON structure here:

{
  "took": 28,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 12170,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "filtered": {
      "doc_count": 5,
      "POSCode": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          {
            "key": "01200000129",
            "doc_count": 4,
            "POSCodeModifier": {
              "doc_count_error_upper_bound": 0,
              "sum_other_doc_count": 0,
              "buckets": [
                {
                  "key": "0",
                  "doc_count": 4,
                  "CSP": {
                    "doc_count_error_upper_bound": 0,
                    "sum_other_doc_count": 0,
                    "buckets": [
                      {
                        "key": "5555",
                        "doc_count": 4,
                        "per_stock": {
                          "buckets": [
                            {
                              "key_as_string": "2018-02-26",
                              "key": 1519603200000,
                              "doc_count": 0,
                              "avg_week_qty_sales": {
                                "value": 0
                              }
                            },
                            {
                              "key_as_string": "2018-03-05",
                              "key": 1520208000000,
                              "doc_count": 1,
                              "avg_week_qty_sales": {
                                "value": 10
                              }
                            },
                            {
                              "key_as_string": "2018-03-12",
                              "key": 1520812800000,
                              "doc_count": 1,
                              "avg_week_qty_sales": {
                                "value": 300
                              }
                            },
                            {
                              "key_as_string": "2018-03-19",
                              "key": 1521417600000,
                              "doc_count": 1,
                              "avg_week_qty_sales": {
                                "value": 1000
                              }
                            },
                            {
                              "key_as_string": "2018-03-26",
                              "key": 1522022400000,
                              "doc_count": 1,
                              "avg_week_qty_sales": {
                                "value": 9
                              }
                            }
                          ]
                        },
                        "market_week_metrics": {
                          "count": 4,
                          "min": 9,
                          "max": 1000,
                          "avg": 329.75,
                          "sum": 1319,
                          "sum_of_squares": 1090181,
                          "variance": 163810.1875,
                          "std_deviation": 404.7347124969639,
                          "std_deviation_bounds": {
                            "upper": 1139.2194249939278,
                            "lower": -479.71942499392776
                          }
                        }
                      }
                    ]
                  }
                }
              ]
            }
          },
          {
            "key": "00980030003",
            "doc_count": 1,
            "POSCodeModifier": {
              "doc_count_error_upper_bound": 0,
              "sum_other_doc_count": 0,
              "buckets": [
                {
                  "key": "0",
                  "doc_count": 1,
                  "CSP": {
                    "doc_count_error_upper_bound": 0,
                    "sum_other_doc_count": 0,
                    "buckets": [
                      {
                        "key": "5555",
                        "doc_count": 1,
                        "per_stock": {
                          "buckets": [
                            {
                              "key_as_string": "2018-02-26",
                              "key": 1519603200000,
                              "doc_count": 0,
                              "avg_week_qty_sales": {
                                "value": 0
                              }
                            },
                            {
                              "key_as_string": "2018-03-05",
                              "key": 1520208000000,
                              "doc_count": 1,
                              "avg_week_qty_sales": {
                                "value": 15
                              }
                            },
                            {
                              "key_as_string": "2018-03-12",
                              "key": 1520812800000,
                              "doc_count": 0,
                              "avg_week_qty_sales": {
                                "value": 0
                              }
                            },
                            {
                              "key_as_string": "2018-03-19",
                              "key": 1521417600000,
                              "doc_count": 0,
                              "avg_week_qty_sales": {
                                "value": 0
                              }
                            },
                            {
                              "key_as_string": "2018-03-26",
                              "key": 1522022400000,
                              "doc_count": 0,
                              "avg_week_qty_sales": {
                                "value": 0
                              }
                            }
                          ]
                        },
                        "market_week_metrics": {
                          "count": 1,
                          "min": 15,
                          "max": 15,
                          "avg": 15,
                          "sum": 15,
                          "sum_of_squares": 225,
                          "variance": 0,
                          "std_deviation": 0,
                          "std_deviation_bounds": {
                            "upper": 15,
                            "lower": 15
                          }
                        }
                      }
                    ]
                  }
                }
              ]
            }
          }
        ]
      }
    }
  }
}

The values I am trying to extract:

for i in range(len(unmtchd_ESdata['aggregations']['filtered']['POSCode']['buckets'])):
    list6.append(unmtchd_ESdata['aggregations']['filtered']['POSCode']['buckets'][i]['POSCodeModifier']['buckets'][0]['CSP']['buckets'][0]['market_week_metrics']['avg'])
    list7.append(unmtchd_ESdata['aggregations']['filtered']['POSCode']['buckets'][i]['key'])
    list8.append(unmtchd_ESdata['aggregations']['filtered']['POSCode']['buckets'][i]['POSCodeModifier']['buckets'][0]['CSP']['buckets'][0]['market_week_metrics']['max']-unmtchd_ESdata['aggregations']['filtered']['POSCode']['buckets'][i]['POSCodeModifier']['buckets'][0]['CSP']['buckets'][0]['market_week_metrics']['min'])
    list9.append(unmtchd_ESdata['aggregations']['filtered']['POSCode']['buckets'][i]['POSCodeModifier']['buckets'][0]['CSP']['buckets'][0]['market_week_metrics']['max'])
    list10.append(unmtchd_ESdata['aggregations']['filtered']['POSCode']['buckets'][i]['POSCodeModifier']['buckets'][0]['CSP']['buckets'][0]['market_week_metrics']['min'])

1 answer
User
Answer 1 · posted 2024-05-20 00:01:27

You can create just one list and, on each iteration, append a tuple with n elements, where n is the number of columns. For example:

some_list = []
for i in range(3):
    some_list.append((i, i+3))

Result:

[(0, 3), (1, 4), (2, 5)]

Passing it to a DataFrame gives:

pd.DataFrame(some_list, columns=['col1', 'col2'])
   col1  col2
0     0     3
1     1     4
2     2     5

Try adapting it to your solution.
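As a rough sketch of what that adaptation might look like for the response in the question: the loop below builds one tuple per POSCode bucket and looks up the nested market_week_metrics dict once per bucket rather than once per column. The unmtchd_ESdata dict here is a trimmed stand-in containing only the fields that are read, and the column names sales_range, sales_max and sales_min are hypothetical, since the question does not show the names used for list8 to list10.

```python
import pandas as pd

# Trimmed stand-in for the Elasticsearch response in the question;
# only the fields read below are kept.
unmtchd_ESdata = {
    "aggregations": {"filtered": {"POSCode": {"buckets": [
        {"key": "01200000129",
         "POSCodeModifier": {"buckets": [{"CSP": {"buckets": [
             {"market_week_metrics": {"min": 9, "max": 1000, "avg": 329.75}}]}}]}},
        {"key": "00980030003",
         "POSCodeModifier": {"buckets": [{"CSP": {"buckets": [
             {"market_week_metrics": {"min": 15, "max": 15, "avg": 15}}]}}]}},
    ]}}}
}

rows = []
for bucket in unmtchd_ESdata["aggregations"]["filtered"]["POSCode"]["buckets"]:
    # Resolve the nested metrics dict once per bucket.
    metrics = bucket["POSCodeModifier"]["buckets"][0]["CSP"]["buckets"][0]["market_week_metrics"]
    rows.append((
        metrics["avg"],
        bucket["key"],
        metrics["max"] - metrics["min"],
        metrics["max"],
        metrics["min"],
    ))

# The last three column names are assumptions made for this illustration.
mkt_df = pd.DataFrame(rows, columns=[
    "market_avg_total_sales_count", "pos_code",
    "sales_range", "sales_max", "sales_min",
])
print(mkt_df)
```

This keeps a single list of rows instead of N parallel lists, and the DataFrame is built in one call at the end. It still assumes, as the original loop does, that only the first POSCodeModifier and CSP bucket of each POSCode bucket matters.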
