Organizing data into the appropriate columns using Python

[{"id":0,"prediction_type":"CONV_PROBABILITY","calibration_factor":0.906556,"intercept":-2.410414,"advMatchTypeId":-0.239877,"atsId":-0.135568,"deviceTypeId":0.439130,"dmaCode":-0.251728,"keywordId":0.442240}]

intercept -2.41041 advMatchTypeId -0.23987 deviceTypeId 0.37839 dmaCode -0.53552 keywordId 0.44224
intercept -2.41041 advMatchTypeId -0.23987 atsId 0.80708 deviceTypeId -0.19573 dmaCode -0.69982 keywordId 0.44224

data_all = cur.fetchall()
for i in range(len(data_all)):
    col = 0
    data_one = ''.join(data_all[i])
    raw_coef = data_one.split(',')[1:len(data_all)]
    for j in range(len(raw_coef)):
        raw = ''.join(raw_coef[j])
        raw = re.sub('"|}|{|[|]|', '', raw)[:-1]
        raw = raw.split(":")
        for k in range(len(raw)):
            worksheet.write(i, col, raw[k], align_left)
            feature.append(raw[0])  # for unique values
            col += 1

1 Answer

User

#1 · Posted 2024-10-02 00:22:48

You can skip all of the parsing and use pandas:

import pandas

If the query result is already a list of dicts in Python, passing it to the pandas.DataFrame constructor reads it into a DataFrame.
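A minimal sketch of that case, assuming the query result is a list of dicts. The name data_all_list matches the one used later in this answer; the second record is made up for illustration and leaves out atsId to show what happens when a key is missing.

```python
import pandas

# Hypothetical stand-in for the query result: a list of dicts.
# The second record has no "atsId" key.
data_all_list = [
    {"id": 0, "intercept": -2.410414, "atsId": -0.135568, "keywordId": 0.442240},
    {"id": 1, "intercept": -2.410414, "keywordId": 0.442240},
]

# Each dict becomes one row; the union of all keys becomes the columns.
# A key that is absent from a record shows up as NaN in that row.
df = pandas.DataFrame(data_all_list)
print(df)
```

Because the columns come from the keys, there is no need to track a column counter by hand as in the question's loop.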

If you really do have a string, you can use read_json:

data_all_str = """[{"id":0,"prediction_type":"CONV_PROBABILITY","calibration_factor":0.906556,"intercept":-2.410414,"advMatchTypeId":-0.239877,"atsId":-0.135568,"deviceTypeId":0.439130,"dmaCode":-0.251728,"keywordId":0.442240}]"""
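import io
# Wrap the text in StringIO: recent pandas versions deprecate passing a literal JSON string.
df = pandas.read_json(io.StringIO(data_all_str))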

On further thought, it looks like your data_all is actually a list of lists of dicts, something like:

data_all_lol = [data_all_list, data_all_list]

In that case, you need to concatenate the inner lists before passing them to DataFrame:

df = pandas.DataFrame(sum(data_all_lol, []))
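The sum(..., []) idiom above builds a new list at every step, which can get slow for many rows. As a sketch of one common alternative, itertools.chain.from_iterable flattens one level of nesting in a single pass. The nested data here is hypothetical.

```python
from itertools import chain

# Hypothetical nested result: each inner list holds the dicts of one fetched row.
data_all_lol = [
    [{"id": 0, "intercept": -2.410414}],
    [{"id": 1, "intercept": -2.410414}],
]

# Flatten one level of nesting without building intermediate lists.
flat = list(chain.from_iterable(data_all_lol))

# Same result as the sum(..., []) idiom.
assert flat == sum(data_all_lol, [])
print(flat)
```

The flat list can then be passed to pandas.DataFrame in the same way as before.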

This writes it out in the usual header + values format:

df.to_csv('filename.csv') # you can also use to_excel
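By default to_csv also writes the row index as an extra first column. If that is not wanted, index=False leaves it out. The sketch below uses a hypothetical one-row frame and returns the CSV as text (no path argument) so the result is easy to inspect.

```python
import pandas

# Hypothetical one-row frame standing in for the real df.
df = pandas.DataFrame([{"id": 0, "intercept": -2.410414}])

# With no path, to_csv returns the CSV text instead of writing a file.
# index=False omits the row-number column that is written by default.
csv_text = df.to_csv(index=False)
print(csv_text)
```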

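If your end goal is just to get the means of all the features, pandas can do that directly for any number of columns, and it handles missing values correctly:

df.mean(numeric_only=True)  # skip text columns such as prediction_type

This gives: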

advMatchTypeId       -0.239877
atsId                -0.135568
calibration_factor    0.906556
deviceTypeId          0.439130
dmaCode              -0.251728
id                    0.000000
intercept            -2.410414
keywordId             0.442240
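To illustrate the missing-value handling, here is a small sketch with two hypothetical records, one of which has no atsId. The mean of each column is taken over the values that are present; numeric_only=True keeps the text column prediction_type out of the calculation.

```python
import pandas

# Hypothetical records; the first one has no "atsId" key.
records = [
    {"prediction_type": "CONV_PROBABILITY", "intercept": -2.0, "dmaCode": -0.5},
    {"prediction_type": "CONV_PROBABILITY", "intercept": -3.0, "atsId": 0.8, "dmaCode": -0.7},
]
df = pandas.DataFrame(records)

# NaN entries are skipped by default, so atsId is averaged over one value only.
means = df.mean(numeric_only=True)
print(means)
```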

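A note on ambiguity

From the OP it is hard to tell what type data_all is: the snippet you show looks like a list of dicts in literal syntax, but you say "the column I pull is a string".

Note the difference in how the two inputs are represented in the following IPython session: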

In [15]: data_all_str
Out[15]: '[{"id":0,"prediction_type":"CONV_PROBABILITY","calibration_factor":0.906556,"intercept":-2.410414,"advMatchTypeId":-0.239877,"atsId":-0.135568,"deviceTypeId":0.439130,"dmaCode":-0.251728,"keywordId":0.442240}]'

In [16]: data_all_list
Out[16]:
[{'advMatchTypeId': -0.239877,
  'atsId': -0.135568,
  'calibration_factor': 0.906556,
  'deviceTypeId': 0.43913,
  'dmaCode': -0.251728,
  'id': 0,
  'intercept': -2.410414,
  'keywordId': 0.44224,
  'prediction_type': 'CONV_PROBABILITY'}]
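If it is unclear which of the two forms the database driver returns, one option is to check the type at runtime and parse only when needed. This is a sketch; to_records is a hypothetical helper name, not part of pandas or of the question's code.

```python
import json

def to_records(value):
    """Return parsed data whether `value` is JSON text or already parsed."""
    if isinstance(value, (str, bytes)):
        # JSON text: parse it into Python lists and dicts.
        return json.loads(value)
    # Already a Python object: return it unchanged.
    return value

data_all_str = '[{"id":0,"intercept":-2.410414}]'
data_all_list = [{"id": 0, "intercept": -2.410414}]

# Both forms end up as the same list of dicts.
print(to_records(data_all_str) == to_records(data_all_list))
```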
