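I have a large dataset with more than 2,000 rows that I want to convert into a specific JSON format. I tried the following code on a sample dataset.
I also tried to_json and to_dict, but they only produce output in their generic formats.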
import pandas as pd
from collections import defaultdict

data = [['food', 'vegatables', 10], ['food', 'fruits', 5], ['food', 'pulses', 12],
        ['cloth', 'shirts', 2], ['cloth', 'trousers', 6], ['books', 'notebook', 3],
        ['pens', 'roller', 4], ['pens', 'ball', 3]]
df = pd.DataFrame(data, columns=['Items', 'Subitem', 'Quantity'])

labels = defaultdict(int)
labels1 = defaultdict(int)
for cat in df["Items"]:
    labels[cat] += 1
for sub in df["Subitem"]:
    labels1[sub] += 1

check = [{"item": i, "weight": labels[i],
          "groups": [{"subitem": j, "weight": labels1[j], "group": []} for j in labels1]}
         for i in labels]
check
I get this output:
[{'item': 'food',
'weight': 3,
'groups': [{'subitem': 'vegatables', 'weight': 1, 'group': []},
{'subitem': 'fruits', 'weight': 1, 'group': []},
{'subitem': 'pulses', 'weight': 1, 'group': []},
{'subitem': 'shirts', 'weight': 1, 'group': []},
{'subitem': 'trousers', 'weight': 1, 'group': []},
{'subitem': 'notebook', 'weight': 1, 'group': []},
{'subitem': 'roller', 'weight': 1, 'group': []},
{'subitem': 'ball', 'weight': 1, 'group': []}]},
{'item': 'cloth',
'weight': 2,
'groups': [{'subitem': 'vegatables', 'weight': 1, 'group': []},
{'subitem': 'fruits', 'weight': 1, 'group': []},
{'subitem': 'pulses', 'weight': 1, 'group': []},
{'subitem': 'shirts', 'weight': 1, 'group': []},
{'subitem': 'trousers', 'weight': 1, 'group': []},
{'subitem': 'notebook', 'weight': 1, 'group': []},
{'subitem': 'roller', 'weight': 1, 'group': []},
{'subitem': 'ball', 'weight': 1, 'group': []}]},
{'item': 'books',
'weight': 1,
'groups': [{'subitem': 'vegatables', 'weight': 1, 'group': []},
{'subitem': 'fruits', 'weight': 1, 'group': []},
{'subitem': 'pulses', 'weight': 1, 'group': []},
{'subitem': 'shirts', 'weight': 1, 'group': []},
{'subitem': 'trousers', 'weight': 1, 'group': []},
{'subitem': 'notebook', 'weight': 1, 'group': []},
{'subitem': 'roller', 'weight': 1, 'group': []},
{'subitem': 'ball', 'weight': 1, 'group': []}]},
{'item': 'pens',
'weight': 2,
'groups': [{'subitem': 'vegatables', 'weight': 1, 'group': []},
{'subitem': 'fruits', 'weight': 1, 'group': []},
{'subitem': 'pulses', 'weight': 1, 'group': []},
{'subitem': 'shirts', 'weight': 1, 'group': []},
{'subitem': 'trousers', 'weight': 1, 'group': []},
{'subitem': 'notebook', 'weight': 1, 'group': []},
{'subitem': 'roller', 'weight': 1, 'group': []},
{'subitem': 'ball', 'weight': 1, 'group': []}]}]
But I want output in which each item contains only its own subitems, with each subitem's weight taken from Quantity:
[{'item': 'food',
'weight': 3,
'groups': [
{'subitem': 'vegatables', 'weight': 10, 'group': []},
{'subitem': 'fruits', 'weight': 5, 'group': []},
{'subitem': 'pulses', 'weight': 12, 'group': []}]},
{'item': 'cloth',
'weight': 2,
'groups': [
{'subitem': 'shirts', 'weight': 2, 'group': []},
{'subitem': 'trousers', 'weight': 6, 'group': []}]},
{'item': 'books',
'weight': 1,
'groups': [
{'subitem': 'notebook', 'weight': 3, 'group': []}]},
{'item': 'pens',
'weight': 2,
'groups': [
{'subitem': 'roller', 'weight': 4, 'group': []},
{'subitem': 'ball', 'weight': 3, 'group': []}]}]
And what should I do if I want output like this, where each item's weight is the sum of its subitems' weights?
[{'item': 'food',
'weight': 27,
'groups': [
{'subitem': 'vegatables', 'weight': 10, 'group': []},
{'subitem': 'fruits', 'weight': 5, 'group': []},
{'subitem': 'pulses', 'weight': 12, 'group': []}]},
{'item': 'cloth',
'weight': 8,
'groups': [
{'subitem': 'shirts', 'weight': 2, 'group': []},
{'subitem': 'trousers', 'weight': 6, 'group': []}]},
{'item': 'books',
'weight': 3,
'groups': [
{'subitem': 'notebook', 'weight': 3, 'group': []}]},
{'item': 'pens',
'weight': 7,
'groups': [
{'subitem': 'roller', 'weight': 4, 'group': []},
{'subitem': 'ball', 'weight': 3, 'group': []}]}]
You can produce this output with a list comprehension.
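The two method names in the answer above did not survive in this copy, so the following is only a minimal sketch of one way to do it, assuming `DataFrame.groupby` on `Items` combined with a list comprehension over each group's rows. The helper name `build_groups` and its `cumulative` flag are hypothetical and are not part of the original answer.

```python
import pandas as pd

data = [['food', 'vegatables', 10], ['food', 'fruits', 5], ['food', 'pulses', 12],
        ['cloth', 'shirts', 2], ['cloth', 'trousers', 6], ['books', 'notebook', 3],
        ['pens', 'roller', 4], ['pens', 'ball', 3]]
df = pd.DataFrame(data, columns=['Items', 'Subitem', 'Quantity'])


def build_groups(frame, cumulative=False):
    """Hypothetical helper: nest each item's own subitems under it.

    cumulative=False -> item weight is the number of subitems
    cumulative=True  -> item weight is the sum of the subitem quantities
    """
    result = []
    # sort=False keeps the items in order of first appearance
    for item, grp in frame.groupby('Items', sort=False):
        # Only the rows belonging to this item are visited here
        groups = [
            {'subitem': sub, 'weight': int(qty), 'group': []}
            for sub, qty in zip(grp['Subitem'], grp['Quantity'])
        ]
        # int(...) converts NumPy integers so json.dumps accepts the result
        weight = int(grp['Quantity'].sum()) if cumulative else len(grp)
        result.append({'item': item, 'weight': weight, 'groups': groups})
    return result


counted = build_groups(df)                  # item weight = number of subitems
summed = build_groups(df, cumulative=True)  # item weight = sum of Quantity
```

The code in the question iterates over every key in `labels1` for every item, which is why all eight subitems appear under each item. Grouping first restricts the inner comprehension to the rows of the current item. Either result can be passed to `json.dumps` if a JSON string is needed.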