Large amounts of data and loop optimization

data = [
    {
        'make': 'dacia',
        'model': 'x',
        'version': 'A',
        'typ': 'sedan',
        'infos': [
            {'id': 1, 'name': 'steering wheel problems'},
            {'id': 32, 'name': 'ABS errors'}
        ]
    },
    {
        'make': 'nissan',
        'model': 'z',
        'version': 'B',
        'typ': 'coupe',
        'infos': [
            {'id': 3,'name': 'throttle problems'},
            {'id': 56, 'name': 'broken handbreak'},
            {'id': 11, ;'name': 'missing seatbelts'}
        ]
    }
]

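# make, model, version and typ are assumed to be lists of candidate values
# (they are not defined in the post).
tab = []
for ma in make:
    for mo in model:
        for ve in version:
            for ty in typ:
                # Full scan of data for every candidate combination
                s = sum(1 for k in data
                        if k['make'] == ma and k['model'] == mo
                        and k['version'] == ve and k['typ'] == ty)
                if s != 0:
                    tab.append({'make': ma, 'model': mo, 'version': ve, 'typ': ty, 'sum': s})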

2 Answers

User

#1 · Edited 2024-09-24 02:23:50

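Use:

groupby in pandas to form one group per combination
count to compute the size of each group
This avoids Python for loops, which are very slow on large lists of JSON-like records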

Corrected data (the posted version had an error, namely a stray ';')

data = [
{  
   'make': 'dacia',
   'model': 'x',
   'version': 'A',
   'typ': 'sedan',
   'infos': [
            {'id': 1, 'name': 'steering wheel problems'}, 
            {'id': 32, 'name': 'ABS errors'}
   ]
},
{  
   'make': 'nissan',
   'model': 'z',
   'version': 'B',
   'typ': 'coupe',
   'infos': [
         {'id': 3,'name': 'throttle problems'}, 
         {'id': 56, 'name': 'broken handbreak'}, 
         {'id': 11, 'name': 'missing seatbelts'}
   ]
}
]

Counting the combinations

import pandas as pd

# JSON to Pandas DataFrame
df = pd.json_normalize(data)

# Groupby desired properties and
# Count size of each group
result = df.groupby(['make', 'model', 'version', 'typ']).count()
print(result)

# Output (shows combinations of make, model, version, type and count)
                                  infos
make    model   version typ 
dacia   x       A       sedan         1
nissan  z       B       coupe         1
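
If the goal is the list of dictionaries with a 'sum' key that the question builds, size() may be a closer fit than count(): it returns the number of rows per group directly, rather than the non-null count of each remaining column. The following is only a sketch under some assumptions: the records are shortened (no infos lists), and the duplicate dacia record is a made-up addition so that one group has a count above 1.

```python
import pandas as pd

# Shortened stand-in for the question's data; the repeated 'dacia' record
# is hypothetical and only there to produce a count greater than 1.
data = [
    {'make': 'dacia', 'model': 'x', 'version': 'A', 'typ': 'sedan'},
    {'make': 'nissan', 'model': 'z', 'version': 'B', 'typ': 'coupe'},
    {'make': 'dacia', 'model': 'x', 'version': 'A', 'typ': 'sedan'},
]

df = pd.DataFrame(data)

# size() counts rows per group; reset_index turns the group keys back
# into ordinary columns and names the count column 'sum'.
counts = (
    df.groupby(['make', 'model', 'version', 'typ'])
      .size()
      .reset_index(name='sum')
)

# One dict per combination that actually occurs in the data.
tab = counts.to_dict('records')
print(tab)
```

Only combinations present in the data appear in the result, so the `if s != 0` check from the question is not needed.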

User

#2 · Edited 2024-09-24 02:23:50

You can use the four keys as a tuple (a 4-tuple) and implement your own counter

from collections import defaultdict

res = defaultdict(int)

for i in data:
    res[i['make'],i['model'], i['version'], i['typ']] += 1

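You can then filter unwanted combinations out of res, for example with an if that checks whether the 4-tuple belongs to the set of combinations you care about. Either way this takes a single pass over data, so the running time is linear in the number of records rather than one full scan per combination.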
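
The filtering step could look roughly like the sketch below. The wanted set and the shortened records are hypothetical stand-ins, not part of the original post. Each record is visited once, and set membership is a constant-time check on average.

```python
from collections import defaultdict

# Shortened, partly made-up records (the 'infos' lists are left out).
data = [
    {'make': 'dacia', 'model': 'x', 'version': 'A', 'typ': 'sedan'},
    {'make': 'nissan', 'model': 'z', 'version': 'B', 'typ': 'coupe'},
    {'make': 'dacia', 'model': 'x', 'version': 'A', 'typ': 'sedan'},
]

# Hypothetical set of 4-tuples to keep.
wanted = {('dacia', 'x', 'A', 'sedan')}

res = defaultdict(int)
for rec in data:
    key = (rec['make'], rec['model'], rec['version'], rec['typ'])
    # Count only the combinations that are in the wanted set.
    if key in wanted:
        res[key] += 1

print(dict(res))
```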

Edit: you can also use collections.Counter

from collections import Counter
res = Counter((i['make'],i['model'], i['version'], i['typ']) for i in data)

If you have a set of combinations named combinations, adding the filter could look something like this (Python 3.8+, because of the := operator):

combinations = {your_combination_set_that_has_tuples}
res = Counter(key for i in data if (key := (i['make'],i['model'], i['version'], i['typ'])) in combinations)
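
To get back to the list-of-dictionaries shape that the question builds, the counter can be unpacked afterwards. This is a sketch only: the fields tuple, the shortened records and the repeated dacia record are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical helper: the four keys that make up a combination.
fields = ('make', 'model', 'version', 'typ')

# Shortened, partly made-up records (the 'infos' lists are left out).
data = [
    {'make': 'dacia', 'model': 'x', 'version': 'A', 'typ': 'sedan'},
    {'make': 'nissan', 'model': 'z', 'version': 'B', 'typ': 'coupe'},
    {'make': 'dacia', 'model': 'x', 'version': 'A', 'typ': 'sedan'},
]

# Count each 4-tuple in a single pass over the data.
res = Counter(tuple(rec[f] for f in fields) for rec in data)

# Rebuild one dict per combination, with the count stored under 'sum'.
tab = [dict(zip(fields, key), sum=n) for key, n in res.items()]
print(tab)
```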
