TypeError when generating combinations with groupby

Posted 2024-06-28 11:06:04


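I am trying to implement the Apriori association rule mining algorithm. I switched to a generator to create the candidate itemset pairs. When I try to create the combinations, I get "TypeError: 'int' object is not subscriptable".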

Here is a sample of the orders DataFrame:

https://puu.sh/DdJSj/0b6401efac.png

from collections import Counter
from itertools import groupby, combinations
import pandas

#now we will use a generator instead of dicts to save memory
def generate_pairs(orders, k):
    #generate item list for order
    for id, order in groupby(orders, lambda x: x[0]):
        items = [item[0] for item in order]

    #generate pairs for each itemlist
    for pair in combinations(items, k):
        yield pair

def itemcount(iterable):
    if type(iterable) == pandas.core.series.Series:
        return iterable.value_counts().rename("count")
    else:
        return pandas.Series(Counter(iterable)).rename("count")
pair_generator = generate_pairs(orders, 2)
print(pair_generator)
pairs = itemcount(pair_generator).to_frame("count(AB)")

Result:

Traceback (most recent call last):
  File "C:/Users/Cosco/PycharmProjects/untitled/finalp/final.py", line 183, in <module>
    rules = generate_rules(transactions, supp_percent)
  File "C:/Users/Cosco/PycharmProjects/untitled/finalp/final.py", line 80, in generate_rules
    pairs = itemcount(pair_generator).to_frame("count(AB)")
  File "C:/Users/Cosco/PycharmProjects/untitled/finalp/final.py", line 33, in itemcount
    print(type(pandas.Series(Counter(iterable)).rename("count")))
  File "C:\Users\Cosco\Miniconda3\lib\collections\__init__.py", line 534, in __init__
    self.update(*args, **kwds)
  File "C:\Users\Cosco\Miniconda3\lib\collections\__init__.py", line 621, in update
    _count_elements(self, iterable)
  File "C:/Users/Cosco/PycharmProjects/untitled/finalp/final.py", line 22, in generate_pairs
    for id, order in groupby(orders, lambda x: x[0]):
  File "C:/Users/Cosco/PycharmProjects/untitled/finalp/final.py", line 22, in <lambda>
    for id, order in groupby(orders, lambda x: x[0]):
TypeError: 'int' object is not subscriptable

What am I doing wrong? I understand that x should be an iterable, but when I debug, x is a single item_id.

Edit: when generate_pairs() is changed as follows, the generator runs, but produces incorrect results:

def generate_pairs(orders, k):
    orders = orders.reset_index().values
    #generate item list for order
    for id, order in groupby(orders, lambda x: x[0]):
        itemlist = [item[1] for item in order]

    #generate pairs for each itemlist
    for pair in combinations(itemlist, k):
        yield pair

1 Answer
User
#1 · Posted 2024-06-28 11:06:04

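You are assuming that pandas DataFrames behave like lists, but they do not. Iterating over a DataFrame yields its column labels, and iterating over a Series yields its scalar values, so groupby never sees whole rows and the lambda ends up indexing a single int.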
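To illustrate the difference, here is a minimal sketch with made-up data. The order_id/item_id layout is an assumption inferred from the question's edit, not the asker's actual data.

```python
import pandas as pd

# Hypothetical data: order_id is the index, item_id the values.
orders_series = pd.Series([10, 20, 10, 30], index=[1, 1, 2, 2], name="item_id")
orders_frame = orders_series.rename_axis("order_id").reset_index()

# Iterating a Series yields scalar values, so x[0] fails on an int.
series_elements = list(orders_series)

# Iterating a DataFrame yields column labels, not rows.
frame_elements = list(orders_frame)

# Converting to nested lists gives one [order_id, item_id] list per row.
row_lists = orders_frame.values.tolist()
```

Only row_lists has elements that can be indexed with x[0] in the way the groupby key function expects.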

You can modify the program like this:

def generate_pairs(orders, k):
    orders = orders.values.tolist()
    ...

Keep in mind, though, that you will no longer have access to the labels or formatting inside generate_pairs.

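Note: you can also get away with orders = orders.values. This avoids the O(n) copy of the data (from a NumPy array to a Python list), but it may cause problems if you expect orders to be exactly a list.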
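Putting this together, one possible shape for the generator is sketched below. It is not the asker's final code. It assumes the rows arrive as (order_id, item_id) pairs, for example from orders.reset_index().values.tolist() on a Series indexed by order_id. Sorting the rows and de-duplicating the items within each order are extra assumptions about what the pair counts should mean. The sketch also nests the combinations step inside the groupby loop, since running it after the loop would only see the items of the last order.

```python
from collections import Counter
from itertools import combinations, groupby
from operator import itemgetter


def generate_pairs(rows, k):
    """Yield every k-item combination found within each order.

    rows: iterable of (order_id, item_id) pairs (assumed layout).
    """
    # groupby only merges adjacent rows with equal keys, so sort by order_id first.
    sorted_rows = sorted(rows, key=itemgetter(0))
    for _order_id, order in groupby(sorted_rows, key=itemgetter(0)):
        # Sort and de-duplicate so (a, b) and (b, a) are counted as one pair.
        items = sorted({row[1] for row in order})
        # Nested inside the loop so that every order contributes its pairs.
        yield from combinations(items, k)


# Hypothetical rows standing in for the converted orders data.
rows = [[1, 10], [1, 20], [1, 30], [2, 10], [2, 20]]
pair_counts = Counter(generate_pairs(rows, 2))
```

Here pair_counts maps each item pair to the number of orders containing it, which is the kind of input itemcount expects from the generator.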
