How to one-hot encode a DataFrame for association rule analysis (apriori)

Posted 2024-09-28 05:28:33


I have a DataFrame that looks like a grocery list:

import pandas as pd

data = {'Produce':  ['Brocolli', 'Spinach','Spinach','Lettuce','Brocolli','Lettuce','Lettuce',],
        'Dairy': ['Milk', '','Milk','Cheese','Milk','Yogurt','Yogurt',],
        'Beverage': ['', '','Orange Juice','Soda','Soda','Orange juice','',],
        'Fruit': ['Brocolli', 'Spinach','Spinach','Lettuce','Brocolli','Lettuce','Lettuce',],
        'Poultry': ['Chicken Tender', 'Chicken Breasts','Chicken Tender','Chicken Thigh','Chicken Breasts','','Chicken Breasts',],
        'Deli': ['Turkey Breasts', 'Ham','Ham','','','Turkey Breasts','',],
       }

# Note: 'Poultry' is not listed in columns, so it is left out of df
df = pd.DataFrame(data, columns=['Produce', 'Dairy', 'Beverage', 'Fruit', 'Deli'])

df

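How can I one-hot encode this DataFrame so that I can run apriori on it? (As far as I understand, every distinct value should become a column label, and the cell values should be replaced with booleans.)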
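
To make the target shape concrete, here is a minimal sketch of that transformation using only pandas. It assumes each row is one transaction and that an empty string means "no item in that category"; the three-row sample and the names `transactions`, `all_items`, and `onehot` are made up for illustration and are not part of the original data.

```python
import pandas as pd

# Cut-down sample in the same shape as the grocery DataFrame (illustrative only)
data = {
    'Produce': ['Brocolli', 'Spinach', 'Lettuce'],
    'Dairy': ['Milk', '', 'Cheese'],
    'Beverage': ['', '', 'Soda'],
}
df = pd.DataFrame(data)

# One set of items per row; empty strings are treated as "nothing bought"
transactions = [
    {item for item in row if item != ''}
    for row in df.itertuples(index=False)
]

# Every distinct item becomes a column label
all_items = sorted(set().union(*transactions))

# Boolean matrix: True where the transaction contains the item
onehot = pd.DataFrame(
    [[item in t for item in all_items] for t in transactions],
    columns=all_items,
    index=df.index,
)

print(onehot)
```

Because each row is collapsed into a set, a value that appears in two columns of the same row (as with 'Produce' and 'Fruit' in the question) is counted once. Items are matched by exact string, so 'Orange Juice' and 'Orange juice' would end up as two separate columns unless the case is normalized first. If the apriori implementation in use is the one from mlxtend (an assumption, since the question does not name a library), a boolean frame like `onehot` is the input its `apriori` function expects, for example `apriori(onehot, min_support=0.3, use_colnames=True)`.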


