Error when counting labels for a multi-label classification problem

Posted 2024-05-11 03:07:17

I am trying to compute the label distribution for a multi-label classification problem. Below is sample data from the CSV file:

filenames   labels
tt3302594.jpg   ['deer']
tt2377194.jpg   ['deer']
tt2309762.jpg   ['dog', 'deer']
tt2870808.jpg   ['cat', 'deer']
tt2551396.jpg   ['cat', 'dog', 'deer']
tt4008652.jpg   ['dog']
tt2926810.jpg   ['deer']
tt3531604.jpg   ['dog', 'deer']
tt2290739.jpg   ['cat', 'deer']

I want to draw a seaborn plot that shows the individual labels on the x-axis and their counts on the y-axis.

The code is as follows:

import numpy as np
import pandas as pd
import seaborn as sns
from collections import Counter

train = pd.read_csv('example.csv')    # reading the csv file
meta = pd.DataFrame(train, columns=['filenames', 'labels'])
print(f'Found {len(meta)} images')
meta.sample(9)
all_labels = [label for lbs in meta['labels'] for label in lbs]
labels_count = Counter(all_labels)
ax = sns.countplot(all_labels, order=[k for k, _ in labels_count.most_common()], log=True)
ax.set_title('Number of images with a class label')
ax.set_ylim(1E2, 1E4)
ax.set_xticklabels(ax.get_xticklabels(), rotation=90);

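Instead of counting the number of images with each class label, the code above counts every character in the labels (such as "'", "d", "e", "r", and so on).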
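
The symptom can be reproduced in isolation. This is a minimal sketch, assuming that `pd.read_csv` loads each cell of the `labels` column as a plain string rather than a list; the variable `cell` is a hypothetical stand-in for one such value.

```python
from collections import Counter

# Assumption: one cell of the 'labels' column as read from the CSV (a str, not a list)
cell = "['dog', 'deer']"

# Iterating over a string yields single characters, not labels
chars = [ch for ch in cell]

print(type(cell).__name__)
print(Counter(chars).most_common(3))
```

Because the nested comprehension in the question iterates over each cell, a string cell contributes its brackets, quotes, and letters one by one.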


1 answer
User
#1 · Posted 2024-05-11 03:07:17

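You need to use `ast.literal_eval` to parse the string representation of each list into an actual list. (Also, for the sample posted, the y-axis limits would make the bars disappear, so that line is commented out.) For example: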

import numpy as np
import pandas as pd
import seaborn as sns
from collections import Counter
import ast

train = pd.read_csv('example.csv')    # reading the csv file
meta = pd.DataFrame(train, columns=['filenames', 'labels'])
print(f'Found {len(meta)} images')
meta.sample(9)
# Each cell is a string such as "['dog', 'deer']"; parse it into a real list
meta['labels'] = [ast.literal_eval(x) for x in meta['labels'].values]
all_labels = [label for lbs in meta['labels'] for label in lbs]
labels_count = Counter(all_labels)
# Pass the data as x=...; newer seaborn versions treat the first positional argument as `data`
ax = sns.countplot(x=all_labels, order=[k for k, _ in labels_count.most_common()], log=True)
ax.set_title('Number of images with a class label')
# ax.set_ylim(1E2, 1E4)
ax.set_xticklabels(ax.get_xticklabels(), rotation=90);
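
To check the counts without plotting, the parsing step can be run on the nine sample rows from the question. This is a minimal sketch using only the standard library; the list `raw_labels` is a hand-copied stand-in for the `labels` column, assumed to hold strings exactly as shown in the sample data.

```python
import ast
from collections import Counter

# Assumption: the 'labels' column from the sample data, one string per row
raw_labels = [
    "['deer']", "['deer']", "['dog', 'deer']", "['cat', 'deer']",
    "['cat', 'dog', 'deer']", "['dog']", "['deer']", "['dog', 'deer']",
    "['cat', 'deer']",
]

# Turn each string into a real Python list, then count every label
parsed = [ast.literal_eval(s) for s in raw_labels]
labels_count = Counter(label for lbs in parsed for label in lbs)

print(labels_count.most_common())  # [('deer', 8), ('dog', 4), ('cat', 3)]
```

The largest count here is 8, which is why limits of `1E2` to `1E4` on the y-axis would leave no visible bars for this sample.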

