Plotting multiple bar charts for multiple columns

Posted 2024-04-26 19:38:58


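I have a dataset that looks roughly like the table below. For each of the columns TS1 to TS5, I need a bar chart that counts how often each item occurs in that column. Each item is one of NOT_SEEN, NOT_ABLE, HIGH_BAR, or a number between 110 and 140 in steps of 2 (so 110, 112, 114, etc.).

I have found a way that works well, but I would like to know whether I can use a loop or something similar so that I don't have to copy and paste the same code 5 times (once per column).

This is what I tried and got working: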

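import matplotlib.pyplot as plt
import seaborn as sns
from pandas.api.types import CategoricalDtype

# Category order: the three text outcomes, then the numeric values as strings.
# Note: range(110, 140, 2) stops at 138 because range() excludes its stop value.
num_range = list(range(110, 140, 2))
OUTCOMES = ['NOT_SEEN', 'NOT_ABLE', 'HIGH_BAR']
OUTCOMES.extend([str(num) for num in num_range])
OUTCOMES = CategoricalDtype(OUTCOMES, ordered=True)

fig, ax = plt.subplots(2, 3, sharey=True)
fig.tight_layout(pad=3)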

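Below is the block I copied 5 times, changing only the title (Testing 1, Testing 2, etc.), the column name on the first line (TS1, TS2, ...), and the target axes: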

df["outcomes"] = df["TS1"].astype(OUTCOMES)
bpt = sns.countplot(x="outcomes", data=df, palette='GnBu', ax=ax[0, 0])
plt.setp(bpt.get_xticklabels(), rotation=60, size=6, ha='right')
bpt.set(xlabel='')
bpt.set_title('Testing 1')

The following code then comes after the 5 copies of the block above:

ax[1,2].set_visible(False)
plt.show()

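I'm sure there is a better way to do this, but I'm new to all of it.

In addition, I need the bars ordered from left to right as: NOT_SEEN, NOT_ABLE, HIGH_BAR, then 110, 112, 114, and so on.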
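As a side note on the ordering requirement, here is a minimal sketch of how an ordered CategoricalDtype fixes the bar order and what happens to values that are not listed as categories. The sample Series is made up for illustration, and the stop value 142 is an assumption chosen so that 140 is included (the code in this question uses 140, which stops at 138).

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Category order: the three text outcomes first, then 110, 112, ..., 140.
# Assumption: stop=142 so that 140 itself is included (range() excludes the stop).
labels = ['NOT_SEEN', 'NOT_ABLE', 'HIGH_BAR'] + [str(n) for n in range(110, 142, 2)]
outcome_dtype = CategoricalDtype(labels, ordered=True)

# Made-up sample values; '180' is not a category, so astype() turns it into NaN.
sample = pd.Series(['118', 'HIGH_BAR', None, '180', 'NOT_SEEN', '118'])
as_cat = sample.astype(outcome_dtype)

# sort=False keeps the category order instead of sorting by frequency,
# and categories that never occur are reported with a count of 0.
counts = as_cat.value_counts(sort=False)
print(counts.head())
```

Values outside the category list (such as the 145 and 180 in the table) are silently dropped from the counts, so it is worth checking the number of NaN values after the conversion.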

I'm using Python 2.7 (unfortunately not my choice) and pandas 0.24.2.

+----+------+------+----------+----------+----------+----------+----------+
| ID | VIEW | YEAR | TS1      | TS2      | TS3      | TS4      | TS5      |
+----+------+------+----------+----------+----------+----------+----------+
| AA | NO   | 2005 |          | 134      |          | HIGH_BAR |          |
+----+------+------+----------+----------+----------+----------+----------+
| AB | YES  | 2015 |          |          | NOT_SEEN |          |          |
+----+------+------+----------+----------+----------+----------+----------+
| AB | YES  | 2010 | 118      |          |          |          | NOT_ABLE |
+----+------+------+----------+----------+----------+----------+----------+
| BB | NO   | 2020 |          |          |          |          |          |
+----+------+------+----------+----------+----------+----------+----------+
| BA | YES  | 2020 |          |          |          | NOT_SEEN |          |
+----+------+------+----------+----------+----------+----------+----------+
| AA | NO   | 2010 |          |          |          |          |          |
+----+------+------+----------+----------+----------+----------+----------+
| BA | NO   | 2015 |          |          |          |          | 132      |
+----+------+------+----------+----------+----------+----------+----------+
| BB | YES  | 2010 |          | HIGH_BAR |          | 140      | NOT_ABLE |
+----+------+------+----------+----------+----------+----------+----------+
| AA | YES  | 2020 |          |          |          |          |          |
+----+------+------+----------+----------+----------+----------+----------+
| AB | NO   | 2010 |          |          |          | 112      |          |
+----+------+------+----------+----------+----------+----------+----------+
| AB | YES  | 2015 |          |          | NOT_ABLE |          | HIGH_BAR |
+----+------+------+----------+----------+----------+----------+----------+
| BB | NO   | 2020 |          |          |          | 145      |          |
+----+------+------+----------+----------+----------+----------+----------+
| BA | NO   | 2015 |          | 110      |          |          |          |
+----+------+------+----------+----------+----------+----------+----------+
| AA | YES  | 2010 | HIGH_BAR |          |          | NOT_SEEN |          |
+----+------+------+----------+----------+----------+----------+----------+
| BA | YES  | 2015 |          |          |          |          |          |
+----+------+------+----------+----------+----------+----------+----------+
| AA | NO   | 2020 |          |          |          | 118      |          |
+----+------+------+----------+----------+----------+----------+----------+
| BA | YES  | 2015 |          | 180      | NOT_ABLE |          |          |
+----+------+------+----------+----------+----------+----------+----------+
| BB | YES  | 2020 |          | NOT_SEEN |          |          | 126      |
+----+------+------+----------+----------+----------+----------+----------+

2 answers

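You can avoid the loop by using sns.catplot, which lets you draw multiple countplots on a single FacetGrid.

This was tested with Python 2.7.18 (it also works in Python 3):

  1. Use df.melt to reshape the TS columns into long form: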

    melted = df.melt(id_vars=[], value_vars=['TS1','TS2','TS3','TS4','TS5'],
                     var_name='testing', value_name='outcome')
    
    #    testing outcome
    # 0      TS1     NaN
    # 1      TS1     NaN
    # 2      TS1     118
    # 3      TS1     NaN
    # ..     ...     ...
    # 88     TS5     NaN
    # 89     TS5     126
    
  2. Plot the melted data with sns.catplot:

    g = sns.catplot(kind='count', x='outcome', col='testing',
                    col_wrap=3, order=OUTCOMES.categories,
                    data=melted, palette='GnBu_r')
    g.set_xticklabels(rotation=90)
    

    [Figure: catplot output]
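To check the counts numerically before plotting, the two steps above can be sketched with pandas alone. The three-row frame here is a hypothetical stand-in for the question's data, and the groupby step is only a cross-check on the bar heights, not part of the catplot call.

```python
import pandas as pd

# Hypothetical three-row frame with the same layout as the question's TS columns.
df = pd.DataFrame({
    'ID': ['AA', 'AB', 'BB'],
    'TS1': [None, '118', 'HIGH_BAR'],
    'TS2': ['134', None, 'HIGH_BAR'],
})

# Wide to long: one row per (column name, cell value) pair.
melted = df.melt(value_vars=['TS1', 'TS2'],
                 var_name='testing', value_name='outcome')

# Count each outcome within each original column.
# groupby drops rows whose key is missing, so empty cells are not counted.
counts = melted.groupby(['testing', 'outcome']).size()
print(counts)
```

Each entry of counts corresponds to the height of one bar in one facet of the catplot.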


Versions:

>>> sys.version
# 2.7.18 (default, Mar 15 2021, 14:29:03) \n[GCC 10.2.0]
>>> pandas.__version__
# 0.24.2
>>> matplotlib.__version__
# 2.2.5
>>> seaborn.__version__
# 0.9.1

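You can put the plotting lines in a function and call it in a for loop, so that the column, title, and axes change automatically on each iteration: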

fig, axes = plt.subplots(2, 3, sharey=True)
fig.tight_layout(pad=3)

def plotting(column, title, ax):
    # Convert the column to the ordered categorical so the bars keep the category order
    df["outcomes"] = df[column].astype(OUTCOMES)
    bpt = sns.countplot(x="outcomes", data=df, palette='GnBu', ax=ax)
    plt.setp(bpt.get_xticklabels(), rotation=60, size=6, ha='right')
    bpt.set(xlabel='')
    bpt.set_title(title)

columns = ['TS1', 'TS2', 'TS3', 'TS4', 'TS5']
titles = ['Testing 1', 'Testing 2', 'Testing 3', 'Testing 4', 'Testing 5']

# zip stops after the 5 columns, so the sixth axes stays empty
for column, title, ax in zip(columns, titles, axes.flatten()):
    plotting(column, title, ax)

# Hide the unused sixth subplot
axes[1, 2].set_visible(False)

plt.show()
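A possible variation on the same loop, sketched here under a few assumptions: it uses plain matplotlib bars instead of sns.countplot, it does not write a temporary "outcomes" column into df, and it runs on a small made-up frame. The helper name plot_counts and the stop value 142 (chosen so that 140 is included) are illustrative and not taken from the answer.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
from pandas.api.types import CategoricalDtype

# Assumption: stop=142 so that 140 is part of the categories.
labels = ['NOT_SEEN', 'NOT_ABLE', 'HIGH_BAR'] + [str(n) for n in range(110, 142, 2)]
outcome_dtype = CategoricalDtype(labels, ordered=True)

# Hypothetical stand-in for the question's data.
df = pd.DataFrame({
    'TS1': ['118', None, 'HIGH_BAR'],
    'TS2': [None, 'NOT_SEEN', '118'],
})

def plot_counts(series, title, ax):
    """Draw one bar per category, in category order, without modifying df."""
    counts = series.astype(outcome_dtype).value_counts(sort=False)
    positions = list(range(len(counts)))
    ax.bar(positions, counts.values)
    ax.set_xticks(positions)
    ax.set_xticklabels(list(counts.index), rotation=60, size=6, ha='right')
    ax.set_title(title)
    return counts

fig, axes = plt.subplots(1, 2, sharey=True)
all_counts = {}
for i, (column, ax) in enumerate(zip(['TS1', 'TS2'], axes.flatten()), start=1):
    all_counts[column] = plot_counts(df[column], 'Testing {}'.format(i), ax)
```

Returning the counts from the helper makes it easy to inspect the numbers behind each subplot, and generating the titles from the loop index removes the need for a separate titles list.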
