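I have a dataset that looks roughly like the table below.
For each of the columns TS1 to TS5, I need to create a bar chart that counts how many times each item occurs in that column. Each item is one of the following: NOT_SEEN, NOT_ABLE, HIGH_BAR, or a numeric value between 110 and 140 in steps of 2 (so 110, 112, 114, etc.).
I have found a way to do this that works well, but I would like to know whether I can use a loop or something similar so that I don't have to copy and paste the same code 5 times (once for each of the 5 columns).
This is what I have tried, and it works: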
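import matplotlib.pyplot as plt
import seaborn as sns
from pandas.api.types import CategoricalDtype

# range() excludes its stop value, so use 141 to include 140
num_range = list(range(110, 141, 2))
OUTCOMES = ['NOT_SEEN', 'NOT_ABLE', 'HIGH_BAR']
OUTCOMES.extend([str(num) for num in num_range])
OUTCOMES = CategoricalDtype(OUTCOMES, ordered=True)
fig, ax = plt.subplots(2, 3, sharey=True)
fig.tight_layout(pad=3)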
Below is the part that I copied 5 times, changing only the title (Testing 1, Testing 2, etc.) and the column (TS1, and so on):
df["outcomes"] = df["TS1"].astype(OUTCOMES)
bpt=sns.countplot(x= "outcomes", data=df, palette='GnBu', ax=ax[0,0])
plt.setp(bpt.get_xticklabels(), rotation=60, size=6, ha='right')
bpt.set(xlabel='')
bpt.set_title('Testing 1')
The following code then goes below the 5 copies of the block above:
ax[1,2].set_visible(False)
plt.show()
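I am sure there is a better way to do this, but I am new to all of it.
In addition, I need to make sure the bars are ordered from left to right as: NOT_SEEN, NOT_ABLE, HIGH_BAR, 110, 112, 114, and so on.
I am using Python 2.7 (unfortunately not my choice) and pandas 0.24.2.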
+----+------+------+----------+----------+----------+----------+----------+
| ID | VIEW | YEAR | TS1 | TS2 | TS3 | TS4 | TS5 |
+----+------+------+----------+----------+----------+----------+----------+
| AA | NO | 2005 | | 134 | | HIGH_BAR | |
+----+------+------+----------+----------+----------+----------+----------+
| AB | YES | 2015 | | | NOT_SEEN | | |
+----+------+------+----------+----------+----------+----------+----------+
| AB | YES | 2010 | 118 | | | | NOT_ABLE |
+----+------+------+----------+----------+----------+----------+----------+
| BB | NO | 2020 | | | | | |
+----+------+------+----------+----------+----------+----------+----------+
| BA | YES | 2020 | | | | NOT_SEEN | |
+----+------+------+----------+----------+----------+----------+----------+
| AA | NO | 2010 | | | | | |
+----+------+------+----------+----------+----------+----------+----------+
| BA | NO | 2015 | | | | | 132 |
+----+------+------+----------+----------+----------+----------+----------+
| BB | YES | 2010 | | HIGH_BAR | | 140 | NOT_ABLE |
+----+------+------+----------+----------+----------+----------+----------+
| AA | YES | 2020 | | | | | |
+----+------+------+----------+----------+----------+----------+----------+
| AB | NO | 2010 | | | | 112 | |
+----+------+------+----------+----------+----------+----------+----------+
| AB | YES | 2015 | | | NOT_ABLE | | HIGH_BAR |
+----+------+------+----------+----------+----------+----------+----------+
| BB | NO | 2020 | | | | 145 | |
+----+------+------+----------+----------+----------+----------+----------+
| BA | NO | 2015 | | 110 | | | |
+----+------+------+----------+----------+----------+----------+----------+
| AA | YES | 2010 | HIGH_BAR | | | NOT_SEEN | |
+----+------+------+----------+----------+----------+----------+----------+
| BA | YES | 2015 | | | | | |
+----+------+------+----------+----------+----------+----------+----------+
| AA | NO | 2020 | | | | 118 | |
+----+------+------+----------+----------+----------+----------+----------+
| BA | YES | 2015 | | 180 | NOT_ABLE | | |
+----+------+------+----------+----------+----------+----------+----------+
| BB | YES | 2020 | | NOT_SEEN | | | 126 |
+----+------+------+----------+----------+----------+----------+----------+
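You can avoid the loop altogether by reshaping the data and using a seaborn figure-level function that draws multiple countplots on a single FacetGrid.
This was tested with Python 2.7.18 (it also works in Python 3).
First melt the TS columns into long form, then plot the melted data on the FacetGrid.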
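The code from this answer did not survive, so here is a minimal sketch of the melt-then-plot idea. It assumes that the figure-level function is seaborn's catplot with kind="count" (available from seaborn 0.9), that the TS values are stored as strings, and it uses a small made-up sample instead of the real dataset. The names labels, melted, test and outcome are only illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up sample standing in for the real dataset (assumption: values are strings)
df = pd.DataFrame({
    "ID":  ["AA", "AB", "BB", "BA", "AA", "BB"],
    "TS1": [None, "118", None, None, "HIGH_BAR", None],
    "TS2": ["134", None, "HIGH_BAR", "110", None, "NOT_SEEN"],
    "TS3": [None, "NOT_SEEN", None, "NOT_ABLE", None, None],
    "TS4": ["HIGH_BAR", None, "140", "112", "NOT_SEEN", "145"],
    "TS5": [None, "NOT_ABLE", "NOT_ABLE", "132", None, "126"],
})

ts_cols = ["TS1", "TS2", "TS3", "TS4", "TS5"]
# Desired left-to-right bar order: the three text outcomes, then 110..140 in steps of 2
labels = ["NOT_SEEN", "NOT_ABLE", "HIGH_BAR"] + [str(n) for n in range(110, 141, 2)]

# Wide -> long: one row per (ID, test column, outcome)
melted = df.melt(id_vars=["ID"], value_vars=ts_cols,
                 var_name="test", value_name="outcome")
# Keep only recognised outcomes; this drops empty cells and values such as "145"
melted = melted[melted["outcome"].isin(labels)].copy()

# One countplot per TS column, three panels per row, shared y axis
g = sns.catplot(x="outcome", col="test", col_wrap=3, col_order=ts_cols,
                order=labels, data=melted, kind="count", sharey=True)
g.set_titles("{col_name}")
g.set_axis_labels("", "count")
for ax in g.axes.flat:
    plt.setp(ax.get_xticklabels(), rotation=60, size=6, ha="right")
```

Passing order=labels is what fixes the bar order here, so the ordered CategoricalDtype from the question is not strictly needed in this variant.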
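Alternatively, you can put the plotting lines in a function and call it in a for loop, changing the column, title, and axes automatically on each iteration.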
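The original code for this answer is missing as well, so the following is only a sketch of how the function-plus-loop approach could look. The helper name plot_outcomes and the sample data are made up for illustration, and the TS values are again assumed to be strings.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Desired left-to-right bar order
labels = ["NOT_SEEN", "NOT_ABLE", "HIGH_BAR"] + [str(n) for n in range(110, 141, 2)]


def plot_outcomes(df, column, title, ax):
    """Hypothetical helper: count plot of one TS column on the given axes."""
    # Keep only rows with a recognised outcome in this column
    data = df[df[column].isin(labels)]
    bpt = sns.countplot(x=column, data=data, order=labels, ax=ax)
    plt.setp(bpt.get_xticklabels(), rotation=60, size=6, ha="right")
    bpt.set(xlabel="")
    bpt.set_title(title)
    return bpt


# Made-up sample standing in for the real dataset (assumption: values are strings)
df = pd.DataFrame({
    "TS1": [None, "118", None, None, "HIGH_BAR", None],
    "TS2": ["134", None, "HIGH_BAR", "110", None, "NOT_SEEN"],
    "TS3": [None, "NOT_SEEN", None, "NOT_ABLE", None, None],
    "TS4": ["HIGH_BAR", None, "140", "112", "NOT_SEEN", "145"],
    "TS5": [None, "NOT_ABLE", "NOT_ABLE", "132", None, "126"],
})

ts_cols = ["TS1", "TS2", "TS3", "TS4", "TS5"]
fig, axes = plt.subplots(2, 3, sharey=True)
fig.tight_layout(pad=3)
flat_axes = axes.ravel()  # 2x3 grid -> flat sequence of 6 axes

# One call per column; the title number and the axes follow the loop index
for i, column in enumerate(ts_cols):
    plot_outcomes(df, column, "Testing {}".format(i + 1), flat_axes[i])

# Hide any axes left over (the sixth one here)
for ax in flat_axes[len(ts_cols):]:
    ax.set_visible(False)
```

Flattening the axes grid means the loop does not need to work out row and column indices, and hiding the leftover axes replaces the hard-coded ax[1,2].set_visible(False).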