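I have the code below, which generates a Seaborn strip plot and then writes the number of observations below each category. However, if I specify a different category order in the stripplot() call, the numbers appear in the wrong order.
I need help figuring out a way to order my nobs Series so that the numbers match the order of the categories in cat_order. Is there an elegant way to do this?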
# import libraries
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
# create some fake data
df = pd.DataFrame(data=None,columns=['Fuel','MW'])
df = df.astype(dtype={'Fuel':'category', 'MW':'float'})
rnd = np.random.RandomState(1)
mymean = 4 # starting mean value for use in the distribution below
mysigma = 0.8 # sigma value for use in the distribution below
myfuels = ['Bio','Coal','Hydro','Gas','Oil','Other','Solar','Wind'] # categories
mysize = 10
for myfuel in myfuels:
    myMWs = list(rnd.lognormal(mean=mymean,sigma=mysigma,size=mysize)) # randomly generated data
    newdata = pd.DataFrame({'Fuel': [myfuel]*mysize,
                            'MW': myMWs})
    # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
    df = pd.concat([df, newdata],
                   ignore_index=True
                   )
    mymean = mymean + 0.4 # increment to have different distributions
    mysize = mysize + 3
# see distribution of created data
# Set up figure and axes
fig1, ax1 = plt.subplots(nrows=1, ncols=1, figsize=(14,5))
sns.set_style('darkgrid')
# define category order (same as myfuels, except Other is at end)
cat_order = ['Bio','Coal','Hydro','Gas','Oil','Solar','Wind','Other']
sns.stripplot(x = 'Fuel',
              y = 'MW',
              data=df,
              order=cat_order,
              jitter=0.35,
              size=2.5,
              ax = ax1,
              palette='dark'
              )
ax1.set_ylim([-400,5500])
# add tags for number of observations in each category
nobs = df.groupby(['Fuel'])['MW'].agg(['count'])['count'] #Is there a more elegant form of this line?
myv = -280 # get this value through inspection and trial and error
myh = 0 # first category is always at horiz position = 0
for item in nobs:
    if myh == 0:
        plt.text(myh, myv, 'n = ' + str(item), ha='center', fontsize=9)
    else:
        plt.text(myh, myv, item, ha='center', fontsize=9)
    myh = myh + 1 # move one unit to the right for each category
plt.show()
This is the current output (it shows the numbers of observations in the wrong order).
You want to reindex the nobs Series according to the category order used to draw the stripplot.
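A minimal sketch of that reindexing step, using a small stand-in DataFrame instead of the question's random data (the values here are made up for illustration). It relies on the fact that groupby sorts its keys, so the counts come back alphabetically until Series.reindex puts them in the plotting order.

```python
import pandas as pd

# Small stand-in for the question's df (hypothetical values, not the original data)
df = pd.DataFrame({
    'Fuel': ['Bio', 'Coal', 'Coal', 'Other', 'Other', 'Other', 'Wind'],
    'MW': [10.0, 20.0, 25.0, 5.0, 6.0, 7.0, 40.0],
})

# Same idea as the question's cat_order: 'Other' is moved to the end
cat_order = ['Bio', 'Coal', 'Wind', 'Other']

# groupby sorts the group keys, so 'Other' lands before 'Wind' here.
# This is also a shorter form of .agg(['count'])['count'].
nobs = df.groupby('Fuel')['MW'].count()

# Reorder the counts to follow the order passed to stripplot()
nobs = nobs.reindex(cat_order)

# Build the tag texts in plotting order; only the first one gets the 'n = ' prefix
labels = ['n = ' + str(n) if i == 0 else str(n) for i, n in enumerate(nobs)]
print(labels)
```

In the question's code, the reindex call would go right after the line that defines nobs, and the existing loop over nobs can stay as it is. If cat_order might contain a category with no rows in df, reindex leaves NaN for it; passing fill_value=0 to reindex is one way to show a zero count instead.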