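I am converting a set of R visualizations to Python. I have the following target R multi-plot histograms:
Using a combination of Matplotlib and Seaborn, and with the help of a StackOverflow member (see the link: Python Seaborn Distplot Y value corresponding to a given X value), I was able to create the following Python plot: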
I am satisfied with how it looks, except that I don't know how to put the header information into the plot. Here is the Python code that creates the Python plot:

""" Program to draw the sampling histogram distributions """
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
import seaborn as sns


def main():
    """ Main routine for the sampling histogram program """
    sns.set_style('whitegrid')
    markers_list = ["s", "o", "*", "^", "+"]
    # create the data dataframe as df_orig
    df_orig = pd.read_csv('lab_samples.csv')
    df_orig = df_orig.loc[df_orig.hra != -9999]
    hra_list_unique = df_orig.hra.unique().tolist()
    # create and subset df_hra_colors to match the actual hra colors in df_orig
    df_hra_colors = pd.read_csv('hra_lookup.csv')
    df_hra_colors['hex'] = np.vectorize(rgb_to_hex)(df_hra_colors['red'], df_hra_colors['green'], df_hra_colors['blue'])
    df_hra_colors.drop(labels=['red', 'green', 'blue'], axis=1, inplace=True)
    df_hra_colors = df_hra_colors.loc[df_hra_colors['hra'].isin(hra_list_unique)]
    # hard coding the current_component to pc1 here, we will extend it by looping
    # through the list of components
    current_component = 'pc1'
    num_tests = 5
    df_columns = df_orig.columns.tolist()
    start_index = 5
    for test in range(num_tests):
        current_tests_list = df_columns[start_index:(start_index + num_tests)]
        # now create the sns distplots for each HRA color and overlay the tests
        i = 1
        for _, row in df_hra_colors.iterrows():
            plt.subplot(3, 3, i)
            select_columns = ['hra', current_component] + current_tests_list
            df_current_color = df_orig.loc[df_orig['hra'] == row['hra'], select_columns]
            y_data = df_current_color.loc[df_current_color[current_component] != -9999, current_component]
            axs = sns.distplot(y_data, color=row['hex'],
                               hist_kws={"ec":"k"},
                               kde_kws={"color": "k", "lw": 0.5})
            data_x, data_y = axs.lines[0].get_data()
            axs.text(0.0, 1.0, row['hra'], horizontalalignment="left", fontsize='x-small',
                     verticalalignment="top", transform=axs.transAxes)
            for current_test_index, current_test in enumerate(current_tests_list):
                # this_x defines the series of current_component(pc1,pc2,rhob) for this test
                # indicated by 1, corresponding R program calls this test_vector
                x_series = df_current_color.loc[df_current_color[current_test] == 1, current_component].tolist()
                for this_x in x_series:
                    this_y = np.interp(this_x, data_x, data_y)
                    axs.plot([this_x], [this_y - current_test_index * 0.05],
                             markers_list[current_test_index], markersize=3, color='black')
            axs.xaxis.label.set_visible(False)
            axs.xaxis.set_tick_params(labelsize=4)
            axs.yaxis.set_tick_params(labelsize=4)
            i = i + 1
        start_index = start_index + num_tests
    # plt.show()
    pp = PdfPages('plots.pdf')
    pp.savefig()
    pp.close()


def rgb_to_hex(red, green, blue):
    """Return color as #rrggbb for the given color values."""
    return '#%02x%02x%02x' % (red, green, blue)


if __name__ == "__main__":
    main()
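The pandas code works fine and does what it is supposed to do. The bottleneck is my lack of knowledge and experience with "PdfPages" in Matplotlib. How can I show, in Python/Matplotlib/Seaborn, the header information that the corresponding R visualization shows? By header information I mean what the R visualization has at the top, above the histograms, i.e. 'pc1', MRP, XRD, and so on.
I can easily get these values from my program (for example, current_component is 'pc1'), but I don't know how to format the plot with the header. Can someone provide some guidance?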
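You may be looking for a figure title, or super title: Figure.suptitle.

In your case you can easily get the figure with plt.gcf(), so try calling suptitle on it with the component name (for example 'pc1').

The rest of the information in the header would be called a legend. In what follows, let's suppose all subplots have the same markers. It then suffices to create a legend for one of the subplots. To create the legend labels, pass the label argument to plot; when axs.legend() is called later, a legend with the respective labels is generated automatically. Ways to position the legend are detailed in this answer. Here, you may want to place the legend in terms of figure coordinates, i.e. relative to the whole figure rather than to a single subplot.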
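The two suggestions above (a figure-level title plus a single legend placed in figure coordinates) could be combined roughly as follows. This is only a minimal sketch under stated assumptions: it uses synthetic data and plain Matplotlib histograms instead of the question's CSV files and sns.distplot, and the function name draw_page, its parameters, and the legend position (0.5, 0.95) are hypothetical choices for illustration, not part of the original code.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.backends.backend_pdf import PdfPages


def draw_page(component, test_names, markers, pdf_path=None):
    """Draw a 3x3 grid with a figure title and one legend shared by all subplots.

    draw_page and its arguments are hypothetical names used only for this sketch.
    """
    rng = np.random.default_rng(0)  # synthetic data standing in for the real samples
    fig, axes = plt.subplots(3, 3, figsize=(8, 8))
    for ax in axes.flat:
        ax.hist(rng.normal(size=200), bins=20, edgecolor="k")

    # Label the markers on ONE subplot only; the legend is built from these labels.
    first_ax = axes.flat[0]
    for k, (name, marker) in enumerate(zip(test_names, markers)):
        first_ax.plot([0.0], [k], marker, markersize=3, color="black", label=name)

    # Figure-level title (the 'pc1' part of the header).
    fig.suptitle(component, fontsize=12)

    # Position the legend in figure coordinates, just below the title.
    legend = first_ax.legend(loc="upper center", bbox_to_anchor=(0.5, 0.95),
                             bbox_transform=fig.transFigure,
                             ncol=len(test_names), fontsize="x-small")
    fig.subplots_adjust(top=0.88)  # leave room for the title and the legend

    if pdf_path is not None:
        # Optional: write the page to a PDF, as the question does with PdfPages.
        with PdfPages(pdf_path) as pdf:
            pdf.savefig(fig)
    return fig, legend


fig, legend = draw_page("pc1", ["MRP", "XRD"], ["s", "o"])
```

In the question's own loop, the same idea would mean passing label=current_test to axs.plot on one subplot only, then calling suptitle and legend once per page before pp.savefig(). With several points per test, each labelled plot call adds its own legend entry, so the label would have to be attached to just one point per test.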