How to create multiple subplots from a wide dataframe using a function

Posted 2024-09-27 00:18:35


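I have a dataframe df with 4 unique UIDs: 1001, 1002, 1003, and 1004.

I want to write a user-defined function in Python that does the following:

  1. Growth curves: for each unique UID, plot Turbidity against Time. The Turbidity values are in the columns Time_1, Time_2, Time_3, Time_4 & Time_5 (named Time1 to Time5 in the dataset). For example, UID = 1003 will have 4 curves plotted on its graph.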


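  2. Add a legend to each graph, e.g. M+L, F+L, M+R, F+R (from the Gen and Type columns).

  3. Add a title to each graph, e.g. UID:1003 + Site:FRX.

  4. Export the graphs as a pdf, jpeg, or tiff file, with 4 graphs per page.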

# The dataset 
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import numpy as np
df= {
    'Gen':['M','M','M','M','F','F','F','F','M','M','M','M','F','F','F','F'],
    'Site':['FRX','FRX','FRX','FRX','FRX','FRX','FRX','FRX','FRX','FRX','FRX','FRX','FRX','FRX','FRX','FRX'],
    'Type':['L','L','L','L','L','L','L','L','R','R','R','R','R','R','R','R'],
     'UID':[1001,1002,1003,1004,1001,1002,1003,1004,1001,1002,1003,1004,1001,1002,1003,1004],
    'Time1':[100.78,112.34,108.52,139.19,149.02,177.77,79.18,89.10,106.78,102.34,128.52,119.19,129.02,147.77,169.18,170.11],
    'Time2':[150.78,162.34,188.53,197.69,208.07,217.76,229.48,139.51,146.87,182.54,189.57,199.97,229.28,244.73,269.91,249.19],
     'Time3':[250.78,262.34,288.53,297.69,308.07,317.7,329.81,339.15,346.87,382.54,369.59,399.97,329.28,347.73,369.91,349.12],
     'Time4':[240.18,232.14,258.53,276.69,338.07,307.74,359.16,339.25,365.87,392.48,399.97,410.75,429.08,448.39,465.15,469.33],
     'Time5':[270.84,282.14,298.53,306.69,318.73,327.47,369.63,389.59,398.75,432.18,449.78,473.55,494.85,509.39,515.52,539.23]
}
df = pd.DataFrame(df,columns = ['Gen','Site','Type','UID','Time1','Time2','Time3','Time4','Time5'])
df

My attempt

# See below for my thoughts/attempt- I am open to other python libraries and approaches

def graph2pdf(inputdata):
  #1. convert from wide to long
    inputdata = pd.melt(df,id_vars = ['Gen','Type','UID'],var_name = 'Time',value_name = 'Turbidity')
  #
    cmaps = ['Reds', 'Blues', 'Greens', 'Greys','Yellows']
    label_patches = []
    for i, cmap in enumerate(cmaps):
           # I want a growth curve not a distribution curve
           sns.kdeplot(x = Time, y = Turbidity,data = data, cmap=cmaps[i]+'_d')
           label_patch = mpatches.Patch(color=sns.color_palette(cmaps[i])[2],label=label)
           label_patches.append(label_patch)
    #2. add legend
    plt.legend(handles=label_patches, loc='upper left')
    #3. add title- 'UID number+ SiteName: FRX' to each of the graphs
    plt.title('UID:1003+FRX')
    plt.show()
    #4. export as pdf file i.e 4 graphs per page
    with PdfPages('turbidityvstime_pdf.pdf') as pdf:
         plt.figure(figsize=(2,2)) # 4 graphs per page, I am anticipating more pages in the future
    
         pdf.savefig()  # saves the current figure into a pdf page
         plt.close()

# testing the user-defined function   
graph2pdf(df)

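I would like the graphs to look like the figure below (with turbidity instead of density on the y-axis, and time on the x-axis). If possible, a white or transparent background is preferred.

Thanks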

[Image: example plot showing the desired style]


1 answer
User
#1 · Posted 2024-09-27 00:18:35
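  • Line plots are generally not appropriate for discrete data, because the slope of a line can imply a trend that does not exist.
    • This data is discrete because the measurements were taken at discrete moments in time, not as a continuous time series.
    • Discrete data is best shown with a bar chart.
  • Use seaborn figure-level methods such as sns.catplot and sns.relplot to create a figure with four subplots.
  • Tested with python 3.8.11, pandas 1.3.2, matplotlib 3.4.3, seaborn 0.11.2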
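The figure-level functions work on long-form data, so the wide Time columns have to be reshaped first. As a minimal sketch of that step, using two rows taken from the question's data (the names wide and long_df are only illustrative):

```python
import pandas as pd

# Two rows from the question's dataset (UID 1001), with only two Time columns
wide = pd.DataFrame({
    "Gen": ["M", "F"],
    "Site": ["FRX", "FRX"],
    "Type": ["L", "R"],
    "UID": [1001, 1001],
    "Time1": [100.78, 129.02],
    "Time2": [150.78, 229.28],
})

# Every column that is not a measurement goes into id_vars;
# the Time columns are stacked into one Time/Turbidity pair per row
long_df = wide.melt(
    id_vars=["Gen", "Site", "Type", "UID"],
    var_name="Time",
    value_name="Turbidity",
)

# Combine Gen and Type into a single label that can be used for hue
long_df["label"] = long_df["Gen"] + "+" + long_df["Type"]
print(long_df)
```

Each original row turns into one long row per Time column, so 2 wide rows with 2 Time columns give 4 long rows.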
import pandas as pd
import seaborn as sns

def graph2pdf(df):
    # melt the dataframe; any column not a var or value, should be in id_vars
    data = df.melt(id_vars=df.columns[:4], var_name='Time', value_name='Turbidity')
    
    # combine Gen and Type to create label, which can be used for hue
    data['label'] = data.Gen + '-' + data.Type
    
    # plot a catplot for bars
    p1 = sns.catplot(data=data, kind='bar', x='Time', y='Turbidity', hue='label', col='UID', col_wrap=2, height=3.25)
    p1.set_titles('UID: {col_name}')  # one title per subplot instead of a hard-coded UID
    p1.fig.subplots_adjust(top=0.9) # adjust the figure
    p1.fig.suptitle('Site: ' + ', '.join(data['Site'].unique()))
    p1.savefig("barplots.png")

    # plot a relplot for lines
    p2 = sns.relplot(data=data, kind='line', x='Time', y='Turbidity', hue='label', col='UID', col_wrap=2, height=3.25, marker='o')
    p2.set_titles('UID: {col_name}')
    p2.fig.subplots_adjust(top=0.9)
    p2.fig.suptitle('Site: ' + ', '.join(data['Site'].unique()))
    p2.savefig("lineplots.png")
    

graph2pdf(df)
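The code above saves PNG files, while the question also asks for a PDF with 4 graphs per page. One possible way to do that, sketched here with plain matplotlib and PdfPages rather than seaborn, is to draw each page as a grid of axes and save one figure per page. The function name growth_curves_to_pdf, its per_page parameter, the A4 page size, and the small demo frame (rows for UIDs 1001 and 1002 from the question, first three Time columns only) are assumptions for illustration, not part of the original answer.

```python
import math

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this also runs without a display
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.backends.backend_pdf import PdfPages


def growth_curves_to_pdf(df, pdf_path, per_page=4):
    """Write one line plot per UID to a PDF, per_page plots on each page.

    Returns the number of pages written.
    """
    time_cols = [c for c in df.columns if c.startswith("Time")]
    uids = sorted(df["UID"].unique())
    ncols = min(2, per_page)
    nrows = math.ceil(per_page / ncols)
    n_pages = 0
    with PdfPages(pdf_path) as pdf:
        for start in range(0, len(uids), per_page):
            # white figure background; A4 portrait size in inches (assumed)
            fig, axes = plt.subplots(nrows, ncols, figsize=(8.27, 11.69),
                                     facecolor="white", squeeze=False)
            flat = axes.ravel()
            page_uids = uids[start:start + per_page]
            for ax, uid in zip(flat, page_uids):
                sub = df[df["UID"] == uid]
                # one curve per row, labelled Gen+Type (e.g. M+L)
                for _, row in sub.iterrows():
                    ax.plot(time_cols, row[time_cols].to_numpy(dtype=float),
                            marker="o", label=f"{row['Gen']}+{row['Type']}")
                sites = "/".join(sorted(sub["Site"].unique()))
                ax.set_title(f"UID: {uid} + Site: {sites}")
                ax.set_xlabel("Time")
                ax.set_ylabel("Turbidity")
                ax.legend(loc="upper left")
            # hide unused axes on a partially filled last page
            for ax in flat[len(page_uids):]:
                ax.set_visible(False)
            fig.tight_layout()
            pdf.savefig(fig)  # each savefig call adds one page
            plt.close(fig)
            n_pages += 1
    return n_pages


# Demo frame: the UID 1001 and 1002 rows from the question (Time1 to Time3 only)
demo = pd.DataFrame({
    "Gen": ["M", "M", "F", "F", "M", "M", "F", "F"],
    "Site": ["FRX"] * 8,
    "Type": ["L", "L", "L", "L", "R", "R", "R", "R"],
    "UID": [1001, 1002, 1001, 1002, 1001, 1002, 1001, 1002],
    "Time1": [100.78, 112.34, 149.02, 177.77, 106.78, 102.34, 129.02, 147.77],
    "Time2": [150.78, 162.34, 208.07, 217.76, 146.87, 182.54, 229.28, 244.73],
    "Time3": [250.78, 262.34, 308.07, 317.7, 346.87, 382.54, 329.28, 347.73],
})
```

Calling growth_curves_to_pdf(df, 'turbidityvstime_pdf.pdf') on the full dataframe from the question should put all four UIDs on a single page; with more UIDs, further pages are added automatically. Whether lines or bars are the better choice for this data is a separate question, as noted in the bullet points at the start of this answer.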

