How do I specify the discrete values to plot on the x-axis (matplotlib, boxplot)?

Posted 2024-10-01 09:22:50


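I am using boxplot from matplotlib (Python) to create box plots, and I am creating many graphs, one for each date. The data on the x-axis is discrete.

The values on the x-axis, in seconds, are 0.25, 0.5, 1, 2, 5 ... 28800. These values were chosen arbitrarily (they are sampling periods). On some graphs one or two values are missing because the data was not available. On those graphs the x-axis resizes itself to spread out the remaining values.

I would like all the graphs to show the same values at the same positions on the x-axis (it does not matter if the x-axis shows a value for which no data is plotted on the graph).

Can someone tell me whether there is a way to specify the x-axis values? Or another way to keep the same values in the same place?

The relevant section of code is as follows: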


for i, group in myDataframe.groupby("Date"):

    graphFilename = (basename+'_' + str(i) + '.png')
    plt.figure(graphFilename)
    group.boxplot(by=["SamplePeriod_seconds"], sym='g+') ## colour = 'blue'
    plt.grid(True)
    axes = plt.gca()
    axes.set_ylim([0,30000])
    plt.ylabel('Average distance (m)', fontsize =8)
    plt.xlabel('GPS sample interval (s)', fontsize=8)
    plt.tick_params(axis='x', which='major', labelsize=8)
    plt.tick_params(axis='y', which='major', labelsize=8)
    plt.xticks(rotation=90)
    plt.title(str(i) + ' - ' + 'Average distance travelled by cattle over 24  hour period', fontsize=9) 
    plt.suptitle('')
    plt.savefig(graphFilename)
    plt.close()     

Any help would be appreciated; I will keep searching on Google in the meantime. Thanks :)


3 answers

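Thank you all very much for the help. Using your answers I arrived at the code below. (I realise it could probably be improved, but I am happy that it works and I can now look at the data :)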

valuesShouldPlot = ['0.25','0.5','1.0','2.0','5.0','10.0','20.0','30.0','60.0','120.0','300.0','600.0','1200.0','1800.0','2400.0','3000.0','3600.0','7200.0','10800.0','14400.0','18000.0','21600.0','25200.0','28800.0']       


for xDate, group in myDataframe.groupby("Date"):            ## for each date

    graphFilename = (basename+'_' + str(xDate) + '.png')    ## make up a suitable filename for the graph

    plt.figure(graphFilename)

    group.boxplot(by=["SamplePeriod_seconds"], sym='g+', return_type='both')  ## create box plot, (boxplots are placed in default positions)

    ## get information on where the boxplots were placed by looking at the values on the x-axis                                                    
    axes = plt.gca()  
    checkXticks= axes.get_xticks()
    numOfValuesPlotted =len(checkXticks)            ## check how many boxplots were actually plotted by counting the labels printed on the x-axis
    lengthValuesShouldPlot = len(valuesShouldPlot)  ## (check how many boxplots should have been created if no data was missing)



    if (numOfValuesPlotted < lengthValuesShouldPlot): ## if number of values actually plotted is less than the maximum possible it means some values are missing
                                                ## if that occurs then want to move the plots across accordingly to leave gaps where the missing values should go


        labels = [item.get_text() for item in axes.get_xticklabels()]

        i=0                 ## counter to increment through the entire list of x values that should exist if no data was missing.
        j=0                 ## counter to increment through the list of x labels that were originally plotted (some labels may be missing, want to check what's missing)

        positionOfBoxesList =[] ## create a list which will eventually contain the positions on the x-axis where boxplots should be drawn  

        while ( j < numOfValuesPlotted): ## look at each value in turn in the list of x-axis labels (on the graph plotted earlier)

            if (labels[j] == valuesShouldPlot[i]):  ## if the value on the x axis matches the value in the list of 'valuesShouldPlot' 
                positionOfBoxesList.append(i)       ## then record that position as a suitable position to put a boxplot
                j = j+1
                i = i+1


            else :                                  ## if they don't match (there must be a value missing) skip the value and look at the next one             

                print("\n******** missing value ************")
                ## report the date, the position and the value that is missing, on one line
                print("Date:", xDate, ", Position:", i, ":", valuesShouldPlot[i])
                i=i+1


        plt.close()     ## close the original plot (the one that didn't leave gaps for missing data)
        group.boxplot(by=["SamplePeriod_seconds"], sym='g+', return_type='both', positions=positionOfBoxesList) ## replot with boxes in correct positions

    ## format graph to make it look better        
    plt.ylabel('Average distance (m)', fontsize =8)
    plt.xlabel('GPS sample interval (s)', fontsize=8)
    plt.tick_params(axis='x', which='major', labelsize=8)
    plt.tick_params(axis='y', which='major', labelsize=8)
    plt.xticks(rotation=90)   
    plt.title(str(xDate) + ' - ' + 'Average distance travelled by cattle over 24 hour period', fontsize=9) ## put the title above the first subplot (ie. at the top of the page)
    plt.suptitle('')
    axes = plt.gca() 
    axes.set_ylim([0,30000])

    ## save and close 
    plt.savefig(graphFilename)  
    plt.close()         
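
The matching loop in the code above can probably be written more compactly as a lookup. This is only a minimal sketch: `positions_for_labels` is a hypothetical helper (not part of the code above), and it assumes every tick label appears verbatim in the full list of expected values.

```python
def positions_for_labels(present_labels, expected_labels):
    """Return, for each label that was plotted, its index in the full list."""
    # Map each expected label to its fixed slot on the x-axis.
    slot = {label: k for k, label in enumerate(expected_labels)}
    # A label that is not in expected_labels raises KeyError here.
    return [slot[label] for label in present_labels]

# Shortened, made-up lists for illustration only.
expected = ['0.25', '0.5', '1.0', '2.0', '5.0']
present = ['0.25', '1.0', '5.0']

positions = positions_for_labels(present, expected)
missing = [label for label in expected if label not in present]
print(positions, missing)
```

The resulting list could then be passed as `positions=` to `group.boxplot(...)`, in the same way `positionOfBoxesList` is used above.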

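By default, boxplot simply plots the available data at successive positions on the axis. Missing data is left out, simply because boxplot does not know that it is missing. However, the positions of the boxes can be set manually using the positions argument. The following example does this, and so produces plots of equal extent even when values are missing.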

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


basename = __file__+"_plot"
Nd = 4 # four different dates
Ns = 5 # five second intervals
N = 80 # each 80 values
date = []
seconds = []
avgdist = []
# fill lists
for i in range(Nd):
    # for each date, select a random SamplePeriod to be not part of the dataframe
    w = np.random.randint(0,5)
    for j in range(Ns):
        if j!=w:
            av = np.random.poisson(1.36+j/10., N)*4000+1000
            avgdist.append(av) 
            seconds.append([j]*N)
            date.append([i]*N)

date = np.array(date).flatten()
seconds = np.array(seconds).flatten()
avgdist = np.array(avgdist).flatten()
#put data into DataFrame
myDataframe = pd.DataFrame({"Date" : date, "SamplePeriod_seconds" : seconds, "avgdist" : avgdist}) 
# obtain a list of all possible Sampleperiods
globalunique = np.sort(myDataframe["SamplePeriod_seconds"].unique())

for i, group in myDataframe.groupby("Date"):

    graphFilename = (basename+'_' + str(i) + '.png')
    fig = plt.figure(graphFilename, figsize=(6,3))
    axes = fig.add_subplot(111)
    plt.grid(True)

    # omit the `dates` column
    dfgroup = group[["SamplePeriod_seconds", "avgdist"]]
    # obtain a list of Sampleperiods for this date
    unique = np.sort(dfgroup["SamplePeriod_seconds"].unique())
    # plot the boxes to the axes, one for each sample periods in dfgroup
    # set the boxes' positions to the values in unique
    dfgroup.boxplot(by=["SamplePeriod_seconds"], sym='g+', positions=unique, ax=axes)

    # set xticks to the unique positions, where boxes are
    axes.set_xticks(unique)
    # make sure all plots share the same extent.
    axes.set_xlim([-0.5,globalunique[-1]+0.5])
    axes.set_ylim([0,30000])

    plt.ylabel('Average distance (m)', fontsize =8)
    plt.xlabel('GPS sample interval (s)', fontsize=8)
    plt.tick_params(axis='x', which='major', labelsize=8)
    plt.tick_params(axis='y', which='major', labelsize=8)
    plt.xticks(rotation=90)
    plt.suptitle(str(i) + ' - ' + 'Average distance travelled by cattle over 24  hour period', fontsize=9) 
    plt.title("")
    plt.savefig(graphFilename)
    plt.close()    


This still works if the values in the SamplePeriod_seconds column are not equally spaced, but if they differ widely it will not give good results, because the boxes will overlap.

However, this is not a problem with the plotting itself. To help further, we would need to know what you expect the plot to look like in the end.
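
One possible way around the overlap, offered only as a sketch: place each box at the index of its sample period in the full list rather than at the value itself, and label the ticks with the real values. The names `all_periods`, `slot` and `plot_fixed_slots` are hypothetical, the list of periods is shortened, and the data is randomly generated for illustration. It uses matplotlib's `Axes.boxplot` directly instead of the pandas wrapper.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Full list of sample periods every figure should show (shortened here).
all_periods = [0.25, 0.5, 1, 2, 5, 10, 28800]
slot = {p: k for k, p in enumerate(all_periods)}  # period -> equally spaced slot

def plot_fixed_slots(data_by_period, ax):
    """Draw one box per available period at its fixed slot; return the slots used."""
    present = sorted(data_by_period)
    positions = [slot[p] for p in present]
    ax.boxplot([data_by_period[p] for p in present], positions=positions, sym='g+')
    # Show every period on the axis, including those without data.
    ax.set_xticks(list(range(len(all_periods))))
    ax.set_xticklabels([str(p) for p in all_periods], rotation=90)
    ax.set_xlim(-0.5, len(all_periods) - 0.5)
    return positions

rng = np.random.default_rng(0)
# Made-up data for one date, with the 0.5 s and 5 s periods missing.
sample = {p: rng.normal(10000, 2000, 50) for p in all_periods if p not in (0.5, 5)}

fig, ax = plt.subplots(figsize=(6, 3))
used = plot_fixed_slots(sample, ax)
plt.close(fig)
```

With this layout the x-axis no longer reflects the numeric distance between sample periods, which may or may not be acceptable depending on what the plot is meant to show.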

What if you try this:

plt.xticks(np.arange(x.min(), x.max(), 5))

where x is your array of x values and 5 is the step between ticks along the axis.

The same applies to the y-axis with yticks. Hope that helps! :)
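
One detail about the `plt.xticks` call above: `np.arange` excludes its stop value, so the largest x value does not get a tick unless the stop is extended by one step. A small sketch with a made-up array `x`:

```python
import numpy as np

x = np.array([0, 3, 7, 12, 20])  # made-up x values for illustration

# Stop value is excluded: no tick at x.max().
ticks = np.arange(x.min(), x.max(), 5)

# Extending the stop by one step includes a tick at x.max() here.
ticks_incl = np.arange(x.min(), x.max() + 5, 5)

print(ticks, ticks_incl)
```

Either array can be passed to `plt.xticks(...)`, or to `plt.yticks(...)` for the y-axis.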

Edit:

I have removed the instances I did not have, but the following code should give you a grid to plot onto:

[code block missing in the source]

Try inserting your values on top of this :)
