How do I print the maximum and minimum values in a dataset, along with their dates, in Python 3?

Posted 2024-09-28 22:23:45


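I am working with a large weather dataset and am trying to find the maximum and minimum of five different parameters: precipitation, snowfall, average temperature, maximum temperature, and minimum temperature. My goal is to find the highest and lowest value of each one and print the number together with the date it occurred on. However, the dataset is noisy, with many blank or zero values. I have managed to turn the blank values into None, but I don't want to include them in any of my calculations.

Currently, I have been trying to get multiple outputs for the maximum values: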

def High(value):
    if value == row['PRCP']:
        maxPrec = max(row['PRCP'] for row in reader)
        print(row['DATE'])
        return maxPrec
    elif value == row['SNOW']:
        maxSnow = max(row['SNOW'] for row in reader)
        print(row['DATE'])
        return maxSnow
    elif value == row['TAVG']:
        maxTavg = max(row['TAVG'] for row in reader)
        return maxTavg
    elif value == row['TMIN']:
        maxTmin = max(row['TMIN'] for row in reader)
        print(row['DATE'])
        return maxTmin
    elif value == row['TMAX']:
        maxTmax = max(row['TMAX'] for row in reader)
        return maxTmax

print(High(row['PRCP']))

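If I use my function again on another variable, I get an invalid type back. I also don't know how to print the associated date. Any help with either of these problems would be greatly appreciated.

A sample of the file is shown below: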

"STATION","NAME","DATE","PRCP","SNOW","SNWD","TAVG","TMAX","TMIN"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-01",,,,"46","56","35"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-02",,,,"52","59","45"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-03","0.01",,,"58","66","50"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-04","0.00",,,"62","71","53"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-05","0.00",,,"63","75","51"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-06","0.00",,,"73","82","63"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-07","0.00",,,"78","85","70"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-08","0.00",,,"75","83","67"

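I have had trouble finding good resources on parsing large datasets, so any suggestions for websites that could point me in the right direction would also be helpful.

Thanks in advance.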
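
A note on the second call failing, offered as a likely explanation and not a confirmed diagnosis: if `reader` is a `csv` reader object (the question does not show how it is created), it is an iterator, so the first `max(... for row in reader)` consumes every row and later passes see nothing. The cell values are also still strings at that point, so `max` compares them as text. Below is a minimal sketch of one way around both issues. It assumes `csv.DictReader`, assumes blank cells should be skipped while zeros are kept, and uses the hypothetical names `to_float`, `find_extremes`, and an inline `SAMPLE` taken from the rows shown in the question.

```python
import csv
import io
import math

# Hypothetical inline sample (rows copied from the question) so the sketch runs as-is.
SAMPLE = '''\
"STATION","NAME","DATE","PRCP","SNOW","SNWD","TAVG","TMAX","TMIN"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-01",,,,"46","56","35"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-03","0.01",,,"58","66","50"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-04","0.00",,,"62","71","53"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-07","0.00",,,"78","85","70"
'''

COLUMNS = ["PRCP", "SNOW", "TAVG", "TMAX", "TMIN"]


def to_float(text):
    """Convert a cell to float; return None for blank, missing, or non-numeric cells."""
    try:
        value = float(text)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(value) else value


def find_extremes(rows, columns):
    """Track (value, date) for the max and min of each column in a single pass."""
    best = {col: {"max": None, "min": None} for col in columns}
    for row in rows:
        for col in columns:
            value = to_float(row.get(col))
            if value is None:
                continue  # skip blanks so they never enter the comparison
            entry = best[col]
            if entry["max"] is None or value > entry["max"][0]:
                entry["max"] = (value, row["DATE"])
            if entry["min"] is None or value < entry["min"][0]:
                entry["min"] = (value, row["DATE"])
    return best


# For a real file, replace io.StringIO(SAMPLE) with open(path, newline="").
reader = csv.DictReader(io.StringIO(SAMPLE))
result = find_extremes(reader, COLUMNS)
for col, entry in result.items():
    print(col, "max:", entry["max"], "min:", entry["min"])
```

Because every column is handled inside the same loop, the reader is only walked once. A column with no usable values (such as SNOW in this sample) keeps `None` for both entries, and ties keep the earliest date.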


1 Answer
User
#1 · Posted 2024-09-28 22:23:45

You can create a DataFrame from the CSV, which makes the data easier to work with:

import pandas as pd

data = pd.read_csv('path_to_your_filename')  # blank cells are read as NaN
max_precip_index = data['PRCP'].idxmax()  # index label of the max value (NaN is skipped)
min_precip_index = data['PRCP'].idxmin()  # index label of the min value (NaN is skipped)
# Print PRCP and DATE from the rows at those labels.
print(data.loc[max_precip_index, ['PRCP', 'DATE']])
print(data.loc[min_precip_index, ['PRCP', 'DATE']])
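
The same idea can be extended to all five columns from the question. This is a rough sketch under a few assumptions, not a definitive implementation: `extremes`, `COLUMNS`, and the inline `SAMPLE` are hypothetical names used for illustration, blank cells are treated as missing while zeros are kept, and a column with no usable values is reported as `None`.

```python
import io

import pandas as pd

# Hypothetical inline sample (rows copied from the question) so the sketch runs as-is.
SAMPLE = '''\
"STATION","NAME","DATE","PRCP","SNOW","SNWD","TAVG","TMAX","TMIN"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-01",,,,"46","56","35"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-03","0.01",,,"58","66","50"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-04","0.00",,,"62","71","53"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-07","0.00",,,"78","85","70"
'''

COLUMNS = ["PRCP", "SNOW", "TAVG", "TMAX", "TMIN"]


def extremes(data, columns=COLUMNS):
    """Return {column: {"max": (value, date), "min": (value, date)}} or None per column."""
    results = {}
    for col in columns:
        # Coerce anything non-numeric to NaN so it is ignored below.
        values = pd.to_numeric(data[col], errors="coerce")
        if not values.notna().any():
            results[col] = None  # nothing usable in this column
            continue
        hi, lo = values.idxmax(), values.idxmin()  # labels of the extreme rows
        results[col] = {
            "max": (float(values.loc[hi]), data.loc[hi, "DATE"]),
            "min": (float(values.loc[lo]), data.loc[lo, "DATE"]),
        }
    return results


# For a real file, pass its path to pd.read_csv instead of io.StringIO(SAMPLE).
data = pd.read_csv(io.StringIO(SAMPLE))
results = extremes(data)
for col, entry in results.items():
    print(col, entry)
```

Checking for an all-missing column before calling `idxmax` avoids depending on how a given pandas version handles that case. When several rows tie for the extreme value, `idxmax` and `idxmin` return the first one.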
