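I'm working with a large weather dataset and trying to find the maximum and minimum of five parameters: precipitation, snowfall, average temperature, maximum temperature and minimum temperature. My goal is to find the highest and lowest value of each and print that number along with the date it occurred, but the dataset is noisy, with many blank or zero values. I have managed to convert the blank values to NoneType, but I don't want to include them in any of my calculations.
At the moment I am trying to get the maximum for several of the columns: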
def High(value):
    if value == row['PRCP']:
        maxPrec = max(row['PRCP'] for row in reader)
        print(row['DATE'])
        return maxPrec
    elif value == row['SNOW']:
        maxSnow = max(row['SNOW'] for row in reader)
        print(row['DATE'])
        return maxSnow
    elif value == row['TAVG']:
        maxTavg = max(row['TAVG'] for row in reader)
        return maxTavg
    elif value == row['TMIN']:
        maxTmin = max(row['TMIN'] for row in reader)
        print(row['DATE'])
        return maxTmin
    elif value == row['TMAX']:
        maxTmax = max(row['TMAX'] for row in reader)
        return maxTmax

print(High(row['PRCP']))
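If I call my function again for another variable, I get an invalid type error. I also don't know how to print the associated date. Help with either of these problems would be much appreciated.
A sample of the file looks like this: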
"STATION","NAME","DATE","PRCP","SNOW","SNWD","TAVG","TMAX","TMIN"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-01",,,,"46","56","35"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-02",,,,"52","59","45"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-03","0.01",,,"58","66","50"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-04","0.00",,,"62","71","53"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-05","0.00",,,"63","75","51"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-06","0.00",,,"73","82","63"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-07","0.00",,,"78","85","70"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-08","0.00",,,"75","83","67"
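I'm having a hard time finding good resources on parsing large datasets, so if you can suggest any websites that would point me in the right direction, that would also be very helpful.
Thanks in advance.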
You can create a DataFrame from the CSV, which makes the data easier to work with.
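As a rough sketch of that idea, assuming pandas is installed: the snippet below loads the sample rows from the question, drops blank cells per column, and looks up the date of each maximum and minimum. The `extremes` helper, the `SAMPLE` string and the `COLUMNS` list are hypothetical names made up for this illustration; with a real file you would pass its path to `pd.read_csv` instead of the in-memory string.

```python
import io

import pandas as pd

# Sample rows copied from the question; with a real file, pass its path to read_csv.
SAMPLE = '''"STATION","NAME","DATE","PRCP","SNOW","SNWD","TAVG","TMAX","TMIN"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-01",,,,"46","56","35"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-02",,,,"52","59","45"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-03","0.01",,,"58","66","50"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-04","0.00",,,"62","71","53"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-05","0.00",,,"63","75","51"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-06","0.00",,,"73","82","63"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-07","0.00",,,"78","85","70"
"USW00003894","CLARKSVILLE OUTLAW AIRPORT, TN US","2001-04-08","0.00",,,"75","83","67"
'''

# The five parameters named in the question.
COLUMNS = ["PRCP", "SNOW", "TAVG", "TMAX", "TMIN"]


def extremes(df, columns=COLUMNS):
    """Hypothetical helper: map each column to its (value, date) max and min."""
    result = {}
    for col in columns:
        # Blank or unparseable cells become NaN and are dropped before comparing.
        values = pd.to_numeric(df[col], errors="coerce").dropna()
        if values.empty:
            result[col] = None  # the column has no usable readings at all
            continue
        hi, lo = values.idxmax(), values.idxmin()  # row labels of the extremes
        result[col] = {
            "max": (float(values.loc[hi]), df.loc[hi, "DATE"]),
            "min": (float(values.loc[lo]), df.loc[lo, "DATE"]),
        }
    return result


df = pd.read_csv(io.StringIO(SAMPLE))
summary = extremes(df)
for col, stats in summary.items():
    print(col, stats)
```

Because the whole file sits in the DataFrame, every column can be queried as often as needed. If `reader` in the question is a `csv.DictReader` (an assumption, since its definition is not shown), it is an iterator that is used up by the first `max(...)` call, which would explain why a second call on another column fails. Zero readings are kept here as real measurements; if zeros should be treated as missing too, they would need to be filtered out before taking the minimum.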