How do I find the maximum value in a DataFrame column?

Posted 2024-06-26 12:53:52


I have a DataFrame that looks like this:

Date                           Location          NO2
2017-11-24 23:00:00             toronto          0.038
2017-11-24 22:00:00             toronto          0.031
2017-11-24 21:00:00             toronto          0.025
2017-11-24 20:00:00             toronto          0.033
2017-11-24 19:00:00             toronto          0.026
2017-11-24 18:00:00             toronto          0.021
2017-11-24 17:00:00             toronto          0.017

Readings are recorded every hour, 24 hours a day, every day of the week. How can I find the highest NO2 value over this period?


3 answers

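Try this (np is numpy). Converting the column to a NumPy array first guarantees that argmax returns a position, which is what iloc expects:

df.iloc[np.argmax(df['NO2'].to_numpy())]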
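For reference, here is a minimal self-contained sketch of that positional lookup. The frame is rebuilt from the first few rows in the question, and the names pos and row are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2017-11-24 23:00:00",
        "2017-11-24 22:00:00",
        "2017-11-24 21:00:00",
        "2017-11-24 20:00:00",
    ]),
    "Location": ["toronto"] * 4,
    "NO2": [0.038, 0.031, 0.025, 0.033],
})

# Position (not label) of the largest NO2 reading
pos = int(np.argmax(df["NO2"].to_numpy()))

# Full row at that position, so the date and location come along
row = df.iloc[pos]
print(row["Date"], row["NO2"])
```

If only the value is needed, df["NO2"].max() is enough; the lookup above is useful when the rest of the row matters too.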

You can use np.where().

Load the data:

import sys
import numpy as np
import pandas as pd

# StringIO lives in a different module on Python 2
if sys.version_info[0] < 3:
    from StringIO import StringIO
else:
    from io import StringIO

data = StringIO('''Date,Location,NO2
2017-11-24 23:00:00,toronto,0.038
2017-11-24 22:00:00,toronto,0.031
2017-11-24 21:00:00,toronto,0.025
2017-11-24 20:00:00,toronto,0.033
2017-11-24 19:00:00,toronto,0.026
2017-11-24 18:00:00,toronto,0.021
2017-11-24 17:00:00,toronto,0.017''')

df = pd.read_csv(data, sep=',')

Use np.where() to find the index of the row whose NO2 value equals the maximum:

max_time = df.loc[np.where(df.NO2.values == df.NO2.max())[0], 'Date'].values[0]
print('Max time:',max_time)
print('Max NO2:',df.NO2.max())
Max time: 2017-11-24 23:00:00
Max NO2: 0.038
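One property of np.where() worth noting is that it returns every matching position, so tied maxima are all reported. A small sketch of that behaviour follows; the second 0.038 reading is made up here purely to create a tie and is not part of the original data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Date": ["2017-11-24 23:00:00", "2017-11-24 22:00:00", "2017-11-24 21:00:00"],
    # Hypothetical values: the second reading is altered to tie with the first
    "NO2": [0.038, 0.038, 0.025],
})

# Positions of all rows whose NO2 equals the column maximum
positions = np.where(df["NO2"].to_numpy() == df["NO2"].max())[0]

# Select those rows by position and collect their timestamps
tied = df.iloc[positions]
tied_dates = tied["Date"].tolist()
print(tied_dates)
```

Taking .values[0] afterwards, as in the answer, keeps only the first of the tied rows.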

You can create a time series with a DatetimeIndex, then use idxmax for the date of the maximum NO2 and max for the maximum value:

s = df.set_index('Date')['NO2']

print (s.idxmax())
2017-11-24 23:00:00

print (s.max())
0.038
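A self-contained sketch of this approach, assuming the data arrives as CSV text. Parsing the Date column with parse_dates is an addition here so that the index is a real DatetimeIndex; the names peak_time and peak_value are illustrative.

```python
from io import StringIO

import pandas as pd

csv_text = """Date,Location,NO2
2017-11-24 23:00:00,toronto,0.038
2017-11-24 22:00:00,toronto,0.031
2017-11-24 21:00:00,toronto,0.025
"""

# parse_dates turns the Date strings into datetime64 values
df = pd.read_csv(StringIO(csv_text), parse_dates=["Date"])

# Series of NO2 readings indexed by timestamp
s = df.set_index("Date")["NO2"]

peak_time = s.idxmax()   # index label (timestamp) of the largest value
peak_value = s.max()     # the largest value itself
print(peak_time, peak_value)
```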

If you need the date and time of the maximum for each day:

print (df)
                 Date Location    NO2
0 2017-11-24 23:00:00  toronto  0.038
1 2017-11-24 22:00:00  toronto  0.031
2 2017-11-24 21:00:00  toronto  0.025
3 2017-11-25 20:00:00  toronto  0.033
4 2017-11-25 19:00:00  toronto  0.026
5 2017-11-26 18:00:00  toronto  0.021
6 2017-11-26 17:00:00  toronto  0.017

df1 = df.set_index('Date').groupby(pd.Grouper(freq='24H'))['NO2'].idxmax().reset_index()
print (df1)
        Date                 NO2
0 2017-11-24 2017-11-24 23:00:00
1 2017-11-25 2017-11-25 20:00:00
2 2017-11-26 2017-11-26 18:00:00

df2 = (df.set_index('Date')
         .groupby(pd.Grouper(freq='24H'))['NO2']
         .agg([('maxdate','idxmax'),('maxval','max')]))
print (df2)
                       maxdate  maxval
Date                                  
2017-11-24 2017-11-24 23:00:00   0.038
2017-11-25 2017-11-25 20:00:00   0.033
2017-11-26 2017-11-26 18:00:00   0.021
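A possible alternative, sketched here rather than taken from the answer, is to group on the calendar day and use the idxmax labels to pull the complete rows, which keeps the Location column as well. It assumes Date is already a datetime column and uses the same seven sample rows as above.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2017-11-24 23:00:00", "2017-11-24 22:00:00", "2017-11-24 21:00:00",
        "2017-11-25 20:00:00", "2017-11-25 19:00:00",
        "2017-11-26 18:00:00", "2017-11-26 17:00:00",
    ]),
    "Location": ["toronto"] * 7,
    "NO2": [0.038, 0.031, 0.025, 0.033, 0.026, 0.021, 0.017],
})

# Midnight of each reading's day serves as the grouping key
day = df["Date"].dt.normalize()

# idxmax gives, per day, the row label of the highest NO2 reading
peak_labels = df.groupby(day)["NO2"].idxmax()

# Look those labels up to get the full rows
daily_peaks = df.loc[peak_labels]
print(daily_peaks)
```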

Or, if you need the time of day with the highest mean NO2:

print (df)
                 Date Location    NO2
0 2017-11-24 23:00:00  toronto  0.038
1 2017-11-24 22:00:00  toronto  0.031
2 2017-11-24 21:00:00  toronto  0.025
3 2017-11-25 20:00:00  toronto  0.033
4 2017-11-25 21:00:00  toronto  0.026
5 2017-11-26 21:00:00  toronto  0.021
6 2017-11-26 22:00:00  toronto  0.017

s = (df.groupby(df['Date'].dt.time)['NO2'].mean())
print (s)
Date
20:00:00    0.033
21:00:00    0.024
22:00:00    0.024
23:00:00    0.038
Name: NO2, dtype: float64

print (s.idxmax())
23:00:00

print (s.max())
0.038
