How do I filter rows of a dataframe for values that increase along a datetime column? (Answers)


1 answer

Anonymous, 1 day ago

<p>I ended up writing a custom data handler. In my case there are more variable columns, such as cvxh_len, and the posted solutions do not look back 2 or 3 days, which matters when a higher value sits between lower values. Also, replacing bad values with NaN is better than dropping the rows. My solution is certainly slower, but it works.</p>
<pre><code>import numpy as np
import pandas as pd

# `df` is the dataframe from the question, with the columns
# "filename", "date" and the value columns listed in check_list.
check_list = ["cvxh_len"]  # add as many value columns as needed

for name in df["filename"].unique():  # for every unique filename
    # Rows of this file in date order; the index keeps the original row labels.
    # sort_values returns a new frame, so the result has to be assigned.
    df2 = df[df["filename"] == name].sort_values(by="date")
    for col in check_list:  # columns to check
        for pos, label in enumerate(df2.index):  # every row of this file
            current = df2[col].iloc[pos]
            earlier = df2[col].iloc[:pos]  # values on all earlier dates
            # Comparisons with NaN are False, so missing values are skipped.
            if (earlier > current).any():  # a higher value occurred earlier
                df.loc[label, col] = np.nan  # replace the value with NaN

print(df)
</code></pre>
<p>Output with my data:</p>
<pre><code>     filename  cvxh_len        date
0   118_3.JPG     100.0  2018-12-14
1   118_3.JPG     200.0  2018-12-15
2   118_3.JPG    3000.0  2018-12-16
3   118_3.JPG       NaN  2018-12-17
4   118_3.JPG       NaN  2018-12-18
5    15_7.JPG     200.0  2018-12-14
6    15_7.JPG     400.0  2018-12-15
7    15_7.JPG       NaN  2018-12-16
8    15_7.JPG       NaN  2018-12-17
9    15_7.JPG       NaN  2018-12-18
10  203_4.JPG    5000.0  2018-12-14
11  203_4.JPG    6000.0  2018-12-15
12  203_4.JPG    9000.0  2018-12-16
13  203_4.JPG   11000.0  2018-12-17
14  203_4.JPG   15000.0  2018-12-18
</code></pre>
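Since the answer above notes that the loop-based approach is slow, the same rule (blank out any value that is lower than a value seen on an earlier date for the same file) can probably be expressed without explicit row loops. The sketch below is not part of the original answer. It assumes the column names `filename`, `date` and `cvxh_len` from the answer, while the helper name `mask_decreases` and the `demo` data are made up for illustration. It compares each value against a per-file running maximum computed with `groupby(...).cummax()`.

```python
import pandas as pd


def mask_decreases(df, value_cols, group_col="filename", date_col="date"):
    """Return a copy of df in which, within each group, every value lower
    than a value from an earlier date is replaced with NaN.
    (Hypothetical helper, not from the original answer.)"""
    out = df.sort_values([group_col, date_col]).copy()
    for col in value_cols:
        # Running maximum per group in date order; cummax skips NaN entries.
        running_max = out.groupby(group_col)[col].cummax()
        # Keep a value only if it reaches the running maximum; others become NaN.
        out[col] = out[col].where(out[col] >= running_max)
    return out.sort_index()  # restore the original row order


# Made-up sample data, only to exercise the helper.
demo = pd.DataFrame({
    "filename": ["a.JPG"] * 4 + ["b.JPG"] * 3,
    "date": pd.to_datetime([
        "2018-12-14", "2018-12-15", "2018-12-16", "2018-12-17",
        "2018-12-14", "2018-12-15", "2018-12-16",
    ]),
    "cvxh_len": [100, 200, 150, 180, 5, 3, 6],
})

result = mask_decreases(demo, ["cvxh_len"])
# Rows 2 and 3 (150 and 180 after 200) and row 5 (3 after 5) become NaN.
print(result)
```

A value equal to the earlier maximum is kept, which matches the strict "lower than" comparison in the loop version. The function returns a new frame and leaves the input untouched, so the result can be checked against the slower version before replacing it.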
