Iterating over DataFrame rows faster

ctqparam = []
wwy = []
ww = []
for index, row in df.iterrows():
    date = str(row['Event_Start_Time'])
    day = int(date[8] + date[9])
    month = int(date[5] + date[6])
    total = 0
    for i in range(0, month-1):
        total += months[i]
    total += day
    out = total // 7
    ww += [out]
    wwy += [str(date[0] + date[1] + date[2] + date[3])]
    val = str(row['TPRev'])
    out = ""
    for letter in val:
        if letter != '.':
            out += letter
    df.replace(to_replace=row['TPRev'], value=str(out), inplace = True)
    val = str(row['Subtest'])
    if val in ctqparam_dict.keys():
        ctqparam += [ctqparam_dict[val]]

# add WWY column, WW column, and correct data format of Test_Tape column
df.insert(0, column='Work_Week_Year', value = wwy)
df.insert(3, column='Work_Week', value = ww)
df.insert(4, column='ctqparam', value = ctqparam)

1 answer

Community user

#1 · Posted 2024-10-02 12:27:25

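It is hard to tell exactly what you are trying to do. However, if you are looping over rows, there is very likely a better way.

For example, given a csv file that looks like this: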

Event_Start_Time,TPRev,Subtest
4/12/19 06:00,"this. string. has dots.. in it.",{'A_Dict':'maybe?'}
6/10/19 04:27,"another stri.ng wi.th d.ots.",{'A_Dict':'aVal'}

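You might want to:

Parse Event_Start_Time as a datetime
Get the week number from Event_Start_Time
Remove all dots (.) from the strings in the TPRev column
Expand the dictionary contained in Subtest into its own column

Instead of looping over rows, think about operating on whole columns. Picture it as doing the operation on the first "cell" of a column and having it copied down to every other row.

Code: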

import pandas as pd

df = pd.read_csv('data.csv')

print(df)

     Event_Start_Time    TPRev                              Subtest
0    4/12/19 06:00       this. string. has dots.. in it.    {'A_Dict':'maybe?'}
1    6/10/19 04:27       another stri.ng wi.th d.ots.       {'A_Dict':'aVal'}


# parse 'Event_Start_Time' as datetime (the sample data is day/month/year)
df['Event_Start_Time'] = pd.to_datetime(df['Event_Start_Time'], format='%d/%m/%y %H:%M')

# get the week number from 'Event_Start_Time'
df['Week_Number'] = df['Event_Start_Time'].dt.isocalendar().week

# replace all '.' (periods) in the 'TPRev' column
df['TPRev'] = df['TPRev'].str.replace('.', '', regex=False)

# get a dictionary string out of column 'Subtest' and put into a new column
df = pd.concat([df.drop(['Subtest'], axis=1), df['Subtest'].map(eval).apply(pd.Series)], axis=1)

print(df)

     Event_Start_Time      TPRev                       Week_Number    A_Dict
0    2019-12-04 06:00:00   this string has dots in it  49             maybe?
1    2019-10-06 04:27:00   another string with dots    40             aVal


print(df.info())

Data columns (total 4 columns):
 #   Column            Non-Null Count  Dtype         
---  ------            --------------  -----         
 0   Event_Start_Time  2 non-null      datetime64[ns]
 1   TPRev             2 non-null      object        
 2   Week_Number       2 non-null      UInt32        
 3   A_Dict            2 non-null      object        
dtypes: UInt32(1), datetime64[ns](1), object(2)

So you end up with a DataFrame like this:

     Event_Start_Time      TPRev                       Week_Number    A_Dict
0    2019-12-04 06:00:00   this string has dots in it  49             maybe?
1    2019-10-06 04:27:00   another string with dots    40             aVal
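
One note on the Subtest step: `eval` executes whatever text is in the column, which is risky if the file comes from an untrusted source. A possible alternative, sketched here on a small made-up frame (the two rows mirror the sample file, and the `parsed` and `expanded` names are only for illustration), is `ast.literal_eval`, which accepts Python literals only:

```python
import ast
import pandas as pd

# Made-up frame mirroring the sample csv's 'Subtest' column
df = pd.DataFrame({
    "TPRev": ["this string has dots in it", "another string with dots"],
    "Subtest": ["{'A_Dict':'maybe?'}", "{'A_Dict':'aVal'}"],
})

# literal_eval parses literals (dicts, lists, strings, numbers) and
# raises an error on anything else instead of executing it
parsed = df["Subtest"].map(ast.literal_eval)

# One column per dictionary key, aligned on the original index
expanded = pd.DataFrame(parsed.tolist(), index=df.index)

df = pd.concat([df.drop(columns="Subtest"), expanded], axis=1)
print(df)
```

Building the new columns with `pd.DataFrame(parsed.tolist(), ...)` also avoids creating one `pd.Series` per row, which tends to matter on larger frames.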

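Obviously, you may want to do other things. Look at your data. Make a list of what you want to do to each column, or which new columns you need. Don't worry about how yet: there is a very good chance it has been done before, and you only need to find the existing method.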
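
Applied to the loop in the question, such a list might come down to four column-wise operations. This is only a sketch under several assumptions: that `months` in the question holds the number of days in each month (so the loop computes day of year // 7), that `Event_Start_Time` is laid out as `YYYY-MM-DD ...` as the string slicing implies, and that `ctqparam_dict` maps Subtest values to labels. The sample rows and dictionary entries below are invented for illustration.

```python
import pandas as pd

# Invented sample data; the column names come from the question
df = pd.DataFrame({
    "Event_Start_Time": ["2019-04-12 06:00:00", "2019-06-10 04:27:00"],
    "TPRev": ["1.2.3", "4.5"],
    "Subtest": ["sub_a", "sub_b"],
})
ctqparam_dict = {"sub_a": "CTQ_A", "sub_b": "CTQ_B"}  # hypothetical mapping

ts = pd.to_datetime(df["Event_Start_Time"])

# Year as a string, like str(date[0:4]) in the loop
df["Work_Week_Year"] = ts.dt.year.astype(str)

# Day of year // 7, like summing months[i] and adding the day
df["Work_Week"] = ts.dt.dayofyear // 7

# Strip literal dots from every value in the column at once
df["TPRev"] = df["TPRev"].str.replace(".", "", regex=False)

# Dictionary lookup per value; keys missing from the dict give NaN
df["ctqparam"] = df["Subtest"].map(ctqparam_dict)

print(df)
```

Two differences from the loop are worth knowing: `dt.dayofyear` accounts for leap years, which a fixed `months` list may not, and `map` leaves NaN for a Subtest that is not in the dictionary, whereas the loop skips it and can end up with a list shorter than the DataFrame.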

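You might write down something like "get the difference in days between the current row and the row below". Then search for how to do the formatting or calculation you need. Break the problem down.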
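
For the "difference in days between the current row and the row below" item, a search would probably lead to `Series.shift`. A minimal sketch on invented timestamps (the `Days_To_Next` column name is made up):

```python
import pandas as pd

# Invented timestamps, already parsed as datetimes
df = pd.DataFrame({
    "Event_Start_Time": pd.to_datetime(
        ["2019-10-06 04:27", "2019-12-04 06:00", "2019-12-10 06:00"]
    )
})

# shift(-1) moves every value up one row, so each row sees the row below it
next_start = df["Event_Start_Time"].shift(-1)

# Subtracting datetimes gives timedeltas; .dt.days keeps the whole days.
# The last row has no row below it, so its result is missing (NaN).
df["Days_To_Next"] = (next_start - df["Event_Start_Time"]).dt.days

print(df)
```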
