Using a loop to identify all instances in a dataframe based on a specific column value

Published 2024-09-30 16:34:36


I have the following dataframe:

     teamId  matchId matchPeriod  eventSec  eventId     eventName
190    8516  5237840          1H     721.2        5  Interruption
191    8516  5237840          1H     723.4        3     Free Kick
192    8516  5237840          1H     725.7        8          Pass
193    8516  5237840          1H     727.2        8          Pass
194    8516  5237840          1H     728.5       10          Shot

This continues for about 1000 rows.

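I want to identify every instance of "Shot", then slice out that row together with the 4 rows before it, creating a sequence that I can process.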

Can anyone help?


2 answers

Try the following code, where dta is your dataframe:

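# Index labels of the rows where eventName is 'Shot'
index = dta[dta['eventName'] == 'Shot'].index

# Collect each Shot label and the 4 labels before it
# (assumes a consecutive integer index, as in the question)
result = set()
for i in range(5):
    result.update(index - i)

# Keep only the rows whose label was collected
sub = dta[dta.index.isin(result)]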

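First, it selects the index labels of the rows whose 'eventName' column is 'Shot'. Then it builds a set, looping to add each of those labels along with the labels of the 4 rows before it.

Finally, it selects the rows whose index labels were collected.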
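
The code above returns all matching rows as one dataframe, so windows that overlap are merged. If each "Shot" should get its own sequence, one possible sketch is below. The helper name shot_windows, its parameters, and the sample values are made up for illustration; it works on row positions rather than index labels, so it does not rely on the index being consecutive.

```python
import numpy as np
import pandas as pd


def shot_windows(df, column="eventName", value="Shot", n_before=4):
    """Return one DataFrame per matching row: that row plus up to n_before rows before it."""
    # Row positions (not index labels) of the matching rows.
    positions = np.flatnonzero(df[column].to_numpy() == value)
    # Clamp the start at 0 so a match near the top yields a shorter window.
    return [df.iloc[max(pos - n_before, 0): pos + 1] for pos in positions]


# Small made-up frame in the shape of the question's data (hypothetical values).
dta = pd.DataFrame(
    {
        "eventSec": [721.2, 723.4, 725.7, 727.2, 728.5, 730.0, 731.1],
        "eventName": ["Interruption", "Free Kick", "Pass", "Pass", "Shot", "Pass", "Shot"],
    },
    index=range(190, 197),
)

windows = shot_windows(dta)
for window in windows:
    print(window, end="\n\n")
```

Each element of windows is an ordinary dataframe slice, so it can be processed on its own or combined later with pd.concat. Two shots that are close together produce overlapping windows, which may or may not be what you want.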

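It looks like you want to slice out the four rows before each "Shot". You can use the index values to find where "Shot" occurs, then slice the dataframe based on those index values.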

Load the data into a dataframe:

import pandas as pd
from tabulate import tabulate

# Named "data" rather than "dict" to avoid shadowing the built-in type
data = {
    "teamid": [190,191,192,108,190,190,191,192,108,190,190,191,192,108,190,190,191,192,108,190],
    "eventId": [5,2,4,5,6,5,2,4,5,6,5,2,4,5,6,5,2,4,5,6],
    "eventname": ['hello','Free Kick','Pass','Pass','Shot','Interruption','Free Kick','Pass','Pass','Shot','Interruption','Free Kick','Pass','Pass','Shot','Interruption','Free Kick','Pass','Pass','Shot']
}
df = pd.DataFrame(data=data)
print(tabulate(df, headers = 'keys', tablefmt = 'psql'))

Then slice the data and perform your task:

# Find the index values where "Shot" appears.
index_values = df[df['eventname'] == 'Shot'].index
# Prepend -1 so the first slice starts at row 0.
index_values = index_values.insert(0, -1)
# Slice the data: each slice holds the rows between two consecutive Shots.
for i in range(len(index_values) - 1):
    # Perform your task here. The slice stops just before the Shot row;
    # use index_values[i + 1] + 1 as the stop to include it.
    print(tabulate(df.iloc[index_values[i] + 1:index_values[i + 1]], headers='keys', tablefmt='psql'))
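
As a loop-free variation on the same idea, the rows can be labelled with a running sequence number and grouped. This is only a sketch on made-up data, not the answer's own method: the names is_shot, sequence_id, and sequences are hypothetical, and tail(5) assumes you want the "Shot" row plus at most 4 rows before it.

```python
import pandas as pd

# Made-up events; the column name follows this answer ("eventname").
df = pd.DataFrame(
    {
        "eventname": [
            "Interruption", "Free Kick", "Pass", "Pass", "Shot",
            "Pass", "Shot",
            "Free Kick", "Pass",
        ]
    }
)

is_shot = df["eventname"] == "Shot"
# A new sequence starts on the row after each Shot, so shift the flag down
# by one row before taking the running total.
sequence_id = is_shot.shift(1, fill_value=False).cumsum()

sequences = [
    group.tail(5)  # the Shot row plus at most 4 rows before it
    for _, group in df.groupby(sequence_id)
    if group["eventname"].iloc[-1] == "Shot"  # skip a trailing run with no Shot
]

for seq in sequences:
    print(seq, end="\n\n")
```

Because every row belongs to exactly one group, these sequences never overlap: a sequence stops at the previous "Shot" even if fewer than 4 rows lie in between.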
