Pandas: checking values in adjacent columns

Posted 2024-10-06 13:35:11


I have a df that tracks the status of issues, going from "Open" through "In Progress" to "Closed", as shown below:

        T1          T2           T3     T4      T5 
1      Open        In Progress Closed
2      In Progress Closed
3      Open        In Progress Open    Closed
4      Open        In Progress Closed  Open   Closed
5      Open        In Progress Closed

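Basically, I want to find all issues that were reopened. They show up as any row that has a Closed value followed by a later transition. For example, index 4 has a Closed value in T3, but the next column contains something, which indicates that the issue was reopened.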

The output should flag the rows that were reopened; in this sample that is only row 4.

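In the actual df, the columns range from T1 to T25 and there are 50k rows.

So basically I need to check every column, find out whether Closed is present, and then check whether the next column is not empty.

Thanks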


1 answer
User
#1 · Posted 2024-10-06 13:35:11

I think you need:

df['Reopened'] = ((df == 'Open') & (df.shift(axis=1) == 'Closed')).any(axis=1).astype(int)
print(df)
            T1           T2      T3      T4      T5  Reopened
1         Open  In Progress  Closed     NaN     NaN         0
2  In Progress       Closed     NaN     NaN     NaN         0
3         Open  In Progress    Open  Closed     NaN         0
4         Open  In Progress  Closed    Open  Closed         1
5         Open  In Progress  Closed     NaN     NaN         0
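To try the line above end to end, the sample frame has to exist first. The following is a minimal sketch that rebuilds it by hand; the construction is my assumption (the question only shows the printed table), with empty cells stored as NaN. The name `reopened` is only used for illustration.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; empty cells are assumed to be NaN.
df = pd.DataFrame(
    {
        "T1": ["Open", "In Progress", "Open", "Open", "Open"],
        "T2": ["In Progress", "Closed", "In Progress", "In Progress", "In Progress"],
        "T3": ["Closed", np.nan, "Open", "Closed", "Closed"],
        "T4": [np.nan, np.nan, "Closed", "Open", np.nan],
        "T5": [np.nan, np.nan, np.nan, "Closed", np.nan],
    },
    index=[1, 2, 3, 4, 5],
)

# A cell counts as a reopening when it is Open and its left neighbour is Closed.
reopened = ((df == "Open") & (df.shift(axis=1) == "Closed")).any(axis=1).astype(int)
print(reopened.tolist())
```

Only row 4 should come out as 1, matching the table above.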

Details

Check every Open value:

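print(df == 'Open')
      T1     T2     T3     T4     T5
1   True  False  False  False  False
2  False  False  False  False  False
3   True  False   True  False  False
4   True  False  False   True  False
5   True  False  False  False  False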

Use the DataFrame shifted one column to the right to check for Closed:

print(df.shift(axis=1))
    T1           T2           T3      T4      T5
1  NaN         Open  In Progress  Closed     NaN
2  NaN  In Progress       Closed     NaN     NaN
3  NaN         Open  In Progress    Open  Closed
4  NaN         Open  In Progress  Closed    Open
5  NaN         Open  In Progress  Closed     NaN

print(df.shift(axis=1) == 'Closed')
      T1     T2     T3     T4     T5
1  False  False  False   True  False
2  False  False   True  False  False
3  False  False  False  False   True
4  False  False  False   True  False
5  False  False  False   True  False

Then combine the two masks with & (element-wise AND), and use any(axis=1) to test whether each row has at least one True:

print((df == 'Open') & (df.shift(axis=1) == 'Closed'))
      T1     T2     T3     T4     T5
1  False  False  False  False  False
2  False  False  False  False  False
3  False  False  False  False  False
4  False  False  False   True  False
5  False  False  False  False  False

print(((df == 'Open') & (df.shift(axis=1) == 'Closed')).any(axis=1))
1    False
2    False
3    False
4     True
5    False
dtype: bool
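If the goal is to list the reopened issues rather than flag them, the boolean Series can be used directly as a row filter. A small sketch, assuming the same sample data as in the question (rebuilt here so the snippet runs on its own; `mask` and `reopened_rows` are illustrative names):

```python
import numpy as np
import pandas as pd

# Same sample as the question, built row by row; empty cells assumed to be NaN.
df = pd.DataFrame(
    [
        ["Open", "In Progress", "Closed", np.nan, np.nan],
        ["In Progress", "Closed", np.nan, np.nan, np.nan],
        ["Open", "In Progress", "Open", "Closed", np.nan],
        ["Open", "In Progress", "Closed", "Open", "Closed"],
        ["Open", "In Progress", "Closed", np.nan, np.nan],
    ],
    index=[1, 2, 3, 4, 5],
    columns=["T1", "T2", "T3", "T4", "T5"],
)

# One boolean per row: True if any Open cell directly follows a Closed cell.
mask = ((df == "Open") & (df.shift(axis=1) == "Closed")).any(axis=1)

# Boolean indexing keeps only the rows where the mask is True.
reopened_rows = df[mask]
print(reopened_rows.index.tolist())
```

With the sample data this should leave only the row with index 4.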

Finally, convert the boolean mask to integers with astype and assign it to a new column:

df['Reopened'] = ((df == 'Open') & (df.shift(axis=1) == 'Closed')).any(axis=1).astype(int)
print(df)
            T1           T2      T3      T4      T5  Reopened
1         Open  In Progress  Closed     NaN     NaN         0
2  In Progress       Closed     NaN     NaN     NaN         0
3         Open  In Progress    Open  Closed     NaN         0
4         Open  In Progress  Closed    Open  Closed         1
5         Open  In Progress  Closed     NaN     NaN         0
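The check above only counts an Open that comes right after a Closed. The question also describes the looser rule "Closed followed by anything non-empty"; a possible sketch of that variant swaps the Open test for notna(). The helper name `flag_reopened` is hypothetical, and row 5 of the sample is deliberately changed here (Closed followed by In Progress) to show where the two rules differ. Restricting the check to the T columns with filter(regex=...) is my assumption about the column naming; it keeps the Reopened column out of the scan if the line is run twice.

```python
import numpy as np
import pandas as pd


def flag_reopened(frame: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: 1 if any Closed cell is followed by a non-empty cell."""
    closed_on_left = frame.shift(axis=1) == "Closed"
    return (closed_on_left & frame.notna()).any(axis=1).astype(int)


# Sample from the question, except row 5: Closed is followed by In Progress.
df = pd.DataFrame(
    [
        ["Open", "In Progress", "Closed", np.nan, np.nan],
        ["In Progress", "Closed", np.nan, np.nan, np.nan],
        ["Open", "In Progress", "Open", "Closed", np.nan],
        ["Open", "In Progress", "Closed", "Open", "Closed"],
        ["Open", "In Progress", "Closed", "In Progress", np.nan],
    ],
    index=[1, 2, 3, 4, 5],
    columns=["T1", "T2", "T3", "T4", "T5"],
)

# Only look at the status columns T1, T2, ... (assumed naming pattern).
status = df.filter(regex=r"^T\d+$")

# Stricter rule from the answer: Open directly after Closed.
strict = ((status == "Open") & (status.shift(axis=1) == "Closed")).any(axis=1).astype(int)

df["Reopened"] = flag_reopened(status)
print(df["Reopened"].tolist())
print(strict.tolist())
```

Both versions are fully vectorised, so they should be fine for 25 columns and 50k rows; which rule is right depends on whether a Closed issue can move to any status other than Open.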
