How can I subset a DataFrame with a boolean Series over several successive steps without assigning an intermediate DataFrame?

import pandas as pd
from numpy import nan

df = pd.DataFrame({
    'modSeq': {0: 'DC[+57]QNLKLIPQRGVS[-2]EAVE', 1: 'DGALQPPFQEPIVGRE', 2: 'DIAPR[-43]AK', 3: 'DQLALI[+16]WFAYLE', 4: 'DQLALIWFAYLE', 5: 'EC[+57]YGL[+16]KLIPE', 6: 'EDC[+57]QNLK', 7: 'EDC[+57]QNLKLIPQR'},
    'area': {0: 551, 1: 8374246, 2: 416840, 3: 546654, 4: 293998, 5: 189995, 6: 59548, 7: 26552},
    'comments': {0: 'weird, both jump around', 1: 'unmodified', 2: nan, 3: 'both go up! Problems recalculating; 190122', 4: 'unmodified', 5: nan, 6: 'unmodified', 7: 'unmodified; Problems recalculating; 190122'},
})

   modSeq                       area     comments
0  DC[+57]QNLKLIPQRGVS[-2]EAVE  551      weird, both jump around
1  DGALQPPFQEPIVGRE             8374246  unmodified
2  DIAPR[-43]AK                 416840   NaN
3  DQLALI[+16]WFAYLE            546654   both go up! Problems recalculating; 190122
4  DQLALIWFAYLE                 293998   unmodified
5  EC[+57]YGL[+16]KLIPE         189995   NaN
6  EDC[+57]QNLK                 59548    unmodified
7  EDC[+57]QNLKLIPQR            26552    unmodified; Problems recalculating; 190122

   modSeq             area    comments
3  DQLALI[+16]WFAYLE  546654  both go up! Problems recalculating; 190122
7  EDC[+57]QNLKLIPQR  26552   unmodified; Problems recalculating; 190122

1 answer

User

#1 · Posted 2024-05-19 16:35:52

In this particular case, if the problem is that NaN breaks the string accessor, the solution can be as simple as replacing NaN with an empty string:

df[df.comments.fillna('').str.contains('190122')]
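As a small, self-contained sketch of the same idea: the three-row frame below is a cut-down stand-in for the asker's data, and the `na=False` variant is an alternative I am adding, not something from the question. Depending on the pandas version, `.str.contains` on a column with missing values can return NaN for those rows, and such a mask cannot be used for boolean indexing. Both options below avoid that.

```python
import numpy as np
import pandas as pd

# Cut-down illustrative frame (not the full data from the question).
df = pd.DataFrame({
    "modSeq": ["DIAPR[-43]AK", "DQLALI[+16]WFAYLE", "DQLALIWFAYLE"],
    "comments": [
        np.nan,
        "both go up! Problems recalculating; 190122",
        "unmodified",
    ],
})

# Option from the answer: replace NaN with '' before using the string accessor.
via_fillna = df[df.comments.fillna("").str.contains("190122")]

# Alternative: tell str.contains to treat missing values as non-matches.
via_na_false = df[df.comments.str.contains("190122", na=False)]

print(via_fillna)
print(via_na_false)
```

Both expressions select the same single row here. `na=False` keeps the original column untouched, while `fillna('')` makes the intent explicit earlier in the expression.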

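More generally, I don't like intermediate variables either, but my objection is mostly cosmetic. If you want to avoid intermediate variable names because they look untidy, method chaining may be what you are after (let me know if there are other concerns). In this particular case the chained version is uglier, but in general it gives you the two-step code: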

(
    df
    .query("~comments.isna()", engine='python')
    .query("comments.str.contains('190122')", engine='python')
)
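The same chain can also be sketched with `.loc` and a callable, which receives the frame as it stands at that point in the chain, so each step filters the result of the previous one without a named intermediate. This is an alternative spelling rather than the answer's own code. The frame is a cut-down version of the question's data, and the `area` threshold in the third step is a hypothetical extra condition added only to show a longer chain.

```python
import numpy as np
import pandas as pd

# Cut-down illustrative frame based on the question's data.
df = pd.DataFrame({
    "modSeq": [
        "DC[+57]QNLKLIPQRGVS[-2]EAVE",
        "DIAPR[-43]AK",
        "DQLALI[+16]WFAYLE",
        "EDC[+57]QNLKLIPQR",
    ],
    "area": [551, 416840, 546654, 26552],
    "comments": [
        "weird, both jump around",
        np.nan,
        "both go up! Problems recalculating; 190122",
        "unmodified; Problems recalculating; 190122",
    ],
})

result = (
    df
    # Step 1: drop rows with a missing comment.
    .loc[lambda d: d.comments.notna()]
    # Step 2: the callable sees the already-filtered frame, so no NaN remains.
    .loc[lambda d: d.comments.str.contains("190122")]
    # Step 3 (hypothetical extra condition): keep only large areas.
    .loc[lambda d: d.area > 100_000]
)

print(result)
```

Compared with `query`, the lambdas are plain Python, so they need no `engine='python'` argument and no quoting of the expression inside a string.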
