Pandas: using tldextract to merge the last 2 comma-separated items into one cell

#First 5 rows for testing purposes
df = pd.DataFrame(request['destinationhostname'].iloc[0:5])

               destinationhostname
0          pod51042psh.outlook.com
1                   s.mrmserve.com
2  client-office365-tas.msedge.net
3                      otf.msn.com
4                log.pinterest.com

#Applying tld extract on destinationhostname column
df['req'] = request.destinationhostname.apply(tldextract.extract)

               destinationhostname                                  req
0          pod51042psh.outlook.com          (pod51042psh, outlook, com)
1                   s.mrmserve.com                   (s, mrmserve, com)
2  client-office365-tas.msedge.net  (client-office365-tas, msedge, net)
3                      otf.msn.com                      (otf, msn, com)
4                log.pinterest.com                (log, pinterest, com)

Desired result, with the last two items of req joined into a new fld column:

               destinationhostname                                  req            fld
0          pod51042psh.outlook.com          (pod51042psh, outlook, com)    outlook.com
1                   s.mrmserve.com                   (s, mrmserve, com)   mrmserve.com
2  client-office365-tas.msedge.net  (client-office365-tas, msedge, net)     msedge.net
3                      otf.msn.com                      (otf, msn, com)        msn.com
4                log.pinterest.com                (log, pinterest, com)  pinterest.com

1 answer

User

#1 · Posted 2024-09-29 19:20:27

Slice each tuple through the str accessor, then join the pieces:

df['fld'] = df.req.str[1:].str.join('.')

df

               destinationhostname                                  req            fld
0          pod51042psh.outlook.com          (pod51042psh, outlook, com)    outlook.com
1                   s.mrmserve.com                   (s, mrmserve, com)   mrmserve.com
2  client-office365-tas.msedge.net  (client-office365-tas, msedge, net)     msedge.net
3                      otf.msn.com                      (otf, msn, com)        msn.com
4                log.pinterest.com                (log, pinterest, com)  pinterest.com
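
For intuition, the .str[1:].str.join('.') chain does roughly the following to each cell. This is only a plain-Python sketch: the tuples are hand-written stand-ins for the req values above (so it runs without tldextract), and join_tail is a hypothetical helper name, not part of pandas or tldextract.

```python
# Hand-written stand-ins for the (subdomain, domain, suffix) tuples in df.req.
rows = [
    ("pod51042psh", "outlook", "com"),
    ("s", "mrmserve", "com"),
    ("client-office365-tas", "msedge", "net"),
]


def join_tail(parts, start=1):
    """Drop the items before `start` and join the remaining ones with dots."""
    return ".".join(parts[start:])


# One joined string per row, mirroring what ends up in the fld column.
fld = [join_tail(parts) for parts in rows]
print(fld)
```

The .str accessor applies the same slice and join to every element of the column, so no explicit loop is needed in pandas.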

Or, as @coldspeed has shown, you can slice relative to the end of the tuple:

df['fld'] = df.req.str[-2:].str.join('.')
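
The two slices give the same result whenever each tuple has exactly three items, as in the data above. A small sketch of where they would diverge, assuming a hand-written Series in place of df.req; the four-item tuple is made up purely for illustration and is not something the question's data contains:

```python
import pandas as pd

# Hand-written tuples standing in for df.req; the last one has four items
# only to show where the two slices stop agreeing.
req = pd.Series([
    ("otf", "msn", "com"),
    ("log", "pinterest", "com"),
    ("a", "b", "example", "org"),
])

from_start = req.str[1:].str.join(".")   # everything after the first item
from_end = req.str[-2:].str.join(".")    # always the last two items

print(pd.DataFrame({"from_start": from_start, "from_end": from_end}))
```

If the goal is literally "the last two items", slicing from the end states that intent directly and does not depend on the tuple length.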
