How do I create a new column if the text in one column contains a specific string pattern?

Current DataFrame:

+-------+----------------------------+-------------------+-----------------------+
| Index | 0                          | 1                 | 2                     |
+-------+----------------------------+-------------------+-----------------------+
| 0     | Reference Curr             | Daybook / Voucher | Invoice Date Due Date |
| 1     | V50011 Tech Comp           | nan               | Phone:0177222222      |
| 2     | Regis Place                | nan               | Fax:017757575789      |
| 3     | Catenberry                 | nan               | nan                   |
| 4     | Manhattan, NY              | nan               | nan                   |
| 5     | V7484 Pipe                 | nan               | Phone:                |
| 6     | Japan                      | nan               | nan                   |
| 7     | nan                        | nan               | nan                   |
| 8     | 4543.34GBP (British Pound) | nan               | nan                   |
+-------+----------------------------+-------------------+-----------------------+

Desired result:

+-------+----------------------------+-------------------+-----------------------+-------------+
| Index | 0                          | 1                 | 2                     | Company     |
+-------+----------------------------+-------------------+-----------------------+-------------+
| 0     | Reference Curr             | Daybook / Voucher | Invoice Date Due Date | nan         |
| 1     | V50011 Tech                | nan               | Phone:0177222222      | V50011 Tech |
| 2     | Regis Place                | nan               | Fax:017757575789      | nan         |
| 3     | Catenberry                 | nan               | nan                   | nan         |
| 4     | Manhattan, NY              | nan               | nan                   | nan         |
| 5     | V7484 Pipe                 | nan               | Phone:                | V7484 Pipe  |
| 6     | Japan                      | nan               | nan                   | nan         |
| 7     | nan                        | nan               | nan                   | nan         |
| 8     | 4543.34GBP (British Pound) | nan               | nan                   | nan         |
+-------+----------------------------+-------------------+-----------------------+-------------+
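
For anyone who wants to try a solution against this data, the input table can be rebuilt roughly as follows. This is only a sketch: it assumes the column labels are the integers 0, 1 and 2 and that the `nan` cells are real missing values (`np.nan`) rather than the string "nan".

```python
import numpy as np
import pandas as pd

# Rebuild the input table; integer column labels and np.nan cells are assumptions.
df = pd.DataFrame({
    0: ["Reference Curr", "V50011 Tech Comp", "Regis Place", "Catenberry",
        "Manhattan, NY", "V7484 Pipe", "Japan", np.nan,
        "4543.34GBP (British Pound)"],
    1: ["Daybook / Voucher"] + [np.nan] * 8,
    2: ["Invoice Date Due Date", "Phone:0177222222", "Fax:017757575789",
        np.nan, np.nan, "Phone:", np.nan, np.nan, np.nan],
})

print(df)
```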

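3 answers

User

Answer 1 · edited 2024-09-29 22:18:31

One potential solution is a list comprehension. You could probably get a speed boost from some of pandas' built-in functions, but this will get you there: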

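#!/usr/bin/env python

import numpy as np
import pandas as pd

# Small sample frame: column 0 holds the names, column 1 the contact text
df = pd.DataFrame({
    0: ["reference", "v5001 tech comp", "catenberry", "very different"],
    1: ["not", "phone", "other", "text"],
})

# Keep the value from column 0 when it starts with "v" and column 1 mentions "phone"
df["new_column"] = [
    x if x.lower().startswith("v") and "phone" in y.lower() else np.nan
    for x, y in df.loc[:, [0, 1]].values
]

print(df)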

which produces:

                 0      1       new_column
0        reference    not              NaN
1  v5001 tech comp  phone  v5001 tech comp
2       catenberry  other              NaN
3   very different   text              NaN

All I did was take your two conditions, build a new list from them, and assign that list to your new column.
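
One caveat: the sample above contains only strings, while the data in the question has missing cells, and calling string methods on a float NaN raises an error. A minimal sketch of a NaN-tolerant variant follows. The helper name `is_company_row` and the shortened sample frame are made up for illustration, and the column labels 0 and 2 are assumed to be integers.

```python
import numpy as np
import pandas as pd

# Shortened, hypothetical sample with missing cells, loosely based on the question
df = pd.DataFrame({
    0: ["Reference Curr", "V50011 Tech Comp", np.nan, "V7484 Pipe", "Vermont"],
    2: ["Invoice Date", "Phone:0177222222", np.nan, "Phone:", np.nan],
})


def is_company_row(name, contact):
    """Hypothetical helper: True when name starts with 'V' and contact mentions 'Phone'."""
    # Missing cells arrive as float NaN, so check the type before using str methods
    if not isinstance(name, str) or not isinstance(contact, str):
        return False
    return name.startswith("V") and "Phone" in contact


df["Company"] = [
    name if is_company_row(name, contact) else np.nan
    for name, contact in zip(df[0], df[2])
]

print(df)
```

Iterating with `zip` over the two columns avoids building an intermediate array with `.values`, but the result is the same as in the comprehension above.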

User

Answer 2 · edited 2024-09-29 22:18:31

You didn't give us an easy way to test potential solutions, but this should do the job:

df.loc[df[0].str.startswith('V') & df[2].str.contains('Phone'), 'Company'] = df[0]
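
Since the original post has no test data, here is a hedged, self-contained sketch of the same idea on a shortened sample. The sample frame is invented for illustration. Passing `na=False` is an addition to the one-liner above: it makes missing cells count as non-matches, so the mask is strictly boolean. `regex=False` is also an addition, so that 'Phone' is matched as a literal substring.

```python
import numpy as np
import pandas as pd

# Shortened, hypothetical sample; integer column labels are an assumption
df = pd.DataFrame({
    0: ["Reference Curr", "V50011 Tech Comp", "Regis Place", np.nan, "V7484 Pipe"],
    2: ["Invoice Date Due Date", "Phone:0177222222", "Fax:017757575789", np.nan, "Phone:"],
})

# Rows where column 0 starts with 'V' and column 2 contains 'Phone'
mask = (
    df[0].str.startswith("V", na=False)
    & df[2].str.contains("Phone", na=False, regex=False)
)

# Assigning through .loc creates 'Company' and leaves non-matching rows as NaN
df.loc[mask, "Company"] = df.loc[mask, 0]

print(df)
```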

User

Answer 3 · edited 2024-09-29 22:18:31

Here is another way to get the result:

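import numpy as np

# Column labels are strings here ('0', '2'); use df[0] and df[2] if yours are integers.
# na=False treats missing cells as non-matching.
condition1 = df['0'].str.startswith('V', na=False)
condition2 = df['2'].str.contains('Phone', na=False)

df['Company'] = np.where(condition1 & condition2, df['0'], np.nan)
# Optional: keep only the first space-separated token, taken as column 0 of the split
df['Company'] = df['Company'].str.split(' ', expand=True)[0]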
