Replace/remove certain text in pandas data?

Posted 2024-09-29 02:27:46


I have a problem where pandas does not replace certain text correctly:

# Create blank column
csvdata["CTemp"] = ""
# Create a copy of the data in "CDPure"
dcol = csvdata.CDPure
# Fill "CTemp" with the data from "CDPure" and replace and/or remove certain parts
csvdata['CTemp'] = dcol.str.replace(" (AMI)", "").replace(" N/A", "Non")

But when I print the result by running print csvdata[-50:].head(50), nothing has been replaced.

Note: the CSV is quite large, so I have to use pandas.set_option('display.max_columns', 250) to print all of the columns.

Does anyone know how I can get pandas to replace those parts correctly?

Here is what I have tried:

Example CSV:

No,CDPure,Blank
1,Data Test,
2,Test N/A,
3,Data N/A,
4,Test Data,
5,Bla,
5,Stack,
6,Over (AMI),
7,Flow (AMI),
8,Test (AMI),
9,Data,
10,Ryflex (AMI),

Example code:

# Import pandas
import pandas

# Open csv (I have to keep it all as dtype object otherwise I can't do the rest of my script)
csvdata = pandas.read_csv('test.csv', dtype=object)

# Create blank column
csvdata["CTemp"] = ""
# Create a copy of the data in "CDPure"
dcol = csvdata.CDPure
# Fill "CTemp" with the data from "CDPure" and replace and/or remove certain parts
csvdata['CTemp'] = dcol.str.replace(" (AMI)", "").str.replace(" N/A", " Non")

# Print
print(csvdata.head(11))

Output:

    No        CDPure Blank         CTemp
0    1     Data Test   NaN     Data Test
1    2      Test N/A   NaN      Test Non
2    3      Data N/A   NaN      Data Non
3    4     Test Data   NaN     Test Data
4    5           Bla   NaN           Bla
5    5         Stack   NaN         Stack
6    6    Over (AMI)   NaN    Over (AMI)
7    7    Flow (AMI)   NaN    Flow (AMI)
8    8    Test (AMI)   NaN    Test (AMI)
9    9          Data   NaN          Data
10  10  Ryflex (AMI)   NaN  Ryflex (AMI)


1 answer
Answer #1 · posted 2024-09-29 02:27:46

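str.replace interprets its argument as a regular expression, so the parentheses have to be escaped: dcol.str.replace(r" \(AMI\)", "").str.replace(" N/A", "Non"). Unescaped, "(AMI)" is read as a capture group that matches the bare text "AMI", so the pattern " (AMI)" never matches the literal " (AMI)" in the data.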
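A minimal sketch of the escaped version, run against a few rows copied from the question's CSV. The inline csv_text string is only a stand-in for test.csv, and " Non" (with the leading space) follows the question's example code. regex=True is passed explicitly here because, as far as I know, pandas 2.0 changed the default of Series.str.replace to literal matching, so the behaviour described above is the older default.

```python
import io
import pandas

# Inline copy of a few rows from the question's CSV (stand-in for test.csv)
csv_text = """No,CDPure,Blank
2,Test N/A,
6,Over (AMI),
10,Ryflex (AMI),
"""
csvdata = pandas.read_csv(io.StringIO(csv_text), dtype=object)

# Escape the parentheses so they match literally instead of forming a group.
# regex=True makes the regex interpretation explicit on any pandas version
# that supports the keyword.
csvdata["CTemp"] = (
    csvdata["CDPure"]
    .str.replace(r" \(AMI\)", "", regex=True)
    .str.replace(" N/A", " Non", regex=True)
)

print(csvdata[["CDPure", "CTemp"]])
```

With the escaped pattern, the " (AMI)" suffix is stripped and " N/A" becomes " Non".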

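This does not seem to be well documented; the docs mention that split and replace "also take regular expressions", but they do not state explicitly that these methods always interpret their argument as a regular expression.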
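Since the interpretation is easy to miss, one way to avoid relying on the default is to state it in the call. The sketch below assumes a pandas version in which Series.str.replace accepts the regex keyword (newer than the version the question was written against); the small dcol series is made up for illustration and reuses values from the question.

```python
import re
import pandas

# Hypothetical stand-in for the CDPure column
dcol = pandas.Series(["Over (AMI)", "Data N/A", "Bla"], dtype=object)

# Option 1: treat the pattern as a plain string, no escaping needed
literal = dcol.str.replace(" (AMI)", "", regex=False)

# Option 2: keep regex matching but let re.escape handle the metacharacters
escaped = dcol.str.replace(re.escape(" (AMI)"), "", regex=True)

# For contrast: the unescaped pattern as a regex looks for " AMI",
# which does not occur in the data, so nothing changes
unescaped = dcol.str.replace(" (AMI)", "", regex=True)

print(literal.tolist())
print(escaped.tolist())
print(unescaped.tolist())
```

regex=False is the simpler choice when no pattern matching is wanted; re.escape is useful when a literal fragment has to be combined with a larger regular expression.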
