I have a problem where pandas does not replace certain text correctly...
# Create blank column
csvdata["CTemp"] = ""
# Create a copy of the data in "CDPure"
dcol = csvdata.CDPure
# Fill "CTemp" with the data from "CDPure" and replace and/or remove certain parts
csvdata['CTemp'] = dcol.str.replace(" (AMI)", "").replace(" N/A", "Non")
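But when I print the result by running print csvdata[-50:].head(50), nothing has been replaced, as shown below.
Note: the CSV is fairly large, so I have to use pandas.set_option('display.max_columns', 250) to be able to print the above.
Does anyone know how I can get pandas to replace those parts correctly?
Example CSV: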
No,CDPure,Blank
1,Data Test,
2,Test N/A,
3,Data N/A,
4,Test Data,
5,Bla,
5,Stack,
6,Over (AMI),
7,Flow (AMI),
8,Test (AMI),
9,Data,
10,Ryflex (AMI),
Example code:
# Import pandas
import pandas
# Open csv (I have to keep it all as dtype object otherwise I can't do the rest of my script)
csvdata = pandas.read_csv('test.csv', dtype=object)
# Create blank column
csvdata["CTemp"] = ""
# Create a copy of the data in "CDPure"
dcol = csvdata.CDPure
# Fill "CTemp" with the data from "CDPure" and replace and/or remove certain parts
csvdata['CTemp'] = dcol.str.replace(" (AMI)", "").str.replace(" N/A", " Non")
# Print
print csvdata.head(11)
Output:
    No        CDPure  Blank         CTemp
0    1     Data Test    NaN     Data Test
1    2      Test N/A    NaN      Test Non
2    3      Data N/A    NaN      Data Non
3    4     Test Data    NaN     Test Data
4    5           Bla    NaN           Bla
5    5         Stack    NaN         Stack
6    6    Over (AMI)    NaN    Over (AMI)
7    7    Flow (AMI)    NaN    Flow (AMI)
8    8    Test (AMI)    NaN    Test (AMI)
9    9          Data    NaN          Data
10  10  Ryflex (AMI)    NaN  Ryflex (AMI)
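str.replace interprets its argument as a regular expression, so you need to escape the parentheses using dcol.str.replace(r" \(AMI\)", "").str.replace(" N/A", "Non"). This does not appear to be adequately documented; the docs mention that split and replace "take regular expressions, too", but do not make it clear that they always interpret their argument as a regular expression.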
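As a minimal sketch of the escaping issue, the snippet below runs the same replacement on a few values from the example CSV. It assumes a pandas release newer than the one used in the question, one where str.replace accepts a regex keyword; passing it explicitly matters because pandas 2.0 changed the default to regex=False. The variable names (unescaped, escaped, auto_escaped, literal, cleaned) are illustrative only.

```python
import re

import pandas as pd

# Sample values taken from the CDPure column in the question.
dcol = pd.Series(["Data Test", "Test N/A", "Over (AMI)", "Ryflex (AMI)"])

# As a regex, " (AMI)" means: a space, then a capture group matching "AMI".
# The literal parentheses are never matched, so nothing is replaced.
unescaped = dcol.str.replace(" (AMI)", "", regex=True)

# Option 1: escape the parentheses so they are matched literally.
escaped = dcol.str.replace(r" \(AMI\)", "", regex=True)

# Option 2: let re.escape build the escaped pattern.
auto_escaped = dcol.str.replace(re.escape(" (AMI)"), "", regex=True)

# Option 3: turn regex interpretation off and match the plain substring.
literal = dcol.str.replace(" (AMI)", "", regex=False)

# Chain the second replacement, using " Non" as in the question's example code.
cleaned = literal.str.replace(" N/A", " Non", regex=False)

print(cleaned.tolist())
# ['Data Test', 'Test Non', 'Over', 'Ryflex']
```

When the search text is a fixed string rather than a pattern, regex=False is probably the simplest choice, since there is nothing to escape.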