Removing words that contain digits with a regular expression does not work correctly

Posted 2024-09-30 14:18:09


import re
text = """Why is this $[...] when the same product is available for $[...] here?<br />
http://www.amazon.com/VICTOR-FLY-MAGNET-BAIT-REFILL/dp/B00004RBDY<br /><br />
The Victor M380 and M502 traps are unreal, of course -- total fly genocide. 
Pretty stinky, but only right nearby. won't, can't iamwordwith4number 234f  ther was a word withnumber before me"""

sentense1 = re.sub(r"\S*\d+\S*", "", text)  # removes words which has digits in it.
sentense1 = re.sub('[^A-Za-z0-9]+', " ", text)  # removes punctuations.
print(sentense1)

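I am trying to remove words that contain digits. The text above has words such as iamwordwith4number and 234f, and I want to get rid of them. If I comment out the second regex line, it works. I am not sure whether the two calls depend on each other. Could you give me some advice?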


1 answer
User
#1 · Posted 2024-09-30 14:18:09

The second regex call should take the result of the first call (sentense1) as its input, like this:

sentense1 = re.sub('[^A-Za-z0-9]+', " ", sentense1)  # removes punctuations.

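instead of this, which passes the original text again and so discards the result of the first substitution: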

sentense1 = re.sub('[^A-Za-z0-9]+', " ", text)  # removes punctuations.
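For reference, here is a minimal sketch of the two steps chained together, so that each substitution works on the output of the previous one. The function name strip_digit_words and the shortened sample string are just for illustration and do not come from the question; the two patterns are the ones used there.

```python
import re


def strip_digit_words(text):
    """Remove tokens that contain digits, then replace punctuation with spaces."""
    # Step 1: drop every whitespace-delimited token that contains a digit.
    no_digits = re.sub(r"\S*\d+\S*", "", text)
    # Step 2: run the punctuation pass on the result of step 1,
    # not on the original text, so step 1 is not thrown away.
    cleaned = re.sub(r"[^A-Za-z0-9]+", " ", no_digits)
    return cleaned.strip()


# Shortened sample (the last line of the text from the question).
sample = "won't, can't iamwordwith4number 234f  ther was a word withnumber before me"
print(strip_digit_words(sample))
# won t can t ther was a word withnumber before me
```

Note that the punctuation pass also splits contractions (won't becomes "won t"), because the apostrophe is not in the [A-Za-z0-9] class. If that is unwanted, the character class would need to be adjusted.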
