re.sub raises "expected string or bytes-like object"

Posted 2024-05-17 21:02:01


I have read many posts about this error, but I still cannot figure it out. It happens when I try to call my function in a loop:

def fix_Plan(location):
    letters_only = re.sub("[^a-zA-Z]",  # Search for all non-letters
                          " ",          # Replace all non-letters with spaces
                          location)     # Column and row to search    

    words = letters_only.lower().split()     
    stops = set(stopwords.words("english"))      
    meaningful_words = [w for w in words if not w in stops]      
    return (" ".join(meaningful_words))    

col_Plan = fix_Plan(train["Plan"][0])    
num_responses = train["Plan"].size    
clean_Plan_responses = []

for i in range(0,num_responses):
    clean_Plan_responses.append(fix_Plan(train["Plan"][i]))

The error is as follows:

Traceback (most recent call last):
  File "C:/Users/xxxxx/PycharmProjects/tronc/tronc2.py", line 48, in <module>
    clean_Plan_responses.append(fix_Plan(train["Plan"][i]))
  File "C:/Users/xxxxx/PycharmProjects/tronc/tronc2.py", line 22, in fix_Plan
    location)  # Column and row to search
  File "C:\Users\xxxxx\AppData\Local\Programs\Python\Python36\lib\re.py", line 191, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object

3 answers

I think it would be better to use the re.match() function. Here is an example that may help you.

import re
import nltk
from nltk.tokenize import word_tokenize

nltk.download('punkt')  # tokenizer data required by word_tokenize
tokens = word_tokenize("I love to learn NLP \n 'a :(")
# Keep only the tokens that start with a letter, lowercased
sentences = [word.lower() for word in tokens if re.match('^[a-zA-Z]+', word)]
sentences

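As you mentioned in the comments, some of the values appear to be floats rather than strings. They need to be converted to strings before being passed to re.sub. The simplest way is to change location to str(location) in the re.sub call. This does no harm even when the value is already a str.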

letters_only = re.sub("[^a-zA-Z]",    # Search for all non-letters
                      " ",            # Replace all non-letters with spaces
                      str(location))  # Convert to str before searching
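To make the cause concrete, here is a minimal sketch that reproduces the error and applies the str() conversion. It assumes the offending value is a float NaN, which is how a missing cell commonly shows up; the helper name letters_only and the sample text are hypothetical, not taken from the question's data.

```python
import re

def letters_only(value):
    """Replace every non-letter with a space, converting the input to str first."""
    return re.sub("[^a-zA-Z]", " ", str(value))

# Assumption: a float NaN stands in for a missing cell in the column.
missing = float("nan")

try:
    re.sub("[^a-zA-Z]", " ", missing)
    raised = False
except TypeError:
    # re.sub only accepts str or bytes-like input, so a float is rejected.
    raised = True

cleaned_text = letters_only("Plan A, 2017!")
cleaned_missing = letters_only(missing)

print(raised)
print(cleaned_text.split())
print(repr(cleaned_missing))
```

Note that str() turns a NaN into the literal text "nan", so the word nan can end up among the cleaned words unless missing values are filtered out or blanked beforehand.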

The simplest solution is to apply Python's str function to the column you are trying to loop over.

If you are using pandas, this can be done as follows:

dataframe['column_name']=dataframe['column_name'].apply(str)
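As a rough illustration of the column-wide conversion, the sketch below uses a small made-up DataFrame in place of the question's train data (the rows and the helper letters_only are assumptions). It also shows one possible variation, blanking missing values with fillna("") before converting, which is not part of the answer above.

```python
import re

import numpy as np
import pandas as pd

# Hypothetical data standing in for the question's `train` DataFrame.
train = pd.DataFrame({"Plan": ["Build a new park!", np.nan, "Repair roads in 2018"]})

def letters_only(text):
    """Replace non-letters with spaces, lowercase, and split into words."""
    return re.sub("[^a-zA-Z]", " ", text).lower().split()

# Converting with str keeps every row, but NaN becomes the word "nan".
as_str = train["Plan"].apply(str).apply(letters_only)

# Variation (an assumption, not from the answer): blank out missing values first.
blank_first = train["Plan"].fillna("").apply(str).apply(letters_only)

print(as_str.tolist())
print(blank_first.tolist())
```

Which variant fits depends on whether a missing response should count as an empty document or be dropped entirely.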
