Why does `if word not in (x, y)` not work at all in Python?

data['text'].head(5)
Out[38]:
0    ['ve, searching, right, words, thank, breather...
1    [free, entry, 2, wkly, comp, win, fa, cup, fin...
2    [nah, n't, think, goes, usf, ,, lives, around,...
3    [even, brother, like, speak, ., treat, like, a...
4                                 [date, sunday, !, !]
Name: text, dtype: object

import pandas as pd
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
import string

data = pd.read_csv(r"D:/python projects/read_files/SMSSpamCollection.tsv", sep='\t', header=None)
data.columns = ['label','text']
stopwords = set(stopwords.words('english'))

def process(df):
    data = word_tokenize(df.lower())
    data = [word for word in data if word not in (stopwords,string.punctuation)]
    return data

data['text'] = data['text'].apply(process)

3 Answers

User

#1 · Edited 2024-10-02 14:24:42

In the function process, you have to convert the string type to pandas.core.series.Series and use concat.

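The function would be:

def process(df):
    data = word_tokenize(df.lower())
    # stopwords is a set and string.punctuation is a str, so wrap each in a Series of single items.
    # Test against .values, because `in` on a Series checks its index labels, not its contents.
    excluded = pd.concat([pd.Series(list(stopwords)), pd.Series(list(string.punctuation))]).values
    data = [word for word in data if word not in excluded]
    return data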

User

#2 · Edited 2024-10-02 14:24:42

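If you still want to do this in a single if statement, you can convert string.punctuation to a set and combine it with stopwords using the OR operator (`|`, which is set union). This is what it looks like: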

data = [word for word in data if word not in (stopwords|set(string.punctuation))]
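A minimal sketch of the union approach that runs without NLTK. The small `stopwords` set and the `tokens` list are stand-ins made up for illustration; in the question they come from `nltk.corpus.stopwords` and `word_tokenize`.

```python
import string

# Hypothetical stand-ins for the NLTK stopword set and the tokenizer output
stopwords = {"i", "the", "a", "to"}
tokens = ["i", "won", "the", "cup", "!", "!"]

# Build the combined set once, before the comprehension,
# so the union is not recomputed for every word
excluded = stopwords | set(string.punctuation)

filtered = [word for word in tokens if word not in excluded]
print(filtered)  # ['won', 'cup']
```

Hoisting the union into `excluded` is optional, but it avoids rebuilding the same set on each iteration when the function is applied to every row.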

User

#3 · Edited 2024-10-02 14:24:42

You need to change

data = [word for word in data if word not in (stopwords,string.punctuation)]

to

data = [word for word in data if word not in stopwords and word not in string.punctuation]
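The reason the original condition never filters anything is that `(stopwords, string.punctuation)` is a tuple with exactly two elements, a set and a string. `in` compares `word` against those two objects as a whole and does not look inside them. A small sketch, using a made-up stopword set in place of the NLTK one:

```python
import string

stopwords = {"i", "the", "a"}  # hypothetical stand-in for the NLTK stopword set
pair = (stopwords, string.punctuation)

# The tuple holds two objects; no token equals a whole set
# or the whole punctuation string, so both checks are False
print("the" in pair)  # False
print("!" in pair)    # False

# Checking each container separately looks inside it
def keep(word):
    return word not in stopwords and word not in string.punctuation

tokens = ["i", "won", "the", "cup", "!"]
print([w for w in tokens if keep(w)])  # ['won', 'cup']
```

Note that `word not in string.punctuation` is a substring test on a str, so a multi-character token such as `()` also counts as punctuation here, whereas the set-based version only matches single characters.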
