Matching keywords from a list against a line of words in Python

[40.748330000000003, -73.878609999999995] 6 2011-08-28 19:52:47 Sometimes I wish my life was a movie; #unreal I hate the fact I feel lonely surrounded by so many ppl
[37.786221300000001, -122.1965002] 6 2011-08-28 19:55:26 I wish I could lay up with the love of my life And watch cartoons all day.

try:
    KeywordFileName = input('Input keyword file name: ')
    KeywordFile = open(KeywordFileName, 'r')
except FileNotFoundError:
    print('The file you entered does not exist or is not in the directory')
    exit()

# The code expects each keyword line to hold a keyword and an integer, separated by a comma.
KeyLine = KeywordFile.readline()
while KeyLine != '':
    # Renamed from "list", which shadowed the built-in type.
    KeyList = KeyLine.rstrip().split(',')
    KeyList[1] = int(KeyList[1])
    print(KeyList)
    # Read the next line last, so the first line is not skipped
    # and the empty string at end of file is never split.
    KeyLine = KeywordFile.readline()

try:
    TweetFileName = input('Input Tweet file name: ')
    TweetFile = open(TweetFileName, 'r')
except FileNotFoundError:
    print('The file you entered does not exist or is not in the directory')
    exit()

TweetLine = TweetFile.readline()
while TweetLine != '':
    Tweet = TweetLine.rstrip()
    # Matching the keywords against Tweet is the step that is still missing.
    TweetLine = TweetFile.readline()

1 answer

User

#1 · Posted 2024-09-27 07:32:48

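You can use a simple regular expression to strip the numbers out of the sample string, a tokenizer to split what is left into words, and a Counter to count how many times each word occurs.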

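from nltk.tokenize import word_tokenize  # needs NLTK's tokenizer data, e.g. nltk.download('punkt')
import collections
import re

# Named "text" rather than "str" so the built-in type is not shadowed.
text = '[40.748330000000003, -73.878609999999995] 6 2011-08-28 19:52:47 Sometimes I wish my life was a movie; #unreal I hate the fact I feel lonely surrounded by so many ppl'
# Remove every signed integer or decimal: coordinates, the score, date and time digits.
num_regex = re.compile(r"[+-]?\d+(?:\.\d+)?")
text = num_regex.sub('', text)
words = word_tokenize(text)
# Count how many times each token occurs.
final_list = collections.Counter(words)
print(final_list)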
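
The snippet above counts every token, including leftover punctuation. If the goal is only to check which keywords from a list occur in a line, a standard-library sketch along these lines might be enough. The function name count_keyword_hits and the keyword list are hypothetical, chosen for illustration; matching is assumed to be case-insensitive and on whole words only.

```python
import re
from collections import Counter


def count_keyword_hits(line, keywords):
    """Return a Counter of how often each keyword occurs in line."""
    # Lowercase the line and keep only runs of letters and apostrophes,
    # so numbers, brackets, ';' and '#' never turn into words.
    words = re.findall(r"[a-z']+", line.lower())
    # A set gives fast membership tests; keywords are lowercased to match.
    wanted = {keyword.lower() for keyword in keywords}
    return Counter(word for word in words if word in wanted)


tweet = ('[40.748330000000003, -73.878609999999995] 6 2011-08-28 19:52:47 '
         'Sometimes I wish my life was a movie; #unreal I hate the fact I feel '
         'lonely surrounded by so many ppl')
keywords = ['hate', 'lonely', 'love']  # hypothetical keyword list

hits = count_keyword_hits(tweet, keywords)
print(hits)
```

Keywords that do not occur are simply absent from the result, and looking one up (for example hits['love']) gives 0 rather than raising an error, which is convenient when summing scores per tweet.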
