Python: count the words in a list, unless certain words come before them

import re

vocab = ["foo", "bar", "baz"]
exception = ["no"]
s = "foo bar baz no bar quux foo bla bla"

wordcount = dict((x, 0) for x in vocab)
for w in re.findall(r"\w+", s):
    if w in wordcount:
        wordcount[w] += 1

3 answers

User

#1 · edited 2024-09-29 21:28:04

Just replace `no` and the three words that follow it with an empty string, then count the words in the resulting string.

>>> import re
>>> s = 'foo bar baz no bar quux foo bla bla'
>>> vocab = ["foo", "bar", "baz"]
>>> exception = ["no"]
>>> wordcount = dict((x, 0) for x in vocab)
>>> m = re.sub(r'(?:^|\s)no(\s+\S+){0,3}', '', s)
>>> for w in re.findall(r"\w+", m):
...     if w in wordcount:
...         wordcount[w] += 1
...
>>> wordcount
{'foo': 1, 'bar': 1, 'baz': 1}
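The pattern above hard-codes `no` and does not require a word boundary after it, so a token such as `nobar` would be cut down to `bar` and then counted. Below is a minimal sketch of one way to generalize it, assuming the exception words come from a list. The function name `count_with_regex` and the `skip` parameter are hypothetical, not part of the original answer.

```python
import re

def count_with_regex(text, vocab, exceptions, skip=3):
    """Count vocab words, dropping each exception word and up to `skip` words after it."""
    cleaned = text
    if exceptions:  # an empty alternation would match everywhere, so skip it
        # Escape each exception word so regex metacharacters are taken literally.
        alternation = "|".join(re.escape(word) for word in exceptions)
        # (?<!\S) and (?!\S) demand whitespace or a string edge on both sides,
        # so "no" does not match inside "nobar" or "piano".
        pattern = rf"(?<!\S)(?:{alternation})(?!\S)(?:\s+\S+){{0,{skip}}}"
        cleaned = re.sub(pattern, " ", text)
    counts = {word: 0 for word in vocab}
    for token in re.findall(r"\w+", cleaned):
        if token in counts:
            counts[token] += 1
    return counts

print(count_with_regex("foo bar baz no bar quux foo bla bla",
                       ["foo", "bar", "baz"], ["no"]))
# {'foo': 1, 'bar': 1, 'baz': 1}
```

Replacing the match with a single space rather than an empty string keeps the neighbouring words from being glued together.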

User

#2 · edited 2024-09-29 21:28:04

Actually, you can do this with plain Python string methods, without any regex:

vocab = ["foo", "bar", "baz"]
ex_list= ["no"]
s = "foo bar baz no bar quux foo bla bla"

words=s.split()
wordcount = dict((x,0) for x in vocab)
for i, word in enumerate(words):
    # Clamp the slice start at 0 so the first three words are checked too.
    if any(w in ex_list for w in words[max(0, i-3):i]):
        continue
    elif word in vocab:
        wordcount[word] += 1

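Slicing never raises an IndexError, but a negative start index counts from the end of the list, so the start is clamped with max(0, i-3). With that in place, the loop can be simplified to:

for i, word in enumerate(words):
    if word in vocab and not any(w in ex_list for w in words[max(0, i-3):i]):
        wordcount[word] += 1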
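For reuse, the same look-back idea can be wrapped in a function. This is only a sketch: the name `count_unless_preceded` and the `window` parameter are made up for illustration. The slice start is clamped with `max(0, i - window)` because a negative start would count from the end of the list, which would leave the look-back empty for the first few words.

```python
def count_unless_preceded(text, vocab, exceptions, window=3):
    """Count vocab words that have no exception word among the `window` words before them."""
    words = text.split()
    blocked = set(exceptions)  # set lookups keep the membership test cheap
    counts = {word: 0 for word in vocab}
    for i, word in enumerate(words):
        if word not in counts:
            continue
        # Words immediately before position i, at most `window` of them.
        preceding = words[max(0, i - window):i]
        if not blocked.intersection(preceding):
            counts[word] += 1
    return counts

print(count_unless_preceded("foo bar baz no bar quux foo bla bla",
                            ["foo", "bar", "baz"], ["no"]))
# {'foo': 1, 'bar': 1, 'baz': 1}
print(count_unless_preceded("no foo bar", ["foo", "bar", "baz"], ["no"]))
# {'foo': 0, 'bar': 0, 'baz': 0}
```

The second call is the case that an unclamped `words[i-3:i]` gets wrong: for `i = 1` the slice becomes `words[-2:1]`, which is empty, so `foo` would be counted even though `no` comes right before it.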

User

#3 · edited 2024-09-29 21:28:04

How about:

vocab = ["foo", "bar", "baz"]
exception= ["no"]
s = "foo bar baz no bar quux foo bla bla"

wordcount = dict((x,0) for x in vocab)

words = s.split()

i = 0
while i < len(words):
    cur_word = words[i]
    if cur_word in exception:
        i += 4
    else:
        if cur_word in vocab: wordcount[cur_word] += 1
        i += 1

print(wordcount)  # {'foo': 1, 'bar': 1, 'baz': 1}

This simply takes advantage of the fact that whenever we hit `no`, we can skip the next 3 elements.
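One thing to be aware of: skipping ahead and looking back are not quite equivalent when a second exception word falls inside a skipped stretch. The sketch below compares the two on a made-up input; the function names `count_skipping` and `count_windowed` are hypothetical. Which result is the right one depends on what the question intends.

```python
def count_skipping(words, vocab, exceptions, skip=3):
    """Jump over each exception word and the `skip` words after it."""
    counts = {w: 0 for w in vocab}
    i = 0
    while i < len(words):
        if words[i] in exceptions:
            i += skip + 1  # the exception word itself plus `skip` followers
        else:
            if words[i] in counts:
                counts[words[i]] += 1
            i += 1
    return counts

def count_windowed(words, vocab, exceptions, window=3):
    """Count a word only if none of the `window` words before it is an exception."""
    counts = {w: 0 for w in vocab}
    for i, w in enumerate(words):
        preceding = words[max(0, i - window):i]
        if w in counts and not any(p in exceptions for p in preceding):
            counts[w] += 1
    return counts

# The second "no" (index 2) is itself skipped by the first one,
# so the skipping version never sees it.
words = "no bla no bla bla foo".split()
skipping = count_skipping(words, ["foo"], ["no"])
windowed = count_windowed(words, ["foo"], ["no"])
print(skipping)  # {'foo': 1}
print(windowed)  # {'foo': 0}
```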
