Is there a better way to count the punctuation marks in a sentence?

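# Start every counter at zero before the loop
numberOfFullStops = numberOfQuestionMarks = numberOfExclamationMarks = 0

for line in textContent:
    numberOfFullStops += line.count(".")
    numberOfQuestionMarks += line.count("?")
    # Exclamation marks get their own counter, not the question-mark one
    numberOfExclamationMarks += line.count("!")

numberOfSentences = numberOfFullStops + numberOfQuestionMarks + numberOfExclamationMarks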

1 answer

User

#1 · Posted 2024-10-01 13:31:14

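Assuming you want to count the terminal punctuation marks in a sentence, we can build a dictionary of (character, count) pairs by looping over the characters of each string and keeping only the punctuation.

Demo

Here are three options, ordered from intermediate to beginner-level data structures: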

import collections as ct


sentence = "Here is a sentence, and it has some exclamations!!"
terminals = ".?!"

# Option 1 - Counter and Dictionary Comprehension
cd = {c:val for c, val in ct.Counter(sentence).items() if c in terminals}
cd
# Out: {'!': 2}


# Option 2 - Default Dictionary
dd = ct.defaultdict(int)
for c in sentence:
    if c in terminals:
        dd[c] += 1
dd
# Out: defaultdict(int, {'!': 2})


# Option 3 - Regular Dictionary
d = {}
for c in sentence:
    if c in terminals:
        if c not in d:
            d[c] = 0
        d[c] += 1
d
# Out: {'!': 2}

To extend this to a list of separate sentences, loop over the list and apply one of the options above to each sentence.

Note: to get the total number of punctuation marks in a sentence, sum the dictionary's values, e.g. sum(cd.values()).
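The per-sentence loop and the totals described above could look roughly like the sketch below. It reuses the Counter approach from Option 1. The helper name `count_terminals` and the example list are assumptions made up for illustration, not code from the original answer.

```python
from collections import Counter

TERMINALS = ".?!"


def count_terminals(sentence):
    """Return a {character: count} dict of terminal punctuation in `sentence`."""
    return {c: n for c, n in Counter(sentence).items() if c in TERMINALS}


# Hypothetical input, invented for this illustration
sentences = [
    "Here is a sentence, and it has some exclamations!!",
    "Is this a question?",
    "No terminal punctuation here",
]

# One dictionary per sentence
per_sentence = [count_terminals(s) for s in sentences]

# Total number of terminal marks per sentence
totals = [sum(d.values()) for d in per_sentence]

print(per_sentence)  # [{'!': 2}, {'?': 1}, {}]
print(totals)        # [2, 1, 0]
```

A sentence with no terminal punctuation simply yields an empty dictionary and a total of 0.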
Update: if you want to split the text into sentences at the terminal punctuation, use a regular expression:
import re


line = "Here is a string of sentences. How do we split them up? Try regular expressions!!!"

# Option - Regular Expression and List Comprehension
pattern = r"[.?!]"
sentences = [sentence for sentence in re.split(pattern, line) if sentence]
sentences
# Out: ['Here is a string of sentences', ' How do we split them up', ' Try regular expressions']

len(sentences)
# Out: 3
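Note that line contains 5 terminal punctuation marks but only 3 sentences, so counting marks overstates the number of sentences. The regex approach is therefore more reliable.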
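One way to see the difference between the two counts is to match runs of consecutive terminal marks instead of single characters. This is only a sketch: the pattern `[.?!]+` is a variation introduced here, not the pattern from the answer, and it would still miscount text containing abbreviations such as "e.g." or decimal numbers.

```python
import re

line = "Here is a string of sentences. How do we split them up? Try regular expressions!!!"

# Each match is one run of consecutive terminal marks, so "!!!" counts once
runs = re.findall(r"[.?!]+", line)

sentence_count = len(runs)            # number of sentence endings
mark_count = sum(len(r) for r in runs)  # number of individual marks

print(runs)            # ['.', '?', '!!!']
print(sentence_count)  # 3
print(mark_count)      # 5
```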
