Finding index words in a string

Posted 2024-06-29 01:13:36


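Write a Python program that prints the index words of a text (words that begin with a capital letter) together with their word numbers (each word's position in the text). If the text contains no word with this property, print "None". A word at the beginning of a sentence is not considered an index word. (Word numbering starts at 1.)

Numbers are not counted as index words. Apart from the period, the only punctuation used in the sentences is the comma. If a semicolon is attached to the end of a word, be sure to remove it.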

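Input: The Persian League is the largest sport event dedicated to the deprived areas of Iran. The Persian League promotes peace and friendship. This video was captured by one of our heroes who wishes peace.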

Output: 2:Persian 3:League 15:Iran 17:Persian 18:League

How can I fix it?

import re

inputText = ""

# regex pattern that checks whether a word starts with an uppercase letter
isUpperCase = re.compile(r"^([A-Z])\w+")

# uppercase words are stored in this list
result = []
# position of the current word in the whole input
wordIndex = 0

# separate sentences
sentences = inputText.strip().split('.')

for s in sentences:
    # get the list of words in each sentence
    words = s.strip().split(' ')

    for index, word in enumerate(words):
        # increase wordIndex
        wordIndex += 1

        # skip the first word of each sentence
        if index == 0:
            continue

        # if the regex matches, add the word and wordIndex to result
        if isUpperCase.match(word):
            result.append({
                "index": wordIndex,
                "word": word
            })


# finally print the result
for word in result:
    print(word["index"], ": ", word["word"])

1 answer
User
#1 · Posted 2024-06-29 01:13:36

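You could add each capitalized word and its index value (plus one) to a dictionary. I noticed that you are not returning The and This, but I am not sure what the rule is, so I just added an exemption for those two words by skipping the first word of each sentence.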

import collections
import string

# the 26 uppercase ASCII letters
uppercase_letters = string.ascii_uppercase

mystring = 'The Persian League is the largest sport event dedicated to the deprived areas of Iran. The Persian League promotes peace and friendship. This video was captured by one of our heroes who wishes peace.'
# maps each sentence to the 1-based positions of its capitalized words
counter_dict = collections.defaultdict(list)

counter = 0          # position of the word inside the current sentence
overall_counter = 0  # position of the word inside the whole text
for i in mystring.split('.'):
    counter = 0
    for j in i.split():
        counter += 1
        overall_counter += 1
        # the first word of a sentence is exempt
        if counter == 1:
            continue
        if j[0] in uppercase_letters:
            counter_dict[i].append(overall_counter)

# flatten the per-sentence position lists into one list
counter_list = []

for i in counter_dict.values():
    for j in i:
        counter_list.append(j)

# look each position up in the full word list and print "position: word"
for i in sorted(counter_list):
    print(str(i) + ':', mystring.split()[i-1].rstrip('.'), '', sep=' ', end='', flush=True)

>>> 2: Persian 3: League 15: Iran 17: Persian 18: League 
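
The answer above does not cover the parts of the task about trailing punctuation or the "None" case. The following is only a sketch of one way to cover them, not the answerer's method. The function names `find_index_words` and `format_index_words` are made up for illustration, and it assumes that periods appear only at sentence ends and that stripping trailing commas and semicolons is what the task intends.

```python
def find_index_words(text):
    """Return (position, word) pairs for capitalized words that do not start a sentence."""
    found = []
    position = 0  # 1-based word number across the whole text
    for sentence in text.split('.'):
        for i, raw in enumerate(sentence.split()):
            position += 1
            # assumption: only a trailing comma or semicolon needs removing
            word = raw.rstrip(',;')
            if i == 0:
                continue  # the first word of a sentence is never an index word
            # str.isupper() is False for digits, so numbers are skipped
            if word[:1].isupper():
                found.append((position, word))
    return found


def format_index_words(text):
    """Format the pairs as 'position:word', or return 'None' if there are none."""
    found = find_index_words(text)
    if not found:
        return 'None'
    return ' '.join(f'{pos}:{word}' for pos, word in found)


text = ('The Persian League is the largest sport event dedicated to the '
        'deprived areas of Iran. The Persian League promotes peace and '
        'friendship. This video was captured by one of our heroes who '
        'wishes peace.')
print(format_index_words(text))  # 2:Persian 3:League 15:Iran 17:Persian 18:League
```

Keeping the search separate from the formatting makes the "None" case a single check on an empty list, and the list of pairs can be reused if a different output format is needed.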
