Getting a substring from a string based on a substring match and string index

myWord = "mollis"
rawText = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse sit amet arcu vulputate, sodales arcu non, finibus odio. Aliquam sed tincidunt nisi, eu scelerisque lectus. Curabitur in nibh enim. Duis arcu ante, mollis sed iaculis non, hendrerit ut odio. Curabitur gravida condimentum posuere. Sed et arcu finibus felis auctor mollis et id risus. Nam urna tellus, ultricies a aliquam at, euismod et erat. Cras pretium venenatis ornare. Donec pulvinar dui eu dui facilisis commodo. Vivamus eget ultrices turpis, vel egestas lacus."

# The index where the word is located (-1 if it does not occur)
wordIndexNumber = rawText.lower().find(myWord.lower())

# The total length of the text (in chars)
textLength = len(rawText)
textPart2 = textLength - wordIndexNumber

# Start 80 characters before the match, clamped to the start of the text
if wordIndexNumber < 80:
    textIndex1 = 0
else:
    textIndex1 = wordIndexNumber - 80

# End 80 characters after the start of the match, clamped to the end of the text
if textPart2 < 80:
    textIndex2 = textLength
else:
    textIndex2 = wordIndexNumber + 80

snippet = rawText[textIndex1:textIndex2]
print(snippet)

2 Answers

User
#1 · Edited 2024-10-05 14:27:01

Here is one approach using slicing: split the text into words, then slice the word list around the match.

Demo:

rawText = "This is an example lorem ipsum sentence for a Stackoverflow question."
myWord = "sentence"
rawTextList = rawText.split()
wordPos = rawTextList.index(myWord)  # raises ValueError if myWord is not a token
# Clamp the start at 0 so a match near the beginning does not give a negative index
frontVal = " ".join(rawTextList[max(0, wordPos - 3):wordPos])
backVal = " ".join(rawTextList[wordPos:wordPos + 4])

print("{} {}".format(frontVal, backVal))

Output:

example lorem ipsum sentence for a Stackoverflow
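
One caveat with splitting on whitespace: `list.index` only matches whole tokens, so a word that is followed by punctuation (such as `question.` in the demo text) would not be found. A minimal sketch of one way around that, assuming it is acceptable to compare tokens with surrounding punctuation stripped and case ignored. The helper name `find_token` is hypothetical and not part of the answer above:

```python
import string


def find_token(tokens, word):
    """Return the index of the first token equal to `word`, ignoring
    surrounding punctuation and case, or -1 if there is none."""
    target = word.lower()
    for i, token in enumerate(tokens):
        if token.strip(string.punctuation).lower() == target:
            return i
    return -1


raw_text = "This is an example lorem ipsum sentence for a Stackoverflow question."
tokens = raw_text.split()
idx = find_token(tokens, "question")
# Clamp the start so a hit near the beginning does not produce a negative index.
context = " ".join(tokens[max(0, idx - 3):idx + 1]) if idx != -1 else ""
print(context)
```

Returning -1 instead of raising keeps the behaviour close to `str.find`, which the question already uses; raising `ValueError` like `list.index` would be an equally reasonable choice.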

User

#2 · Edited 2024-10-05 14:27:01

Here is a solution using list slicing

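def get_context_around(text, word, accuracy):
    words = text.split()
    first_hit = words.index(word)  # raises ValueError if word is not a token
    # Clamp the start at 0; a negative start would count from the end of the list
    start = max(0, first_hit - accuracy)

    return ' '.join(words[start:first_hit + accuracy + 1])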


raw_text = "This is an example lorem ipsum sentence for a Stackoverflow question."
my_word = "sentence"
print(get_context_around(raw_text, my_word, accuracy=3))  # example lorem ipsum sentence for a Stackoverflow
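
For a character-based window, as in the question, the same clamping idea can be wrapped in a function. This is only a sketch, assuming a case-insensitive search and a fixed radius in characters on each side of the match. The name `snippet_around` and its `radius` parameter are hypothetical:

```python
def snippet_around(text, word, radius=80):
    """Return up to `radius` characters on each side of the first
    case-insensitive occurrence of `word`, or "" if it is absent."""
    pos = text.lower().find(word.lower())
    if pos == -1:
        return ""
    # Clamp both ends so the slice never runs past the text.
    start = max(0, pos - radius)
    end = min(len(text), pos + len(word) + radius)
    return text[start:end]


sample = "This is an example lorem ipsum sentence for a Stackoverflow question."
print(snippet_around(sample, "Sentence", radius=10))
```

Unlike the question's code, this sketch counts the radius from the end of the matched word rather than from its start, so the context is symmetric around the word. It may also cut words in half at either edge of the window.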
