Python: converting elements in a list while splitting

Posted 2024-06-28 18:55:57


I have this code:

Long_string = """
“Fifty Shades of Grey” shakeup: Kelly Marcel not returning for Sequel
"""

I need to split this line into words. I know I can do:

text_to_list = Long_string.split()

The output is:

['\xa1\xb0Fifty', 'Shades', 'of', 'Grey\xa1\xb1', 'shakeup:', 'Kelly', 'Marcel', 'not', 'returning', 'for', 'Sequel']

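However, some of these words have a special meaning when they appear together, such as the quoted "Fifty Shades of Grey" and people's names like "Kelly Marcel".

So I want "Fifty Shades of Grey" and "Kelly Marcel" to each stay together as one element when the string is split. How can I do that?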


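Sorry for the trouble. What I need is:

Replace the space with "-" in the following cases:

  1. Between words inside quotation marks
  2. Between two capitalized words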

3 answers

I would do this in three parts. First, using a regex adapted from this answer, replace the space between two capitalized words with -:

>>> import re
>>> long_string = '"Fifty Shades of Grey" shakeup: Kelly Marcel not returning for Sequel'
>>> long_string = re.sub(r'([A-Z][a-z]+(?=\s[A-Z]))(?:\s([A-Z][a-z]+))+', r'\1-\2', long_string)
>>> long_string
'"Fifty-Shades of Grey" shakeup: Kelly-Marcel not returning for Sequel'
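One caveat worth checking before relying on that pattern: as far as I can tell, a repeated capture group in Python's `re` keeps only its last repetition, so with `r'\1-\2'` a run of three capitalized words such as "New York City" would come out as "New-City". A minimal sketch of an alternative, assuming only single spaces separate the words, passes a replacement function to `re.sub` instead. The names `CAPITALIZED_RUN` and `hyphenate_capitalized` are hypothetical and not part of the original answer.

```python
import re

# Hypothetical name: a run of two or more capitalized words separated by single spaces.
CAPITALIZED_RUN = re.compile(r"[A-Z][a-z]+(?: [A-Z][a-z]+)+")


def hyphenate_capitalized(text):
    """Replace the spaces inside each run of capitalized words with '-'."""
    # The replacement function sees the whole run, so no word is dropped.
    return CAPITALIZED_RUN.sub(lambda m: m.group(0).replace(" ", "-"), text)


print(hyphenate_capitalized("Kelly Marcel moved to New York City"))
# Kelly-Marcel moved to New-York-City
```

On the headline used in this answer it should give the same result as the original pattern.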

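Then use the shlex library to split the string while keeping each quoted phrase together as a single token (the quote characters themselves are dropped):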

>>> import shlex
>>> words = shlex.split(long_string)
>>> words
['Fifty-Shades of Grey',
 'shakeup:',
 'Kelly-Marcel',
 'not',
 'returning',
 'for',
 'Sequel']
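Note that the string in the question uses curly quotes (“ ”), which `shlex` does not treat as quote characters. A rough sketch of one way to handle that, assuming Python 3 and assuming curly double quotes only ever appear as quotation marks in the input, is to normalize them before splitting. The function name `split_with_curly_quotes` is hypothetical.

```python
import shlex


def split_with_curly_quotes(text):
    """Split like shlex.split, treating curly double quotes as straight ones."""
    # Assumption: the curly quotes U+201C and U+201D are used only as quotation marks.
    normalized = text.replace("\u201c", '"').replace("\u201d", '"')
    return shlex.split(normalized)


headline = "\u201cFifty-Shades of Grey\u201d shakeup: Kelly-Marcel not returning"
print(split_with_curly_quotes(headline))
# ['Fifty-Shades of Grey', 'shakeup:', 'Kelly-Marcel', 'not', 'returning']
```

Keep in mind that `shlex.split` follows shell quoting rules, so an unbalanced quote or a lone apostrophe (as in "don't") raises `ValueError`; for free-form text that may be a reason to prefer a regex-based split.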

Finally, use a list comprehension to replace all remaining spaces inside each token with -:

>>> final = [x.replace(' ', '-') for x in words]
>>> final
['Fifty-Shades-of-Grey',
 'shakeup:',
 'Kelly-Marcel',
 'not',
 'returning',
 'for',
 'Sequel']

This might help. (No regular expressions needed.)

Long_string = '"Fifty Shades of Grey" shakeup: Kelly Marcel not returning for Sequel'

# Step 1: splitting on '"' puts the quoted text at the odd indexes.
# Hyphenate the spaces inside those pieces and put the quotes back.
buffer = ""
for count, piece in enumerate(Long_string.split('"')):
    if count % 2 != 0:
        piece = '"' + piece.replace(" ", "-") + '"'
    buffer += piece

# Step 2: join a capitalized word to the previous word with "-"
# when the previous word was capitalized too.
final_buffer = ""
previous_word_uppercase = False
for word in buffer.split(" "):
    # word[:1] avoids an IndexError on empty strings (e.g. double spaces)
    is_upper = word[:1].isupper()
    if is_upper and previous_word_uppercase:
        final_buffer += "-" + word
    else:
        final_buffer += " " + word
    previous_word_uppercase = is_upper

# strip() removes the leading space added before the first word
print(final_buffer.strip())

Output:

"Fifty-Shades-of-Grey" shakeup: Kelly-Marcel not returning for Sequel

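You just need a regexp to find the quoted phrase (and runs of capitalized words) and replace the spaces inside them with "-".
Here is an example: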

import re

Long_string = """
"Fifty Shades of Grey" shakeup: Kelly Marcel not returning for Sequel
"""

def check_string(text):
    # Group 1: text inside double quotes; group 2: a run of capitalized words.
    matches = re.findall(r'"(.+?)"|([A-Z][a-z]+(?=\s[A-Z])(?:\s[A-Z][a-z]+)+)', text)
    for groups in matches:
        for val in groups:
            # findall leaves the group that did not match as an empty string
            if val:
                yield val.replace(" ", "-")

for j in check_string(Long_string):
    print(j)

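The code above may not be very efficient, and it only yields the matched phrases rather than the full word list. It is just an example to show that you can use a regexp as a string search pattern, and you can improve on it from there.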
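One possible refinement along those lines, offered as a sketch rather than a definitive solution: substitute inside each match with `re.sub` and then split on whitespace, so the result is the complete word list. It assumes straight double quotes and single spaces between words; `PHRASE` and `split_phrases` are hypothetical names.

```python
import re

# Either a double-quoted phrase or a run of two or more capitalized words.
PHRASE = re.compile(r'"[^"]+"|[A-Z][a-z]+(?: [A-Z][a-z]+)+')


def split_phrases(text):
    """Hyphenate quoted phrases and capitalized runs, then split on whitespace."""
    joined = PHRASE.sub(lambda m: m.group(0).replace(" ", "-"), text)
    # Drop the surrounding quotes from each token after splitting.
    return [token.strip('"') for token in joined.split()]


print(split_phrases('"Fifty Shades of Grey" shakeup: Kelly Marcel not returning for Sequel'))
# ['Fifty-Shades-of-Grey', 'shakeup:', 'Kelly-Marcel', 'not', 'returning', 'for', 'Sequel']
```

Because the quoted alternative is tried first at each position, a quoted phrase is hyphenated as a whole even when it contains lowercase words such as "of".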
