Python: Detect a word in a string and find its position

Published 2024-09-29 22:33:03


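I'm new to Python and want to write a simple program that prints your name, including any preposition, in James Bond style.

So if the name contains a preposition such as "Van", "Von", "De", or "Di", I want the program to print it as: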

{Preposition} {LastName}, {FirstName} {Preposition} {LastName} *edited

For this, I know we need a list holding the user's name and a list of prepositions:

a = input().split()  # the user's name, split into words with .split
b = ["Van", "Von", "De", "Di"]  # list of prepositions

To check whether the name contains a preposition, I found that the following code works:

if any(x in a for x in b):

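However, I ran into a problem when printing the name, because the preposition could be any of the ones listed earlier (list b). I can't find a way to print it without knowing which one it is and where it sits in the string. At first I thought I could use the .index function, but it seems to search for only one word at a time rather than several, as I need. The closest I got was: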

name_split.index('preposition1') # works
name_split.index('preposition1', 'preposition2', etc.) # does not work

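So my question is whether there is a way to check if any word from list b appears in the entered text, and to get the position of that word.

I hope I explained this clearly and that someone can help. Thanks in advance.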


3 answers

I can't think of a better way than using a for loop:

pattern = "{1} {2}, {0} {1} {2}"
prepositions = ['van', 'von', 'de', 'di']

# (optional) 'lower' so that we don't have to consider cases like 'vAn'
name = "Vincent van Gogh".lower()
index = -1  # by default, we believe that we did not find anything
for preposition in prepositions:
    # 'find' is the same as 'index', but returns -1 if the substring is not found
    index = name.find(preposition)
    if index != -1:
        break  # found an entry

if index == -1:
    print("Not found")
else:
    print("The index is", index,
          "and the preposition is", preposition)
    print(pattern.format(*name.split()))

Output:

The index is 8 and the preposition is van
van gogh, vincent van gogh
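
Note that the loop above searches for substrings, so a name such as "Alexander" would match "de". Below is a minimal sketch of a word-level search instead, assuming the name is split on whitespace. `find_preposition` is a hypothetical helper name, not part of the original answer.

```python
def find_preposition(name, prepositions):
    """Return (index, word) for the first word of `name` found in `prepositions`.

    Hypothetical helper: the index is a word position, not a character offset.
    Returns (-1, None) when no word matches. Comparison is case-insensitive.
    """
    wanted = {p.lower() for p in prepositions}
    for index, word in enumerate(name.split()):
        if word.lower() in wanted:
            return index, word
    return -1, None


prepositions = ["van", "von", "de", "di"]
print(find_preposition("Vincent van Gogh", prepositions))  # (1, 'van')
print(find_preposition("Alexander Bell", prepositions))    # (-1, None)
```

Because the lookup works on whole words, this also answers the original question about `.index`: it checks every word against the whole list in one pass.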

If you want to loop over a list of names, you can do the following:

pattern = ...
prepositions = ...
names = ...

for name in names:
    name = name.lower()
    ... # the rest is the same

A new version that handles a second category of prepositions ("Jr.", "Sr."):

def check_prepositions(name, prepositions):
    index = -1

    for preposition in prepositions:
        index = name.find(preposition)
        if index != -1:
            break  # found an entry

    return index, preposition


patterns = [
    "{1} {2}, {0} {1} {2}",
    "{1}, {0} {1} {2}"
]

all_prepositions = [
    ['van', 'von', 'de', 'di'],
    ["Jr.", "Sr."]
]

names = ["Vincent van Gogh", "Robert Downey Jr.", "Steve"]

for name in names:
    for pattern, prepositions in zip(patterns, all_prepositions):
        index, preposition = check_prepositions(name, prepositions)

        if index != -1:
            print("The index is", index,
                  "and the preposition is", preposition)
            print(pattern.format(*name.split()))
            break

    if index == -1:
        print("Not found, name:", name)

Output:

The index is 8 and the preposition is van
van Gogh, Vincent van Gogh
The index is 14 and the preposition is Jr.
Downey, Robert Downey Jr.
Not found, name: Steve
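
Both versions above format with fixed positions ({0}, {1}, {2}), so they assume a name of exactly three words. The following is a rough sketch that builds the output requested in the question from word positions instead, assuming the surname starts at the first preposition or, failing that, at the last word. `bond_style` and its default preposition tuple are illustrative names, not from the original answer.

```python
def bond_style(name, prepositions=("van", "von", "de", "di")):
    """Format `name` as '<surname>, <full name>' (hypothetical helper).

    Assumption: the surname starts at the first preposition that follows
    the first word, or at the last word when no preposition is present.
    """
    words = name.split()
    if len(words) < 2:
        return name.strip()  # nothing to rearrange for a single word

    wanted = {p.lower() for p in prepositions}
    start = len(words) - 1  # default: the surname is the last word
    for i in range(1, len(words)):
        if words[i].lower() in wanted:
            start = i
            break

    surname = " ".join(words[start:])
    return f"{surname}, {' '.join(words)}"


print(bond_style("Vincent van Gogh"))  # van Gogh, Vincent van Gogh
print(bond_style("James Bond"))        # Bond, James Bond
```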

A different approach using regular expressions (I know):

import re

def process_input(string: str) -> str:
    string = string.strip()
    # Preset some values.
    ln, fn, prep = "", "", ""

    # if the string is blank, return it
    # Otherwise, continue.
    if len(string) > 0:

        # Search for possible delimiter.
        res = re.search(r"([^a-z0-9-'\. ]+)", string, flags = re.I)

        # If delimiter found...
        if res:
            delim = res.group(0)

            # Split names by delimiter and strip whitespace.
            # Escape the delimiter so regex metacharacters (e.g. "|") match literally.
            ln, fn, *err = [s.strip() for s in re.split(re.escape(delim), string)]
     
        else:
            # Split on whitespace
            names = [s.strip() for s in re.split(r"\s+", string)]

            # If first, preposition, last exist or first and last exist.
            # update variables.
            # Otherwise, raise ValueError.
            if len(names) == 3:
                fn, prep, ln = names
            elif len(names) == 2:
                fn, ln = names
            else:
                raise ValueError("First and last name required.")

        # Check for whitespace in last name variable.
        ws_res = re.search(r"\s+", ln)
        if ws_res:
            # Split last name if found.
            prep, ln, *err = re.split(r"\s+", ln)
        
        # Create array of known names.
        output = [f"{ln},", fn, ln]

        # Insert prep if it contains a value
        # This is simply a formatting thing.
        if len(prep) > 0:
            output.insert(2, prep)

        # Output formatted string.
        return " ".join(output)

    return string


if __name__ == "__main__":
    # Loop until q is entered or the maximum number of runs is reached.
    re_run = True
    max_runs = 10

    while re_run and max_runs > 0:
        print("Please enter your full name\nor press [q] to exit:")
        user_input = input()
        if user_input:
            if user_input.lower().strip() == "q":
                re_run = False
                break

            result = process_input(user_input)
            print("\n" + result + "\n\n")
            max_runs -= 1
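
The function above reformats the name but does not report where the preposition sits. Here is a small sketch of how a regular expression could answer the original question directly, assuming the same four prepositions and using word boundaries (`\b`) so that "de" inside "Alexander" is not matched. `PREPOSITION_RE` and `locate_preposition` are hypothetical names.

```python
import re

# Alternation of the known prepositions, matched as whole words only.
PREPOSITION_RE = re.compile(r"\b(van|von|de|di)\b", flags=re.IGNORECASE)


def locate_preposition(name):
    """Return (preposition, character_offset), or None if none is found."""
    match = PREPOSITION_RE.search(name)
    if match is None:
        return None
    return match.group(1), match.start()


print(locate_preposition("Vincent van Gogh"))  # ('van', 8)
print(locate_preposition("Alexander Bell"))    # None
```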

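Why does the preposition in the name matter? You never print it on its own; all you really care about is the last name and the rest of the name. There is no need to look for prepositions: just use str.rsplit() to split from the right, with maxsplit set to 1. For example: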

>>> "Vincent van Gogh".rsplit(" ", 1)
['Vincent van', 'Gogh']

>>> "James Bond".rsplit(" ", 1)
['James', 'Bond']

Then you can simply print the values however you see fit:

fname, lname = input_name.rsplit(" ", 1)
print(f"{lname}, {fname} {lname}")

With input_name = "Vincent van Gogh" this prints Gogh, Vincent van Gogh. With input_name = "James Bond" you get Bond, James Bond.

Another advantage of this approach is that it also works when people enter a middle name or initial:

>>> fname, lname = "Samuel L. Jackson".rsplit(" ", 1)
>>> print(f"{lname}, {fname} {lname}")
Jackson, Samuel L. Jackson
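
One caveat: unpacking the result of rsplit into two variables raises ValueError for a single-word name such as "Steve". A possible variant, sketched here with str.rpartition, which always returns three parts; `last_name_first` is a hypothetical function name and returning a lone name unchanged is an assumption about the desired behavior.

```python
def last_name_first(input_name):
    """Format as '<last>, <rest> <last>'; a single-word name is returned as-is."""
    # rpartition splits at the last space and always yields a 3-tuple.
    fname, _, lname = input_name.strip().rpartition(" ")
    if not fname:
        return lname  # no space found: only one word was entered
    return f"{lname}, {fname} {lname}"


print(last_name_first("Vincent van Gogh"))   # Gogh, Vincent van Gogh
print(last_name_first("Samuel L. Jackson"))  # Jackson, Samuel L. Jackson
print(last_name_first("Steve"))              # Steve
```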

Note that people write their names in many surprising ways, so Falsehoods Programmers Believe About Names is worth a read.
