Python: Detect a word in a string and find its position

3 Answers

User

#1 · edited 2024-09-29 22:33:03

I can't think of a better way than using a for loop:

pattern = "{1} {2}, {0} {1} {2}"
prepositions = ['van', 'von', 'de', 'di']

# (optional) 'lower' so that we don't have to consider cases like 'vAn'
name = "Vincent van Gogh".lower()
index = -1  # by default, we believe that we did not find anything
for preposition in prepositions:
    # 'find' is the same as 'index', but returns -1 if the substring is not found
    index = name.find(preposition)
    if index != -1:
        break  # found an entry

if index == -1:
    print("Not found")
else:
    print("The index is", index,
          "and the preposition is", preposition)
    print(pattern.format(*name.split()))

Output:

The index is 8 and the preposition is van
van gogh, vincent van gogh
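
One caveat with the loop above: `find` matches substrings, so 'de' would also be found inside a name such as "Andersen". Below is a minimal sketch that compares whole words instead, assuming the name parts are separated by single spaces. The helper name `find_particle` and its default word list are only illustrative, not part of the original answer.

```python
def find_particle(name, particles=("van", "von", "de", "di")):
    """Return (char_index, word) for the first whole-word match, or (-1, None)."""
    position = 0  # character offset of the current word within `name`
    for word in name.split(" "):
        if word.lower() in particles:
            return position, word
        position += len(word) + 1  # +1 accounts for the separating space
    return -1, None


print(find_particle("Vincent van Gogh"))
print(find_particle("Hans Christian Andersen"))
```

The first call reports the same index as `find`, while the second reports no match even though "Andersen" contains the letters "de".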

To iterate over a list of names, you can do the following:

pattern = ...
prepositions = ...
names = ...

for name in names:
    name = name.lower()
    ... # the rest is the same

A new version that also handles a second class of prepositions ("Jr.", "Sr."):

def check_prepositions(name, prepositions):
    # Return (index, preposition) for the first match, or (-1, None) if none match.
    for preposition in prepositions:
        index = name.find(preposition)
        if index != -1:
            return index, preposition  # found an entry

    return -1, None


patterns = [
    "{1} {2}, {0} {1} {2}",
    "{1}, {0} {1} {2}"
]

all_prepositions = [
    ['van', 'von', 'de', 'di'],
    ["Jr.", "Sr."]
]

names = ["Vincent van Gogh", "Robert Downey Jr.", "Steve"]

for name in names:
    for pattern, prepositions in zip(patterns, all_prepositions):
        index, preposition = check_prepositions(name, prepositions)

        if index != -1:
            print("The index is", index,
                  "and the preposition is", preposition)
            print(pattern.format(*name.split()))
            break

    if index == -1:
        print("Not found, name:", name)

Output:

The index is 8 and the preposition is van
van Gogh, Vincent van Gogh
The index is 14 and the preposition is Jr.
Downey, Robert Downey Jr.
Not found, name: Steve
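
Note that `pattern.format(*name.split())` assumes exactly three words: with fewer it raises IndexError, and with more it silently uses only the first three. As a rough sketch of an alternative, the index that was found can be used to slice the name directly. The function name `format_by_index` is hypothetical, and this assumes `index` points at the start of the preposition.

```python
def format_by_index(name, index):
    """Split `name` at `index` (start of the preposition) into first and last parts."""
    first = name[:index].strip()   # everything before the preposition
    last = name[index:].strip()    # the preposition plus the surname
    return f"{last}, {first} {last}"


name = "Vincent Willem van Gogh"
print(format_by_index(name, name.find("van")))
# van Gogh, Vincent Willem van Gogh
```

This only suits prefixes such as 'van'; a suffix such as "Jr." would still need its own handling.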

User

#2 · edited 2024-09-29 22:33:03

A different approach, using regular expressions (I know...):

import re

def process_input(string: str) -> str:
    string = string.strip()
    # Preset some values.
    ln, fn, prep = "", "", ""

    # if the string is blank, return it
    # Otherwise, continue.
    if len(string) > 0:

        # Search for possible delimiter.
        res = re.search(r"([^a-z0-9-'\. ]+)", string, flags = re.I)

        # If delimiter found...
        if res:
            delim = res.group(0)

            # Split names by delimiter and strip whitespace.
            # Escape the delimiter so regex metacharacters (e.g. "|") are taken literally.
            ln, fn, *err = [s.strip() for s in re.split(re.escape(delim), string)]

        else:
            # Split on whitespace
            names = [s.strip() for s in re.split(r"\s+", string)]

            # If first, preposition, last exist or first and last exist.
            # update variables.
            # Otherwise, raise ValueError.
            if len(names) == 3:
                fn, prep, ln = names
            elif len(names) == 2:
                fn, ln = names
            else:
                raise ValueError("First and last name required.")

        # Check for whitespace in last name variable.
        ws_res = re.search(r"\s+", ln)
        if ws_res:
            # Split last name if found.
            prep, ln, *err = re.split(r"\s+", ln)
        
        # Create array of known names.
        output = [f"{ln},", fn, ln]

        # Insert prep if it contains a value
        # This is simply a formatting thing.
        if len(prep) > 0:
            output.insert(2, prep)

        # Output formatted string.
        return " ".join(output)

    return string


if __name__ == "__main__":
    # Loop until q is entered or the maximum number of runs is reached.
    re_run = True
    max_runs = 10

    while re_run and max_runs > 0:
        print("Please enter your full name\nor press [q] to exit:")
        user_input = input()
        if user_input:
            if user_input.lower().strip() == "q":
                re_run = False
                break

            result = process_input(user_input)
            print("\n" + result + "\n\n")
            max_runs -= 1
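
For the plain "First Last" and "First prep Last" inputs, the same formatting can be sketched more compactly with named groups. This is only an illustration under the assumption that the name has two or three whitespace-separated words; `NAME_RE` and `format_name` are hypothetical names, and the delimiter handling from the code above is left out.

```python
import re

# first word, optional middle word (treated as the preposition), last word
NAME_RE = re.compile(r"(?P<first>\S+)(?:\s+(?P<prep>\S+))?\s+(?P<last>\S+)")


def format_name(text):
    match = NAME_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError("Expected 'First Last' or 'First prep Last'.")
    # match["prep"] is None when the optional group did not take part in the match
    parts = [match["last"] + ",", match["first"], match["prep"], match["last"]]
    return " ".join(part for part in parts if part)


print(format_name("Vincent van Gogh"))
print(format_name("James Bond"))
```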

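User

#3 · edited 2024-09-29 22:33:03

Why does the preposition in the name matter? You never print it on its own; what you really care about is the last name and the rest of the name. There is no need to look for prepositions: just use str.rsplit() to split from the right with a maxsplit of 1. For example: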

>>> "Vincent van Gogh".rsplit(" ", 1)
['Vincent van', 'Gogh']

>>> "James Bond".rsplit(" ", 1)
['James', 'Bond']

You can then simply print the values however you see fit:

fname, lname = input_name.rsplit(" ", 1)
print(f"{lname}, {fname} {lname}")

With input_name = "Vincent van Gogh", this prints Gogh, Vincent van Gogh. With input_name = "James Bond", you get Bond, James Bond.

An added benefit is that it also works if people enter a middle name or initial:

>>> fname, lname = "Samuel L. Jackson".rsplit(" ", 1)
>>> print(f"{lname}, {fname} {lname}")
Jackson, Samuel L. Jackson
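
One edge case worth noting: unpacking the result of `rsplit(" ", 1)` into two variables raises ValueError when the input is a single word. A small sketch using `str.rpartition`, which always returns three parts, avoids that. The function name `last_first` and the choice to return a single-word name unchanged are assumptions made for illustration.

```python
def last_first(input_name):
    # rpartition splits on the last space and always returns a 3-tuple;
    # if there is no space, the first two elements are empty strings.
    fname, _, lname = input_name.strip().rpartition(" ")
    fname = fname.strip()
    if not fname:
        return lname  # single-word name: nothing to rearrange
    return f"{lname}, {fname} {lname}"


print(last_first("Vincent van Gogh"))
print(last_first("Steve"))
```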

Note that there are many oddities in how people write their names, so it is worth taking a look at Falsehoods Programmers Believe About Names.
