正则表达式忽略的特殊字符串填充字符

1条回答

网友

1楼 · 发布于 2024-10-05 14:21:44

在替换空白时，需要保持从变换位置到原始位置的映射。你知道吗

def sub_with_mapping(pattern, repl, string):
    mapping = dict()
    source_index = [0]
    target_index = [0]
    target = []
    #
    def append_segment(original_len, text, keep_positions):
        for i in range(0, len(text)):
            mapping[target_index[0] + i] = source_index[0] + (i if keep_positions else 0)
        target.append(text)
        target_index[0] += len(text)
        source_index[0] += original_len
    #
    for match in re.finditer(pattern, string):
        # Append text between previous and this match
        prefix = string[source_index[0]:match.start()]
        append_segment(len(prefix), prefix, True)
        # Append replacement text
        replacement = repl
        if callable(replacement):
            replacement = replacement(match)
        append_segment(len(match.group(0)), replacement, False)
    # Append text after last match
    suffix = string[source_index[0]:]
    append_segment(len(suffix), suffix, True)
    return ''.join(target), mapping

>>> sub_with_mapping(r'\w+', 'x', 'foo  - bar  - baz')
('x  - x  - x', {0: 0, 1: 3, 2: 4, 3: 5, 4: 6, 5: 7, 6: 8, 7: 11, 8: 12, 9: 13, 10: 14, 11: 15, 12: 16})

然后可以对返回的字符串进行匹配，并将索引映射回原始字符串。你知道吗

请注意，如果替换长度超过一个字符，则会有多个索引映射回同一原始索引。你知道吗

如果字符串很长（>；1 MB？），您可能需要优化数据结构。也许你可以用靶场？你知道吗

相关问题更多 >

编程相关推荐

热门问题

热门文章

正则表达式忽略的特殊字符串填充字符

相关问题 更多 >

编程相关推荐

热门问题

热门文章

相关问题更多 >