Splitting a paragraph into lines with Python

Posted 2024-09-28 19:31:36


Suppose I have a long block of text (saved in Notepad), 326 characters long.

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nam et nibh augue. Sed dignissim eu odio nec efficitur. Nulla aliquam imperdiet ipsum, eu mollis lacus cursus quis. Nulla dictum sem sem in auctor erat imperdiet sed suscipit elit ut lacus vestibulum vitae consequat risus volutpat. Suspendisse suscipit velit id.

I want it to read:

Lorem ipsum dolor sit amet,
consectetur adipiscing elit.
Nam et nibh augue.
Sed dignissim eu odio nec efficitur.
Nulla aliquam imperdiet ipsum,
eu mollis lacus cursus quis.
Nulla dictum sem sem in auctor erat imperdiet sed suscipit
elit ut lacus vestibulum vitae consequat risus volutpat.
Suspendisse suscipit velit id.

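The steps I want it to follow:

  • If there is a period, add a newline after it.

  • If there is a comma, add a newline after it.

  • If a line is still too long (say, more than 60 characters), add a newline at the next space.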


2 answers

This works:

import re

s = """Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nam et nibh augue. Sed dignissim eu odio nec efficitur. Nulla aliquam imperdiet ipsum, eu mollis lacus cursus quis. Nulla dictum sem sem in auctor erat imperdiet sed suscipit elit ut lacus vestibulum vitae consequat risus volutpat. Suspendisse suscipit velit id."""
# Each match is one output line; the whitespace after it is consumed but not captured
result = "\n".join(re.findall(r"(.{,59}?(?:,|\.)|.{58}\S*)\s*", s))
print(result)

Result:

Lorem ipsum dolor sit amet,
consectetur adipiscing elit.
Nam et nibh augue.
Sed dignissim eu odio nec efficitur.
Nulla aliquam imperdiet ipsum,
eu mollis lacus cursus quis.
Nulla dictum sem sem in auctor erat imperdiet sed suscipit
elit ut lacus vestibulum vitae consequat risus volutpat.
Suspendisse suscipit velit id.

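Explanation:

.{,59}?(?:,|\.) matches a comma or period together with the text before it, as long as that text is at most 59 characters long.
.{58}\S* handles the case where no comma or period is close enough: it takes 58 characters and then continues to the end of the current word.
Finally, \s* matches any whitespace that follows, so that it is trimmed off.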
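
To make the explanation above easier to follow, here is the same pattern written with re.VERBOSE so that each part carries its own comment. This is only a sketch: the name LINE and the sample strings are made up for illustration and are not part of the answer.

```python
import re

# Same pattern as in the answer, spread over several lines with re.VERBOSE.
# In verbose mode, whitespace and "#" comments inside the pattern are ignored.
LINE = re.compile(
    r"""
    (                 # capture one output line
        .{,59}?       #   as few characters as possible, at most 59 ...
        (?:,|\.)      #   ... followed by a comma or a period
      |               # or, if no comma/period is close enough:
        .{58}\S*      #   58 characters, then the rest of the current word
    )
    \s*               # consume the whitespace that follows (not captured)
    """,
    re.VERBOSE,
)

# Hypothetical input: two short pieces, then 12 words with no punctuation
sample = "One, two. " + "word " * 12
lines = LINE.findall(sample)
print("\n".join(lines))

# A short tail with no comma or period matches neither alternative
tail = LINE.findall("no punctuation here")
print(tail)  # []
```

One thing this shows: a trailing fragment that is shorter than 58 characters and contains no comma or period is not matched by either alternative, so findall leaves it out. The paragraph in the question ends with a period, so it is not affected.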

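A solution in plain Python is to use replace for the first two cases and then word-wrap whatever is still too long. Using a regex via re may be more efficient, but this version is easier to read for anyone trying to understand your code. It is also much longer; I don't know whether that matters to you. Either way, my solution is: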

def wordwrap(text, limit=60):
    """Greedily wrap text at spaces, aiming for lines of at most `limit` characters."""
    words = text.split(" ")
    lines_out = [words[0]]
    for word in words[1:]:
        last_line = lines_out[-1]
        # The +1 accounts for the space that would join the word to the line
        if len(last_line) + 1 + len(word) > limit:
            lines_out.append(word)
        else:
            lines_out[-1] = last_line + " " + word
    return "\n".join(lines_out)


def process(inp):
    """Process the input so that it follows all of your rules."""
    # Add newlines after commas and periods
    inp = inp.replace(".", ".\n").replace(",", ",\n")

    # Strip the space left at the start of each piece and drop empty pieces
    lines = [l.strip() for l in inp.split("\n")]
    lines = [l for l in lines if l]

    # Wrap remaining lines
    lines = [wordwrap(l) for l in lines]
    return "\n".join(lines)

# A test.
if __name__ == "__main__":
    text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nam et nibh augue. Sed dignissim eu odio nec efficitur. Nulla aliquam imperdiet ipsum, eu mollis lacus cursus quis. Nulla dictum sem sem in auctor erat imperdiet sed suscipit elit ut lacus vestibulum vitae consequat risus volutpat. Suspendisse suscipit velit id."
    print(process(text))
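
As a possible shortcut, the wrapping step could also be handed to the standard library's textwrap module instead of a hand-written wordwrap. The sketch below assumes that approach; the function name split_paragraph is made up for illustration. Note that textwrap.wrap breaks before a line would exceed the width, and by default it also splits a single word that is longer than the width.

```python
import textwrap


def split_paragraph(text, limit=60):
    """Break after every comma or period, then wrap pieces longer than `limit`."""
    # Same idea as in the answer: mark the break points with newlines first
    marked = text.replace(".", ".\n").replace(",", ",\n")
    lines = []
    for piece in marked.split("\n"):
        piece = piece.strip()  # drop the space left after the punctuation
        if piece:              # skip the empty piece after a final period
            # textwrap.wrap returns a list of lines no longer than `width`
            lines.extend(textwrap.wrap(piece, width=limit))
    return "\n".join(lines)


if __name__ == "__main__":
    print(split_paragraph("Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nam et nibh augue."))
```

For the paragraph in the question this should give the same line breaks as the answers above, since the one long piece is split at the last space that still fits within 60 characters.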
