I am trying to split out, in order, the text that falls between \n\n and \n. Take this string as an example:
\n\nMy take on fruits.\n\nHealthy Fruits\nAn apple is a fruit and it\'s very good.\n\nPears are good as well. Bananas are very good too and healthy.\n\nSour Fruits\nOranges are on the sour side and contains a lot of vitamin C.\n\nGrapefruits are even more sour, if you can believe it.
The output I expect is:
[('Healthy Fruits', "An apple is a fruit and it's very good.", 'Pears are good as well. Bananas are very good too and healthy.'), ('Sour Fruits', 'Oranges are on the sour side and contains a lot of vitamin C.', 'Grapefruits are even more sour, if you can believe it.')]
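I want to parse it this way because anything between \n\n and \n is a heading, and the rest is the text under that heading (here the headings are "Healthy Fruits" and "Sour Fruits"). I'm not sure whether this is the best way to get the headings and their text.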
Given the example string from the question, you can use a regular expression:
Python demo:
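A minimal sketch of one regex-based way to get the expected output. It assumes a heading is a single line preceded by a blank line and followed directly by text, with no blank line in between. The function name `split_sections_regex` and the pattern are illustrative assumptions, not necessarily what this answer originally used.

```python
import re

text = (
    "\n\nMy take on fruits.\n\nHealthy Fruits\nAn apple is a fruit and it's very good."
    "\n\nPears are good as well. Bananas are very good too and healthy."
    "\n\nSour Fruits\nOranges are on the sour side and contains a lot of vitamin C."
    "\n\nGrapefruits are even more sour, if you can believe it."
)

# A heading is "\n\n<one line>\n" followed by a non-newline character.
# The body runs lazily until the next heading or the end of the string.
SECTION_RE = re.compile(
    r"\n\n([^\n]+)\n(?!\n)(.*?)(?=\n\n[^\n]+\n(?!\n)|\Z)",
    re.DOTALL,
)


def split_sections_regex(s):
    """Return a list of (heading, paragraph, paragraph, ...) tuples."""
    sections = []
    for match in SECTION_RE.finditer(s):
        heading, body = match.group(1), match.group(2)
        # Paragraphs inside a section are separated by blank lines.
        sections.append((heading, *body.split("\n\n")))
    return sections


result = split_sections_regex(text)
for section in result:
    print(section)
```

Text that comes before the first heading ("My take on fruits." here) is skipped, because it is followed by a blank line rather than by a single newline.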
This isn't a regular expression, but it works:
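A minimal sketch of a non-regex approach using only `str.split`. It assumes that a blank-line-separated block containing a single newline starts a new section (heading plus first paragraph), and that any block without a newline belongs to the most recent section. The function name `split_sections` is hypothetical, and this is one possible implementation rather than the answer's own code.

```python
text = (
    "\n\nMy take on fruits.\n\nHealthy Fruits\nAn apple is a fruit and it's very good."
    "\n\nPears are good as well. Bananas are very good too and healthy."
    "\n\nSour Fruits\nOranges are on the sour side and contains a lot of vitamin C."
    "\n\nGrapefruits are even more sour, if you can believe it."
)


def split_sections(s):
    """Return a list of (heading, paragraph, paragraph, ...) tuples."""
    sections = []
    for block in s.split("\n\n"):
        if "\n" in block:
            # "Heading\nFirst paragraph" opens a new section.
            heading, first_paragraph = block.split("\n", 1)
            sections.append([heading, first_paragraph])
        elif block and sections:
            # A plain paragraph is attached to the current section.
            sections[-1].append(block)
        # Blocks seen before the first heading (or empty ones) are ignored.
    return [tuple(section) for section in sections]


result = split_sections(text)
for section in result:
    print(section)
```

This version is easier to read than a lookahead-based pattern, at the cost of assuming every paragraph fits on a single line.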