<p>Try the following regular-expression approach:</p>
<pre><code>import re

# inp is assumed to already hold the full input text as a str
blanks = re.findall(r'\bCode\b.*?\bDescription\s*?(\S+)\s+.*?\r?\n(\d+)\s+.*?(?=\r?\n\r?\n)', inp, flags=re.DOTALL)
print(blanks)
reviews = re.findall(r'\bCode\b.*?\bDescription\s*?\S+\s+(.*?)\r?\n\d+\s+(.*?)(?=\r?\n\r?\n)', inp, flags=re.DOTALL)
print(reviews)
</code></pre>
<p>This prints:</p>
<pre><code>[('Blank', '1'), ('Blank1', '11')]
[('Not reviewed, or reviewed and corrected\n', 'Reviewed and confirmed as reported: A patient had behavior \ncode of in situ and laterality is not\nstated as right: origin of primary; left: origin of primary; or only one side \ninvolved, right or left\norigin not specified'), ('Not reviewed\n', 'A patient had laterality \ncoded non-specifically and\nextension coded specifically')]
</code></pre>
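<p>The idea here is to match and capture only the needed parts of each <code>Code ... Description ... Blank</code> section of the input text. Both patterns use <code>re.DOTALL</code> so that <code>.*?</code> can span line breaks, and the trailing lookahead <code>(?=\r?\n\r?\n)</code> stops each match at the blank line that ends a section. Note that this answer assumes you have already read the text into a Python string variable (<code>inp</code> here).</p>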
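<p>For anyone who wants to try the patterns without the original file, here is a minimal self-contained sketch. The <code>sample</code> text is a made-up stand-in for the real input (an assumption, not the asker's data); it only mimics the layout the patterns expect: a <code>Code Description</code> header, a <code>Blank</code> row, a numeric-code row, and a blank line closing each section.</p>

```python
import re

# Hypothetical sample input; the real file layout is assumed to look like this.
sample = (
    "Code Description\n"
    "Blank Not reviewed\n"
    "1 Reviewed and confirmed\n"
    "as reported\n"
    "\n"
    "Code Description\n"
    "Blank1 Not reviewed\n"
    "11 Laterality stated\n"
    "\n"
)

# Capture the two code values (the "Blank" label and the numeric code).
blanks = re.findall(
    r'\bCode\b.*?\bDescription\s*?(\S+)\s+.*?\r?\n(\d+)\s+.*?(?=\r?\n\r?\n)',
    sample, flags=re.DOTALL)

# Capture the two description texts that follow those code values.
reviews = re.findall(
    r'\bCode\b.*?\bDescription\s*?\S+\s+(.*?)\r?\n\d+\s+(.*?)(?=\r?\n\r?\n)',
    sample, flags=re.DOTALL)

print(blanks)   # [('Blank', '1'), ('Blank1', '11')]
print(reviews)  # [('Not reviewed', 'Reviewed and confirmed\nas reported'), ('Not reviewed', 'Laterality stated')]
```

<p>Because of the lookahead, a section is only matched if it is followed by an empty line, so the last section in a file needs a trailing blank line (or the text needs one appended) to be picked up.</p>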