<blockquote>
<p><strong>QUESTION 1</strong>: Is there a python regexp with capture groups that would
let me access the section/sub section names as a capture group?</p>
<blockquote>
<p>a single regexp to match the two - three "groups". May not exist</p>
</blockquote>
</blockquote>
<p>Yes, it can be done. We can break the conditions down into the following tree:</p>
<ul>
<li><kbd>Start of line</kbd><strong>+</strong><kbd>0 to 2 spaces</kbd></li>
<li>Either of two alternatives:
<ol>
<li><code>***</code><strong>+</strong><kbd>any text</kbd><sup>[group 1]</sup></li>
<li><kbd>1+ spaces</kbd><strong>+</strong><code>***</code><strong>+</strong><kbd>any text</kbd><sup>[group 2]</sup></li>
</ol></li>
<li><code>***</code><sup>(optional)</sup><strong>+</strong><kbd>end of line</kbd></li>
</ul>
<p><br/>
The tree above can be expressed with the following pattern:</p>
<pre class="lang-none prettyprint-override"><code>^[ ]{0,2}(?:[*]{3}(.*?)|[ ]+[*]{3}(.*?))(?:[*]{3})?$
</code></pre>
<ul>
<li><a href="https://regex101.com/r/mV0gN4/1" rel="nofollow">regex101 DEMO</a></li>
</ul>
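<p>Notice that <em>Sections</em> and <em>Sub Sections</em> are captured by different groups (<sup>[group 1]</sup> and <sup>[group 2]</sup>). Both use the same syntax <code>.*?</code>, each with a <a href="http://www.regular-expressions.info/repeat.html#lazy" rel="nofollow">lazy quantifier (the extra "?")</a>, so that the optional trailing <code>"***"</code> can still match instead of being swallowed by the capture.</p>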
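<p>To see why the lazy quantifier matters, here is a minimal sketch that runs the pattern above next to a greedy variant. The sample text and the variable names are assumptions made for illustration, and <code>re.MULTILINE</code> is assumed so that <code>^</code> and <code>$</code> apply to each line:</p>

```python
import re

# Pattern from the answer; re.MULTILINE makes ^ and $ match at every line.
PATTERN = re.compile(
    r"^[ ]{0,2}(?:[*]{3}(.*?)|[ ]+[*]{3}(.*?))(?:[*]{3})?$",
    re.MULTILINE,
)

# Same pattern with greedy captures, shown only for comparison.
GREEDY = re.compile(
    r"^[ ]{0,2}(?:[*]{3}(.*)|[ ]+[*]{3}(.*))(?:[*]{3})?$",
    re.MULTILINE,
)

# Hypothetical input: sections at 0-2 spaces, sub sections at 3+ spaces.
SAMPLE = "***Section A***\n   ***Sub A1\nplain text\n***Section B\n"

# Each match yields (group 1, group 2); the group that did not take part is None.
lazy_groups = [m.groups() for m in PATTERN.finditer(SAMPLE)]
greedy_groups = [m.groups() for m in GREEDY.finditer(SAMPLE)]

print(lazy_groups)
print(greedy_groups)
```

<p>With the greedy variant, the trailing <code>***</code> of the first line ends up inside the capture, because <code>.*</code> consumes it before the optional group gets a chance to match.</p>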
<hr/>
<blockquote>
<p><strong>QUESTION 2</strong>: How would the regexp groups allow me to ID section
or sub section (possibly based on the number of /content in a match.group)?</p>
</blockquote>
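<p>The regex above captures <em>Sections</em> only in group 1, while <em>Sub Sections</em> are captured only in group 2. To make them easier to identify in code, I'll use <a href="http://www.regular-expressions.info/named.html" rel="nofollow">named groups</a> and retrieve the captures with <strong><a href="https://docs.python.org/2/library/re.html#re.MatchObject.groupdict" rel="nofollow"><code>groupdict()</code></a></strong>.</p>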
<h3>Code:</h3>
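<p>A minimal sketch of how named groups and <code>groupdict()</code> might be combined, not necessarily the code behind the demo link. The group names <code>Section</code> and <code>SubSection</code> follow the ones used in this answer; the sample text and the <code>classify</code> helper are hypothetical:</p>

```python
import re

# Same pattern as above, with the two captures turned into named groups.
NAMED = re.compile(
    r"^[ ]{0,2}(?:[*]{3}(?P<Section>.*?)|[ ]+[*]{3}(?P<SubSection>.*?))(?:[*]{3})?$",
    re.MULTILINE,
)

# Hypothetical input used only for illustration.
SAMPLE = "***Section A***\n   ***Sub A1\nplain text\n***Section B\n"


def classify(text):
    """Return (kind, name) pairs, where kind is 'Section' or 'SubSection'."""
    results = []
    for match in NAMED.finditer(text):
        # groupdict() maps every group name to its capture,
        # or to None when that alternative did not take part in the match.
        for kind, name in match.groupdict().items():
            if name is not None:
                results.append((kind, name))
    return results


for kind, name in classify(SAMPLE):
    print(kind + ": " + name)
```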
<ul>
<li><a href="http://ideone.com/9fRpY6" rel="nofollow">ideone DEMO</a></li>
</ul>
<p>To reference each <em>Section</em>/<em>Sub Section</em>, you can use any of the following instead of printing the dict:</p>
<pre><code>match.group("Section")
match.group(1)
match.group("SubSection")
match.group(2)
</code></pre>
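<p>As one possible use of these accessors, the sketch below groups each sub section under the section that precedes it. The <code>build_outline</code> helper, the sample text, and the choice to strip surrounding whitespace and to skip sub sections that appear before any section are all assumptions:</p>

```python
import re

NAMED = re.compile(
    r"^[ ]{0,2}(?:[*]{3}(?P<Section>.*?)|[ ]+[*]{3}(?P<SubSection>.*?))(?:[*]{3})?$",
    re.MULTILINE,
)

# Hypothetical input used only for illustration.
SAMPLE = (
    "***Section A***\n"
    "   ***Sub A1\n"
    "   ***Sub A2***\n"
    "plain text\n"
    "***Section B\n"
)


def build_outline(text):
    """Return a list of (section, [sub sections]) tuples in document order."""
    outline = []
    for match in NAMED.finditer(text):
        section = match.group("Section")
        if section is not None:
            # A new section starts its own, initially empty, sub section list.
            outline.append((section.strip(), []))
        elif outline:
            # Exactly one alternative matched, so SubSection is set here.
            outline[-1][1].append(match.group("SubSection").strip())
    return outline


print(build_outline(SAMPLE))
```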