<p>You can use</p>
<pre class="lang-py prettyprint-override"><code>re.findall(r'<([^<>]*)>(\w+)', text)
</code></pre>
<p>See the <a href="https://regex101.com/r/hmEB5t/1" rel="nofollow noreferrer">regex demo</a>. <em>Details</em>:</p>
<ul>
<li><code><([^<>]*)></code> - a <code><</code>, then zero or more characters other than <code><</code> and <code>></code> captured into Group 1, then a <code>></code></li>
<li><code>(\w+)</code> - Group 2: one or more word characters</li>
</ul>
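<p>As a quick illustration of the two groups, here is a minimal sketch on a short made-up string (the sample text and the <code>pairs</code> name are illustrative assumptions, not taken from the question). Since Group 1 cannot contain <code><</code> or <code>></code>, only the innermost bracketed part of a nested item is matched:</p>

```python
import re

# Hypothetical sample: one nested item and one flat item.
text = "<<a-b>T6-c>K1 <d-e>Bs6"

# With two capturing groups, findall returns a list of (group 1, group 2) tuples.
pairs = re.findall(r'<([^<>]*)>(\w+)', text)

# The outer "<...-c>K1" is skipped because its content contains "<" and ">".
print(pairs)  # [('a-b', 'T6'), ('d-e', 'Bs6')]
```

<p>If the positions of the matches are needed as well, <code>re.finditer</code> can be used instead of <code>re.findall</code>.</p>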
<p>See the <a href="https://ideone.com/uotH7j" rel="nofollow noreferrer">Python demo</a>:</p>
<pre class="lang-py prettyprint-override"><code>import re
text = "तत् इदम् <गीता-शास्त्रम्>K7 <<<<<समस्त-वेद>K1-अर्थ>T6-सार>T6-संग्रह>T6-भूतम्>T2 <दुर्विज्ञेय-अर्थम्>K1 <<तत्-अर्थ>T6-आविष्करणाय>T6 अनेकैः <विवृत-<<<पद-<पद-अर्थ>T6-<वाक्य-अर्थ>T6>Di-न्यायम्>T6>Bs6 अपि <<अत्यन्त-विरुद्ध>K1-<अनेक-अर्थ>K1>K1 त्वेन लौकिकैः गृह्यमाणम् उपलभ्य अहम् विवेकतः <<अर्थ-निर्धारण>T6-अर्थम्>T4 संक्षेपतः विवरणम् करिष्यामि\n<अभ्युदय-अर्थः>T4 अपि यः <प्रवृत्ति-लक्षणः>Bs6 धर्मः वर्णान् आश्रमान् च उद्दिश्य विहितः सः <<<<देव-आदि>Bs6-स्थान>T6-प्राप्ति>T6-हेतुः>T6 अपि सन् <<ईश्वर-अर्पण>T6-बुद्ध्या>T6 अनुष्ठीयमानः <सत्त्व-शुद्धये>T6 भवति <<फल-अभिसन्धि>T6-वर्जितः>T3"
matches = list(re.finditer(r'<([^<>]*)>(\w+)', text))
# Show overall matches and their positions:
for m in matches:
    print("Match: ", m.group(), ", Start position: ", m.start(), sep="")
print(" -")
# Show groups and their positions:
for m in matches:
    print("Word: ", m.group(1), ", Word start position: ", m.start(1),
          ", Tag: ", m.group(2), ", Tag start position: ", m.start(2), sep="")
</code></pre>
<p>Output:</p>
<pre class="lang-py prettyprint-override"><code>Match: <गीता-शास्त्रम्>K7, Start position: 9
Match: <समस्त-वेद>K1, Start position: 32
Match: <दुर्विज्ञेय-अर्थम्>K1, Start position: 80
Match: <तत्-अर्थ>T6, Start position: 105
Match: <पद-अर्थ>T6, Start position: 152
Match: <वाक्य-अर्थ>T6, Start position: 164
Match: <अत्यन्त-विरुद्ध>K1, Start position: 202
Match: <अनेक-अर्थ>K1, Start position: 222
Match: <अर्थ-निर्धारण>T6, Start position: 285
Match: <अभ्युदय-अर्थः>T4, Start position: 341
Match: <प्रवृत्ति-लक्षणः>Bs6, Start position: 366
Match: <देव-आदि>Bs6, Start position: 436
Match: <ईश्वर-अर्पण>T6, Start position: 489
Match: <सत्त्व-शुद्धये>T6, Start position: 530
Match: <फल-अभिसन्धि>T6, Start position: 555
-
Word: गीता-शास्त्रम्, Word start position: 10, Tag: K7, Tag start position: 25
Word: समस्त-वेद, Word start position: 33, Tag: K1, Tag start position: 43
Word: दुर्विज्ञेय-अर्थम्, Word start position: 81, Tag: K1, Tag start position: 100
Word: तत्-अर्थ, Word start position: 106, Tag: T6, Tag start position: 115
Word: पद-अर्थ, Word start position: 153, Tag: T6, Tag start position: 161
Word: वाक्य-अर्थ, Word start position: 165, Tag: T6, Tag start position: 176
Word: अत्यन्त-विरुद्ध, Word start position: 203, Tag: K1, Tag start position: 219
Word: अनेक-अर्थ, Word start position: 223, Tag: K1, Tag start position: 233
Word: अर्थ-निर्धारण, Word start position: 286, Tag: T6, Tag start position: 300
Word: अभ्युदय-अर्थः, Word start position: 342, Tag: T4, Tag start position: 356
Word: प्रवृत्ति-लक्षणः, Word start position: 367, Tag: Bs6, Tag start position: 384
Word: देव-आदि, Word start position: 437, Tag: Bs6, Tag start position: 445
Word: ईश्वर-अर्पण, Word start position: 490, Tag: T6, Tag start position: 502
Word: सत्त्व-शुद्धये, Word start position: 531, Tag: T6, Tag start position: 546
Word: फल-अभिसन्धि, Word start position: 556, Tag: T6, Tag start position: 568
</code></pre>
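<p>If the words, tags, and positions need to be processed further rather than printed, one option is to collect them into records. The sketch below assumes a hypothetical helper named <code>extract_tagged</code> and made-up dictionary keys; it uses the same pattern and the first tagged item of the sample text:</p>

```python
import re

# Same pattern as above, compiled once for reuse.
PATTERN = re.compile(r'<([^<>]*)>(\w+)')

def extract_tagged(text):
    """Hypothetical helper: one dict per match with the word, tag, and start offsets."""
    return [
        {
            "word": m.group(1),
            "word_start": m.start(1),
            "tag": m.group(2),
            "tag_start": m.start(2),
        }
        for m in PATTERN.finditer(text)
    ]

# Beginning of the sample text from the demo.
sample = "तत् इदम् <गीता-शास्त्रम्>K7"
records = extract_tagged(sample)
for record in records:
    print(record)
```

<p>The offsets are the same values that <code>m.start(1)</code> and <code>m.start(2)</code> report in the demo: indices into the Python string, counted in code points.</p>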