<p>You can do this with regular expressions, using Python's <code>re</code> module. To replace only the text inside <code>h1</code> tags, combine a <code>positive lookbehind</code> with a <code>positive lookahead</code>.</p>
<p><strong>Code:</strong></p>
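<pre class="lang-py prettyprint-override"><code>import re

# Read the whole HTML file into a single string
with open("path/to/home.html") as file:
    text = file.read()

# Replace the two-word text between <h1> and </h1>, leaving the tags in place
text = re.sub(r"(?<=<h1>)\w+ \w+(?=</h1>)", "Diluizione seriale", text)
print(text)
</code></pre>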
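<p><strong>Explanation:</strong></p>
<p>The regular expression <code>(?<=<h1>)\w+ \w+(?=</h1>)</code> matches two words (runs of word characters) separated by a single space, but only where they are immediately preceded by <code><h1></code> and immediately followed by <code></h1></code>. The lookbehind and lookahead are zero-width, so the tags themselves are not part of the match and survive the replacement unchanged.</p>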
<p><strong>Output:</strong></p>
<pre><code><! SOME CONTENT... >
<h1>Diluizione seriale</h1>
<p>Some content including "prima diluizione"...</p>
<h1>Diluizione seriale</h1>
<p>Some content including "seconda diluizione"...</p>
<h1>Diluizione seriale</h1>
<p>Some content including "terza diluizione"...</p>
</code></pre>
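<p>To try the lookaround pattern without a file on disk, here is a minimal sketch that runs the same substitution on an in-memory string. The <code>sample</code> markup and the helper name <code>replace_h1_text</code> are hypothetical and only serve the illustration; the regular expression is the one used above.</p>

```python
import re

# Hypothetical sample markup standing in for the contents of home.html
sample = (
    "<h1>Prima diluizione</h1>\n"
    "<p>Some content including \"prima diluizione\"...</p>\n"
    "<h1>Seconda diluizione</h1>\n"
)


def replace_h1_text(html, replacement):
    """Replace two-word text directly inside <h1>...</h1>, keeping the tags."""
    # The lookbehind and lookahead assert the surrounding tags without consuming them
    pattern = r"(?<=<h1>)\w+ \w+(?=</h1>)"
    return re.sub(pattern, replacement, html)


result = replace_h1_text(sample, "Diluizione seriale")
print(result)
```

<p>Note that this pattern only matches headings made of exactly two words separated by one space, inside a bare <code><h1></code> tag with no attributes. The same words inside the <code><p></code> element are left alone because they are not preceded by <code><h1></code>.</p>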