Extracting only specific text with BeautifulSoup

Posted 2024-09-27 00:22:11


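I have a large HTML file of lecture notes that I want to split into its separate definitions, theorems, and so on. I have managed that part, but when I call .get_text() on a piece, every formula comes out twice: once as Unicode and once as LaTeX source. Is there an (elegant) way to separate the two?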

Example:

Original HTML of a definition:

'''

<div class="ltx_theorem ltx_theorem_definition" id="Ch1.S1.Thmtheorem1">
<h6 class="ltx_title ltx_runin ltx_title_theorem">
<span class="ltx_tag ltx_tag_theorem"><span class="ltx_text ltx_font_bold">Definition 1.1.1</span></span> (Groups: First definition).</h6>
<div class="ltx_para" id="Ch1.S1.Thmtheorem1.p1">
<p class="ltx_p">A <span class="ltx_text ltx_font_bold">group</span> <mjx-container class="MathJax CtxtMenu_Attached_0" ctxtmenu_counter="8" jax="CHTML" role="presentation" style="font-size: 101.1%; position: relative;" tabindex="0"><mjx-math aria-hidden="true" class="ltx_Math MJX-TEX" id="Ch1.S1.Thmtheorem1.p1.m1"><mjx-semantics><mjx-mrow><mjx-mo class="mjx-n"><mjx-c class="mjx-c28"></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c class="mjx-c1D43A TEX-I"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c class="mjx-c2C"></mjx-c></mjx-mo><mjx-mo class="mjx-n" space="2"><mjx-c class="mjx-c2218"></mjx-c></mjx-mo><mjx-mo class="mjx-n"><mjx-c class="mjx-c29"></mjx-c></mjx-mo></mjx-mrow></mjx-semantics></mjx-math><mjx-assistive-mml display="inline" role="presentation" unselectable="on"><math alttext="(G,\circ)" class="ltx_Math" display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false">(</mo><mi>G</mi><mo>,</mo><mo>∘</mo><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">(G,\circ)</annotation></semantics></math></mjx-assistive-mml></mjx-container> is a non-empty set <mjx-container class="MathJax CtxtMenu_Attached_0" ctxtmenu_counter="9" jax="CHTML" role="presentation" style="font-size: 101.1%; position: relative;" tabindex="0"><mjx-math aria-hidden="true" class="ltx_Math MJX-TEX" id="Ch1.S1.Thmtheorem1.p1.m2"><mjx-semantics><mjx-mi class="mjx-i"><mjx-c class="mjx-c1D43A TEX-I"></mjx-c></mjx-mi></mjx-semantics></mjx-math><mjx-assistive-mml display="inline" role="presentation" unselectable="on"><math alttext="G" class="ltx_Math" display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>G</mi><annotation encoding="application/x-tex">G</annotation></semantics></math></mjx-assistive-mml></mjx-container> together with a binary operation <mjx-container class="MathJax CtxtMenu_Attached_0" ctxtmenu_counter="10" jax="CHTML" role="presentation" style="font-size: 101.1%; position: relative;" 
tabindex="0"><mjx-math aria-hidden="true" class="ltx_Math MJX-TEX" id="Ch1.S1.Thmtheorem1.p1.m3"><mjx-semantics><mjx-mo class="mjx-n"><mjx-c class="mjx-c2218"></mjx-c></mjx-mo></mjx-semantics></mjx-math><mjx-assistive-mml display="inline" role="presentation" unselectable="on"><math alttext="\circ" class="ltx_Math" display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mo>∘</mo><annotation encoding="application/x-tex">\circ</annotation></semantics></math></mjx-assistive-mml></mjx-container> – called the <span class="ltx_text ltx_font_bold">“group law”</span> – satisfying:</p>
<ol class="ltx_enumerate" id="Ch1.S1.I1">
<li class="ltx_item" id="Ch1.S1.I1.i1">
<div class="ltx_para" id="Ch1.S1.I1.i1.p1">
<p class="ltx_p"><span class="ltx_text ltx_font_bold">Closure</span>: <mjx-container class="MathJax CtxtMenu_Attached_0" ctxtmenu_counter="11" jax="CHTML" role="presentation" style="font-size: 101.1%; position: relative;" tabindex="0"><mjx-math aria-hidden="true" class="ltx_Math MJX-TEX" id="Ch1.S1.I1.i1.p1.m1"><mjx-semantics><mjx-mrow><mjx-mrow><mjx-mrow><mjx-mo class="mjx-n"><mjx-c class="mjx-c2200"></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c class="mjx-c1D465 TEX-I"></mjx-c></mjx-mi></mjx-mrow><mjx-mo class="mjx-n"><mjx-c class="mjx-c2C"></mjx-c></mjx-mo><mjx-mi class="mjx-i" space="2"><mjx-c class="mjx-c1D466 TEX-I"></mjx-c></mjx-mi></mjx-mrow><mjx-mo class="mjx-n" space="4"><mjx-c class="mjx-c2208"></mjx-c></mjx-mo><mjx-mi class="mjx-i" space="4"><mjx-c class="mjx-c1D43A TEX-I"></mjx-c></mjx-mi></mjx-mrow></mjx-semantics></mjx-math><mjx-assistive-mml display="inline" role="presentation" unselectable="on"><math alttext="\forall x,y\in G" class="ltx_Math" display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mrow><mo>∀</mo><mi>x</mi></mrow><mo>,</mo><mi>y</mi></mrow><mo>∈</mo><mi>G</mi></mrow><annotation encoding="application/x-tex">\forall x,y\in G</annotation></semantics></math></mjx-assistive-mml></mjx-container>: <mjx-container class="MathJax CtxtMenu_Attached_0" ctxtmenu_counter="12" jax="CHTML" role="presentation" style="font-size: 101.1%; position: relative;" tabindex="0"><mjx-math aria-hidden="true" class="ltx_Math MJX-TEX" id="Ch1.S1.I1.i1.p1.m2"><mjx-semantics><mjx-mrow><mjx-mrow><mjx-mi class="mjx-i"><mjx-c class="mjx-c1D465 TEX-I"></mjx-c></mjx-mi><mjx-mo class="mjx-n" space="3"><mjx-c class="mjx-c2218"></mjx-c></mjx-mo><mjx-mi class="mjx-i" space="3"><mjx-c class="mjx-c1D466 TEX-I"></mjx-c></mjx-mi></mjx-mrow><mjx-mo class="mjx-n" space="4"><mjx-c class="mjx-c2208"></mjx-c></mjx-mo><mjx-mi class="mjx-i" space="4"><mjx-c class="mjx-c1D43A TEX-I"></mjx-c></mjx-mi></mjx-mrow></mjx-semantics></mjx-math><mjx-assistive-mml 
display="inline" role="presentation" unselectable="on"><math alttext="x\circ y\in G" class="ltx_Math" display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mi>x</mi><mo>∘</mo><mi>y</mi></mrow><mo>∈</mo><mi>G</mi></mrow><annotation encoding="application/x-tex">x\circ y\in G</annotation></semantics></math></mjx-assistive-mml></mjx-container>.</p>
</div>
</li>
<li class="ltx_item" id="Ch1.S1.I1.i2">
<div class="ltx_para" id="Ch1.S1.I1.i2.p1">
<p class="ltx_p"><span class="ltx_text ltx_font_bold">Associativity</span>: <mjx-container class="MathJax CtxtMenu_Attached_0" ctxtmenu_counter="13" jax="CHTML" role="presentation" style="font-size: 101.1%; position: relative;" tabindex="0"><mjx-math aria-hidden="true" class="ltx_Math MJX-TEX" id="Ch1.S1.I1.i2.p1.m1"><mjx-semantics><mjx-mrow><mjx-mrow><mjx-mrow><mjx-mo class="mjx-n"><mjx-c class="mjx-c2200"></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c class="mjx-c1D465 TEX-I"></mjx-c></mjx-mi></mjx-mrow><mjx-mo class="mjx-n"><mjx-c class="mjx-c2C"></mjx-c></mjx-mo><mjx-mi class="mjx-i" space="2"><mjx-c class="mjx-c1D466 TEX-I"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c class="mjx-c2C"></mjx-c></mjx-mo><mjx-mi class="mjx-i" space="2"><mjx-c class="mjx-c1D467 TEX-I"></mjx-c></mjx-mi></mjx-mrow><mjx-mo class="mjx-n" space="4"><mjx-c class="mjx-c2208"></mjx-c></mjx-mo><mjx-mi class="mjx-i" space="4"><mjx-c class="mjx-c1D43A TEX-I"></mjx-c></mjx-mi></mjx-mrow></mjx-semantics></mjx-math><mjx-assistive-mml display="inline" role="presentation" unselectable="on"><math alttext="\forall x,y,z\in G" class="ltx_Math" display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mrow><mo>∀</mo><mi>x</mi></mrow><mo>,</mo><mi>y</mi><mo>,</mo><mi>z</mi></mrow><mo>∈</mo><mi>G</mi></mrow><annotation encoding="application/x-tex">\forall x,y,z\in G</annotation></semantics></math></mjx-assistive-mml></mjx-container>: <mjx-container class="MathJax CtxtMenu_Attached_0" ctxtmenu_counter="14" jax="CHTML" role="presentation" style="font-size: 101.1%; position: relative;" tabindex="0"><mjx-math aria-hidden="true" class="ltx_Math MJX-TEX" id="Ch1.S1.I1.i2.p1.m2"><mjx-semantics><mjx-mrow><mjx-mrow><mjx-mrow><mjx-mo class="mjx-n"><mjx-c class="mjx-c28"></mjx-c></mjx-mo><mjx-mrow><mjx-mi class="mjx-i"><mjx-c class="mjx-c1D465 TEX-I"></mjx-c></mjx-mi><mjx-mo class="mjx-n" space="3"><mjx-c class="mjx-c2218"></mjx-c></mjx-mo><mjx-mi class="mjx-i" space="3"><mjx-c 
class="mjx-c1D466 TEX-I"></mjx-c></mjx-mi></mjx-mrow><mjx-mo class="mjx-n"><mjx-c class="mjx-c29"></mjx-c></mjx-mo></mjx-mrow><mjx-mo class="mjx-n" space="3"><mjx-c class="mjx-c2218"></mjx-c></mjx-mo><mjx-mi class="mjx-i" space="3"><mjx-c class="mjx-c1D467 TEX-I"></mjx-c></mjx-mi></mjx-mrow><mjx-mo class="mjx-n" space="4"><mjx-c class="mjx-c3D"></mjx-c></mjx-mo><mjx-mrow space="4"><mjx-mi class="mjx-i"><mjx-c class="mjx-c1D465 TEX-I"></mjx-c></mjx-mi><mjx-mo class="mjx-n" space="3"><mjx-c class="mjx-c2218"></mjx-c></mjx-mo><mjx-mrow space="3"><mjx-mo class="mjx-n"><mjx-c class="mjx-c28"></mjx-c></mjx-mo><mjx-mrow><mjx-mi class="mjx-i"><mjx-c class="mjx-c1D466 TEX-I"></mjx-c></mjx-mi><mjx-mo class="mjx-n" space="3"><mjx-c class="mjx-c2218"></mjx-c></mjx-mo><mjx-mi class="mjx-i" space="3"><mjx-c class="mjx-c1D467 TEX-I"></mjx-c></mjx-mi></mjx-mrow><mjx-mo class="mjx-n"><mjx-c class="mjx-c29"></mjx-c></mjx-mo></mjx-mrow></mjx-mrow></mjx-mrow></mjx-semantics></mjx-math><mjx-assistive-mml display="inline" role="presentation" unselectable="on"><math alttext="(x\circ y)\circ z=x\circ(y\circ z)" class="ltx_Math" display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mrow><mo stretchy="false">(</mo><mrow><mi>x</mi><mo>∘</mo><mi>y</mi></mrow><mo stretchy="false">)</mo></mrow><mo>∘</mo><mi>z</mi></mrow><mo>=</mo><mrow><mi>x</mi><mo>∘</mo><mrow><mo stretchy="false">(</mo><mrow><mi>y</mi><mo>∘</mo><mi>z</mi></mrow><mo stretchy="false">)</mo></mrow></mrow></mrow><annotation encoding="application/x-tex">(x\circ y)\circ z=x\circ(y\circ z)</annotation></semantics></math></mjx-assistive-mml></mjx-container>.</p>
</div>
</li>
<li class="ltx_item" id="Ch1.S1.I1.i3">
<div class="ltx_para" id="Ch1.S1.I1.i3.p1">
<p class="ltx_p"><span class="ltx_text ltx_font_bold">Existence of identity element</span>: <mjx-container class="MathJax CtxtMenu_Attached_0" ctxtmenu_counter="15" jax="CHTML" role="presentation" style="font-size: 101.3%; position: relative;" tabindex="0"><mjx-math aria-hidden="true" class="ltx_Math MJX-TEX" id="Ch1.S1.I1.i3.p1.m1"><mjx-semantics><mjx-mrow><mjx-mrow><mjx-mo class="mjx-n"><mjx-c class="mjx-c2203"></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c class="mjx-c1D452 TEX-I"></mjx-c></mjx-mi></mjx-mrow><mjx-mo class="mjx-n" space="4"><mjx-c class="mjx-c2208"></mjx-c></mjx-mo><mjx-mi class="mjx-i" space="4"><mjx-c class="mjx-c1D43A TEX-I"></mjx-c></mjx-mi></mjx-mrow></mjx-semantics></mjx-math><mjx-assistive-mml display="inline" role="presentation" unselectable="on"><math alttext="\exists e\in G" class="ltx_Math" display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mo>∃</mo><mi>e</mi></mrow><mo>∈</mo><mi>G</mi></mrow><annotation encoding="application/x-tex">\exists e\in G</annotation></semantics></math></mjx-assistive-mml></mjx-container>, <mjx-container class="MathJax CtxtMenu_Attached_0" ctxtmenu_counter="16" jax="CHTML" role="presentation" style="font-size: 101.3%; position: relative;" tabindex="0"><mjx-math aria-hidden="true" class="ltx_Math MJX-TEX" id="Ch1.S1.I1.i3.p1.m2"><mjx-semantics><mjx-mrow><mjx-mrow><mjx-mo class="mjx-n"><mjx-c class="mjx-c2200"></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c class="mjx-c1D465 TEX-I"></mjx-c></mjx-mi></mjx-mrow><mjx-mo class="mjx-n" space="4"><mjx-c class="mjx-c2208"></mjx-c></mjx-mo><mjx-mi class="mjx-i" space="4"><mjx-c class="mjx-c1D43A TEX-I"></mjx-c></mjx-mi></mjx-mrow></mjx-semantics></mjx-math><mjx-assistive-mml display="inline" role="presentation" unselectable="on"><math alttext="\forall x\in G" class="ltx_Math" display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mo>∀</mo><mi>x</mi></mrow><mo>∈</mo><mi>G</mi></mrow><annotation 
encoding="application/x-tex">\forall x\in G</annotation></semantics></math></mjx-assistive-mml></mjx-container>: <mjx-container class="MathJax CtxtMenu_Attached_0" ctxtmenu_counter="17" jax="CHTML" role="presentation" style="font-size: 101.3%; position: relative;" tabindex="0"><mjx-math aria-hidden="true" class="ltx_Math MJX-TEX" id="Ch1.S1.I1.i3.p1.m3"><mjx-semantics><mjx-mrow><mjx-mrow><mjx-mi class="mjx-i"><mjx-c class="mjx-c1D452 TEX-I"></mjx-c></mjx-mi><mjx-mo class="mjx-n" space="3"><mjx-c class="mjx-c2218"></mjx-c></mjx-mo><mjx-mi class="mjx-i" space="3"><mjx-c class="mjx-c1D465 TEX-I"></mjx-c></mjx-mi></mjx-mrow><mjx-mo class="mjx-n" space="4"><mjx-c class="mjx-c3D"></mjx-c></mjx-mo><mjx-mi class="mjx-i" space="4"><mjx-c class="mjx-c1D465 TEX-I"></mjx-c></mjx-mi><mjx-mo class="mjx-n" space="4"><mjx-c class="mjx-c3D"></mjx-c></mjx-mo><mjx-mrow space="4"><mjx-mi class="mjx-i"><mjx-c class="mjx-c1D465 TEX-I"></mjx-c></mjx-mi><mjx-mo class="mjx-n" space="3"><mjx-c class="mjx-c2218"></mjx-c></mjx-mo><mjx-mi class="mjx-i" space="3"><mjx-c class="mjx-c1D452 TEX-I"></mjx-c></mjx-mi></mjx-mrow></mjx-mrow></mjx-semantics></mjx-math><mjx-assistive-mml display="inline" role="presentation" unselectable="on"><math alttext="e\circ x=x=x\circ e" class="ltx_Math" display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mi>e</mi><mo>∘</mo><mi>x</mi></mrow><mo>=</mo><mi>x</mi><mo>=</mo><mrow><mi>x</mi><mo>∘</mo><mi>e</mi></mrow></mrow><annotation encoding="application/x-tex">e\circ x=x=x\circ e</annotation></semantics></math></mjx-assistive-mml></mjx-container>.</p>
</div>
</li>
<li class="ltx_item" id="Ch1.S1.I1.i4">
<div class="ltx_para" id="Ch1.S1.I1.i4.p1">
<p class="ltx_p"><span class="ltx_text ltx_font_bold">Existence of inverses</span>: <mjx-container class="MathJax CtxtMenu_Attached_0" ctxtmenu_counter="18" jax="CHTML" role="presentation" style="font-size: 101.1%; position: relative;" tabindex="0"><mjx-math aria-hidden="true" class="ltx_Math MJX-TEX" id="Ch1.S1.I1.i4.p1.m1"><mjx-semantics><mjx-mrow><mjx-mrow><mjx-mo class="mjx-n"><mjx-c class="mjx-c2200"></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c class="mjx-c1D465 TEX-I"></mjx-c></mjx-mi></mjx-mrow><mjx-mo class="mjx-n" space="4"><mjx-c class="mjx-c2208"></mjx-c></mjx-mo><mjx-mi class="mjx-i" space="4"><mjx-c class="mjx-c1D43A TEX-I"></mjx-c></mjx-mi></mjx-mrow></mjx-semantics></mjx-math><mjx-assistive-mml display="inline" role="presentation" unselectable="on"><math alttext="\forall x\in G" class="ltx_Math" display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mo>∀</mo><mi>x</mi></mrow><mo>∈</mo><mi>G</mi></mrow><annotation encoding="application/x-tex">\forall x\in G</annotation></semantics></math></mjx-assistive-mml></mjx-container>, <mjx-container class="MathJax CtxtMenu_Attached_0" ctxtmenu_counter="19" jax="CHTML" role="presentation" style="font-size: 101.1%; position: relative;" tabindex="0"><mjx-math aria-hidden="true" class="ltx_Math MJX-TEX" id="Ch1.S1.I1.i4.p1.m2"><mjx-semantics><mjx-mrow><mjx-mrow><mjx-mo class="mjx-n"><mjx-c class="mjx-c2203"></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c class="mjx-c1D466 TEX-I"></mjx-c></mjx-mi></mjx-mrow><mjx-mo class="mjx-n" space="4"><mjx-c class="mjx-c2208"></mjx-c></mjx-mo><mjx-mi class="mjx-i" space="4"><mjx-c class="mjx-c1D43A TEX-I"></mjx-c></mjx-mi></mjx-mrow></mjx-semantics></mjx-math><mjx-assistive-mml display="inline" role="presentation" unselectable="on"><math alttext="\exists y\in G" class="ltx_Math" display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mo>∃</mo><mi>y</mi></mrow><mo>∈</mo><mi>G</mi></mrow><annotation 
encoding="application/x-tex">\exists y\in G</annotation></semantics></math></mjx-assistive-mml></mjx-container>: <mjx-container class="MathJax CtxtMenu_Attached_0" ctxtmenu_counter="20" jax="CHTML" role="presentation" style="font-size: 101.1%; position: relative;" tabindex="0"><mjx-math aria-hidden="true" class="ltx_Math MJX-TEX" id="Ch1.S1.I1.i4.p1.m3"><mjx-semantics><mjx-mrow><mjx-mrow><mjx-mi class="mjx-i"><mjx-c class="mjx-c1D465 TEX-I"></mjx-c></mjx-mi><mjx-mo class="mjx-n" space="3"><mjx-c class="mjx-c2218"></mjx-c></mjx-mo><mjx-mi class="mjx-i" space="3"><mjx-c class="mjx-c1D466 TEX-I"></mjx-c></mjx-mi></mjx-mrow><mjx-mo class="mjx-n" space="4"><mjx-c class="mjx-c3D"></mjx-c></mjx-mo><mjx-mi class="mjx-i" space="4"><mjx-c class="mjx-c1D452 TEX-I"></mjx-c></mjx-mi><mjx-mo class="mjx-n" space="4"><mjx-c class="mjx-c3D"></mjx-c></mjx-mo><mjx-mrow space="4"><mjx-mi class="mjx-i"><mjx-c class="mjx-c1D466 TEX-I"></mjx-c></mjx-mi><mjx-mo class="mjx-n" space="3"><mjx-c class="mjx-c2218"></mjx-c></mjx-mo><mjx-mi class="mjx-i" space="3"><mjx-c class="mjx-c1D465 TEX-I"></mjx-c></mjx-mi></mjx-mrow></mjx-mrow></mjx-semantics></mjx-math><mjx-assistive-mml display="inline" role="presentation" unselectable="on"><math alttext="x\circ y=e=y\circ x" class="ltx_Math" display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mi>x</mi><mo>∘</mo><mi>y</mi></mrow><mo>=</mo><mi>e</mi><mo>=</mo><mrow><mi>y</mi><mo>∘</mo><mi>x</mi></mrow></mrow><annotation encoding="application/x-tex">x\circ y=e=y\circ x</annotation></semantics></math></mjx-assistive-mml></mjx-container>.
</p>
</div>
</li>
</ol>
</div>
</div>

'''

Running .get_text() on it then gives:

Definition 1.1.1 (Groups: First definition).

A group (G,∘)(G,\circ) is a non-empty set GG together with a binary operation ∘\circ – called the “group law” – satisfying:

Closure: ∀x,y∈G\forall x,y\in G: x∘y∈Gx\circ y\in G.

Associativity: ∀x,y,z∈G\forall x,y,z\in G: (x∘y)∘z=x∘(y∘z)(x\circ y)\circ z=x\circ(y\circ z).

Existence of identity element: ∃e∈G\exists e\in G, ∀x∈G\forall x\in G: e∘x=x=x∘ee\circ x=x=x\circ e.

Existence of inverses: ∀x∈G\forall x\in G, ∃y∈G\exists y\in G: x∘y=e=y∘xx\circ y=e=y\circ x.

Desired output:

Definition 1.1.1 (Groups: First definition).

A group (G,\circ) is a non-empty set G together with a binary operation \circ – called the “group law” – satisfying:

Closure: \forall x,y\in G: x\circ y\in G.

Associativity: \forall x,y,z\in G: (x\circ y)\circ z=x\circ(y\circ z).

Existence of identity element: \exists e\in G, \forall x\in G: e\circ x=x=x\circ e.

Existence of inverses: \forall x\in G, \exists y\in G: x\circ y=e=y\circ x.

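At first I saved the text to a file and edited it by hand. I am looking for a more elegant solution; this is only my second time using the bs4 module, and I do not know what to look for in the documentation.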


1 answer
User
#1 · Posted 2024-09-27 00:22:11

My approach (using nothing beyond bs4 and the standard re module) would go along these lines:

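import re

import bs4

# Raw string, so TeX such as \forall is not read as an escape sequence (\f)
html = r'''
#Insert the long HTML provided above
'''

# Parse the HTML with Python's built-in parser
soup = bs4.BeautifulSoup(html, 'html.parser')

# Print each paragraph's text with runs of spaces and newlines collapsed
els = soup.find_all('p')
for el in els:
    string = re.sub('[ \n]+', ' ', el.text).strip()
    print(string)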

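Basically, you find all the paragraph elements (which hold the corresponding text), iterate over them, strip whatever formatting you want removed from each one (using regular expressions), and finally print the resulting string.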
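
Note that el.text in the loop above still returns each formula twice, because the MathML inside every mjx-container holds both the Unicode symbols and a TeX annotation. One possible way to keep only the TeX, sketched here under the assumption that every formula sits in an mjx-container with an annotation whose encoding is application/x-tex (as in the HTML from the question), is to replace each container with its annotation text before extracting. The names replace_math_with_tex, clean_text and sample are made up for this illustration, and sample is a shortened stand-in for the real file:

```python
import re

from bs4 import BeautifulSoup

# Shortened stand-in for one paragraph of the lecture notes (assumption:
# the real file uses the same mjx-container / annotation layout).
sample = r'''
<p class="ltx_p"><span class="ltx_text ltx_font_bold">Closure</span>:
<mjx-container class="MathJax" jax="CHTML">
  <mjx-math aria-hidden="true"><mjx-mo class="mjx-n"><mjx-c class="mjx-c2200"></mjx-c></mjx-mo></mjx-math>
  <mjx-assistive-mml><math alttext="\forall x,y\in G"><semantics>
    <mrow><mo>∀</mo><mi>x</mi><mo>,</mo><mi>y</mi><mo>∈</mo><mi>G</mi></mrow>
    <annotation encoding="application/x-tex">\forall x,y\in G</annotation>
  </semantics></math></mjx-assistive-mml>
</mjx-container>:
<mjx-container class="MathJax" jax="CHTML">
  <mjx-assistive-mml><math alttext="x\circ y\in G"><semantics>
    <mrow><mi>x</mi><mo>∘</mo><mi>y</mi><mo>∈</mo><mi>G</mi></mrow>
    <annotation encoding="application/x-tex">x\circ y\in G</annotation>
  </semantics></math></mjx-assistive-mml>
</mjx-container>.</p>
'''


def replace_math_with_tex(soup):
    """Swap every mjx-container for the TeX source stored in its annotation."""
    for container in soup.find_all("mjx-container"):
        annotation = container.find(
            "annotation", attrs={"encoding": "application/x-tex"}
        )
        # Containers without a TeX annotation are left untouched.
        if annotation is not None:
            container.replace_with(annotation.get_text())
    return soup


def clean_text(tag):
    """Return the tag's text with all whitespace runs collapsed to one space."""
    return re.sub(r"\s+", " ", tag.get_text()).strip()


soup = replace_math_with_tex(BeautifulSoup(sample, "html.parser"))
lines = [clean_text(p) for p in soup.find_all("p")]
for line in lines:
    print(line)
# Closure: \forall x,y\in G: x\circ y\in G.
```

The same clean_text call can be applied to the h6 title of each ltx_theorem block to recover the "Definition 1.1.1 ..." heading. If the Unicode form is wanted instead, the annotation elements could be removed (for example with decompose()) rather than substituted for the container.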
