BeautifulSoup with multiple tags, each tag having a specific class


1 Answer

Anonymous, 1 day ago

<p>You can pass <code>re.compile</code> objects to <code>soup.find_all</code>:</p>
<pre><code>import re
from bs4 import BeautifulSoup as soup

html = """
<table>
<tr style='width:40%'>
<td style='align:top'></td>
</tr>
</table>
"""
# Match td or tr tags whose style attribute contains either target value.
results = soup(html, 'html.parser').find_all(
    re.compile('td|tr'),
    {'style': re.compile('width:40%|align:top')},
)
</code></pre>
<p>Output:</p>
<pre><code>[<tr style="width:40%">
<td style="align:top"></td>
</tr>, <td style="align:top"></td>]
</code></pre>
<p>By supplying <code>re.compile</code> objects for the tag name and the <code>style</code> value, <code>find_all</code> returns every <code>tr</code> or <code>td</code> tag whose inline <code>style</code> attribute contains <code>width:40%</code> or <code>align:top</code>. The two patterns are checked independently, so a <code>td</code> styled with <code>width:40%</code> would match as well.</p>
<p>This approach extends to matching on several attributes at once:</p>
<pre><code>html = """
<table>
<tr style='width:40%'>
<td style='align:top' class='get_this'></td>
<td style='align:top' class='ignore_this'></td>
</tr>
</table>
"""
# The element must satisfy both the style pattern and the class value.
results = soup(html, 'html.parser').find_all(
    re.compile('td|tr'),
    {'style': re.compile('width:40%|align:top'), 'class': 'get_this'},
)
</code></pre>
<p>Output:</p>
<pre><code>[<td class="get_this" style="align:top"></td>]
</code></pre>
<p>Edit 2: a simple recursive solution:</p>
<pre><code>import bs4
from bs4 import BeautifulSoup as soup

def get_tags(d, params):
    # params maps a tag name to the attributes wanted for that tag.
    # 'class' is stored as a list, so test membership; compare other attributes directly.
    if any((lambda x: b in x if a == 'class' else b == x)(d.attrs.get(a, []))
           for a, b in params.get(d.name, {}).items()):
        yield d
    # Recurse into child elements only, skipping text nodes.
    for i in filter(lambda x: x != '\n' and not isinstance(x, bs4.element.NavigableString), d.contents):
        yield from get_tags(i, params)

html = """
<table>
<tr style='align:top'>
<td style='width:40%'></td>
<td style='align:top' class='ignore_this'></td>
</tr>
</table>
"""
print(list(get_tags(soup(html, 'html.parser'), {'td': {'style': 'width:40%'}, 'tr': {'style': 'align:top'}})))
</code></pre>
<p>Output:</p>
<pre><code>[<tr style="align:top">
<td style="width:40%"></td>
<td class="ignore_this" style="align:top"></td>
</tr>, <td style="width:40%"></td>]
</code></pre>
<p>The recursive function lets you supply your own dictionary of target attributes for each tag name. It compares the attributes specified for a tag against the <code>bs4</code> object passed in and, if any of them matches, yields the element.</p>
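One caveat about the regex approach above: the Beautiful Soup documentation states that a compiled pattern is applied with its search() method, so an unanchored 'td|tr' also accepts tag names that merely contain those letters. The sketch below uses only the standard re module to illustrate the difference. The anchored pattern and the sample list of tag names are assumptions made for this illustration, not part of the original answer.

```python
import re

# Pattern as written in the answer: alternation without anchors.
loose = re.compile('td|tr')
# Anchored variant (an assumption of this sketch, not from the answer).
strict = re.compile(r'^(?:td|tr)$')

# Hypothetical tag names that could appear in a real page.
tag_names = ['td', 'tr', 'strong', 'track', 'table']

# Mimic the documented behaviour: the pattern is applied with search().
loose_hits = [name for name in tag_names if loose.search(name)]
strict_hits = [name for name in tag_names if strict.search(name)]

print(loose_hits)   # ['td', 'tr', 'strong', 'track']
print(strict_hits)  # ['td', 'tr']
```

If a page contains tags such as strong, passing the anchored pattern to find_all should keep them out of the results. The same reasoning applies to the style pattern, where 'width:40%' would also match a value like 'max-width:40%'.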
