<p>If you don't mind using XPath, this should work:</p>
<pre><code>import requests
from lxml import html
url = "http://www.radisson.com/lansing-hotel-mi-48933/lansing/hotel/dining"
page = requests.get(url).text
tree = html.fromstring(page)
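# Titles: text of each strong under the copy_left element that has no following sibling link (a)
xp_t = "//*[@class='copy_left']/descendant-or-self::node()/strong[not(following-sibling::a)]/text()"
# Descriptions: text nodes of each such strong's parent that have no strong after them
xp_d = "//*[@class='copy_left']/descendant-or-self::node()/strong[not(following-sibling::a)]/../text()[not(following-sibling::strong)]"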
titles = tree.xpath(xp_t)
descriptions = tree.xpath(xp_d) # still contains garbage like '\r\n'
descriptions = [d.strip() for d in descriptions if d.strip()]
for t, d in zip(titles, descriptions):
    print("{title}: {description}".format(title=t, description=d))
</code></pre>
<p>Here, <code>descriptions</code> contains 3 elements: "This downtown…", "For a cup…", "If you like…".</p>
<p>If you also need "When you are the mood…", replace <code>xp_d</code> with:</p>
<pre><code>xp_d = "//*[@class='copy_left']/descendant-or-self::node()/strong[not(following-sibling::a)]/../text()"
</code></pre>
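<p>To see what the <code>[not(following-sibling::strong)]</code> predicate changes without hitting the live page, here is a minimal sketch run against an inline snippet. The markup in <code>SAMPLE</code> and the names <code>BASE</code>, <code>clean</code>, <code>strict</code> and <code>loose</code> are made up for illustration; the real page structure may differ.</p>

```python
from lxml import html

# Hypothetical markup, invented for illustration; the real page may differ.
SAMPLE = """
<div class="copy_left">
  <p><strong>Reservations</strong> <a href="#">Book a table</a></p>
  <p>Lead-in text. <strong>Cafe</strong> Serves coffee.</p>
  <p><strong>Bar</strong> Serves drinks.</p>
</div>
"""

# Shared prefix: every strong under copy_left with no following sibling link.
BASE = ("//*[@class='copy_left']/descendant-or-self::node()"
        "/strong[not(following-sibling::a)]")


def clean(nodes):
    """Strip whitespace and drop text nodes that are empty afterwards."""
    return [n.strip() for n in nodes if n.strip()]


tree = html.fromstring(SAMPLE)
titles = clean(tree.xpath(BASE + "/text()"))
# Only parent text that comes after the last strong in the same parent.
strict = clean(tree.xpath(BASE + "/../text()[not(following-sibling::strong)]"))
# All text nodes of the parent, including text that precedes the strong.
loose = clean(tree.xpath(BASE + "/../text()"))

print(titles)
print(strict)
print(loose)
```

<p>On this sample the stricter expression drops the text in front of <code>Cafe</code>, while the looser one keeps it. Since <code>zip</code> pairs titles and descriptions by position, the extra entry would shift the pairing, so it may be worth checking that both lists have the same length before printing.</p>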