<p>To parse the HTML and find the data you need, use the <code>BeautifulSoup</code> library.</p>
<p>The HTML at that URL contains a <code>meta</code> tag holding the authors:</p>
<pre><code><meta content="By MAGGIE HABERMAN, MICHAEL D. SHEAR and GLENN THRUSH" name="byl"/>
</code></pre>
<p>So, to check whether an author is present, look the tag up by its name (<code>byl</code>):</p>
<pre><code>import requests
from bs4 import BeautifulSoup

s = "https://www.nytimes.com/2017/08/18/us/politics/steve-bannon-trump-white-house.html?hp&action=click&pgtype=Homepage&clickSource=story-heading&module=a-lede-package-region&region=top-news&WT.nav=top-news"

def checkForAuthor():
    # Download the page and parse it with the built-in HTML parser
    soup = BeautifulSoup(requests.get(s).content, 'html.parser')
    # The byline is stored in a meta tag whose name attribute is "byl"
    meta = soup.find('meta', {'name': 'byl'})
    return meta is not None
</code></pre>
<p>In fact, you can also get the author names from <code>meta["content"]</code>.</p>
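<p>As a minimal sketch of that last point, the lookup can return the byline text instead of a boolean. The function name <code>extract_author</code> and the inline <code>sample</code> markup are hypothetical and used only for illustration. The function takes an HTML string rather than fetching the URL, so it can be tried without network access:</p>

```python
from bs4 import BeautifulSoup


def extract_author(html):
    """Return the byline from a <meta name="byl"> tag, or None if absent.

    Hypothetical helper for illustration; it is not part of the original answer.
    """
    soup = BeautifulSoup(html, "html.parser")
    meta = soup.find("meta", {"name": "byl"})
    if meta is None:
        return None
    # .get() returns None instead of raising KeyError
    # when the tag has no content attribute
    return meta.get("content")


# Made-up sample markup modeled on the meta tag shown above
sample = '<html><head><meta content="By MAGGIE HABERMAN" name="byl"/></head></html>'
print(extract_author(sample))
```

<p>Using <code>meta.get("content")</code> rather than <code>meta["content"]</code> is a defensive choice here: both read the same attribute, but the former does not raise if the attribute is missing.</p>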