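<p>A key part of scraping data from a web page is inspecting the page's HTML source so that you can locate the data correctly. In the link you provided, the following lines contain the author information:</p>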
<pre><code><meta name="author" content="Maggie Haberman, Michael D. Shear and Glenn Thrush" />
<meta name="byl" content="By MAGGIE HABERMAN, MICHAEL D. SHEAR and GLENN THRUSH" />
<meta property="article:author" content="https://www.nytimes.com/by/maggie-haberman" />
<meta property="article:author" content="https://www.nytimes.com/by/michael-d-shear" />
<meta property="article:author" content="https://www.nytimes.com/by/glenn-thrush" />
</code></pre>
<p>There are others, but these should be enough to get you started. To parse these tags, you can use <a href="https://www.crummy.com/software/BeautifulSoup/bs4/doc/" rel="nofollow noreferrer">beautiful-soup</a>.</p>
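<p>As a minimal sketch of how that could look, the snippet below parses the meta tags quoted above with Beautiful Soup. The <code>html</code> string is a stand-in for the downloaded page source, and the variable names <code>authors</code> and <code>author_urls</code> are illustrative assumptions, not part of the original question. It uses Python's built-in <code>html.parser</code> backend, so no extra parser needs to be installed.</p>

```python
from bs4 import BeautifulSoup

# Stand-in for the page source: only the meta tags quoted in this answer.
# In practice this string would come from the downloaded article page.
html = """
<html><head>
<meta name="author" content="Maggie Haberman, Michael D. Shear and Glenn Thrush" />
<meta name="byl" content="By MAGGIE HABERMAN, MICHAEL D. SHEAR and GLENN THRUSH" />
<meta property="article:author" content="https://www.nytimes.com/by/maggie-haberman" />
<meta property="article:author" content="https://www.nytimes.com/by/michael-d-shear" />
<meta property="article:author" content="https://www.nytimes.com/by/glenn-thrush" />
</head><body></body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# "name" must go through attrs, because find()'s own `name` argument
# refers to the tag name ("meta"), not to the HTML attribute.
author_tag = soup.find("meta", attrs={"name": "author"})
authors = author_tag["content"] if author_tag else None

# There is one article:author tag per writer, so collect all of them.
author_urls = [
    tag["content"]
    for tag in soup.find_all("meta", attrs={"property": "article:author"})
]

print(authors)
print(author_urls)
```

<p>Checking <code>author_tag</code> for <code>None</code> before indexing keeps the script from raising on pages that lack the tag, which is worth doing because not every article is guaranteed to carry the same metadata.</p>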