<p>This is because the results are paginated. You need to fetch every page listed in the JSON data at:</p>
<pre><code>data['content']['page']['voltron_unified_search_json']['search']['baseData']['resultPagination']['pages']
</code></pre>
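<p><code>pages</code> is a list. For company <code>2409087</code>, each entry describes one results page and carries a <code>pageURL</code> key, so it is essentially a list of URLs that you need to walk through to fetch the data.</p>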
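<p>A minimal sketch of pulling the page URLs out of that nested structure without raising <code>KeyError</code> when a level is missing. The <code>dig</code> helper, the <code>PAGES_PATH</code> constant, and the miniature <code>sample</code> payload are hypothetical names and made-up data for illustration; only the key path itself comes from the answer.</p>

```python
# Key path taken from the answer; everything else here is illustrative.
PAGES_PATH = ['content', 'page', 'voltron_unified_search_json',
              'search', 'baseData', 'resultPagination', 'pages']


def dig(data, path, default=None):
    """Follow `path` through nested dicts; return `default` if a key is missing."""
    current = data
    for key in path:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Made-up miniature payload with the same nesting, not real LinkedIn data.
sample = {'content': {'page': {'voltron_unified_search_json': {'search': {
    'baseData': {'resultPagination': {'pages': [
        {'pageURL': 'https://example.invalid/p?page_num=1'},
        {'pageURL': 'https://example.invalid/p?page_num=2'},
    ]}}}}}}}

# Collect the URL of every results page.
page_urls = [page['pageURL'] for page in dig(sample, PAGES_PATH, default=[])]

# A payload that lacks the pagination block yields the default instead of failing.
missing = dig({'content': {}}, PAGES_PATH, default=[])
```

<p>Returning a default instead of indexing directly is a design choice: it lets the caller treat a response without pagination data as "no extra pages" rather than as an error.</p>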
<p>Here is what you need to do (the login code is omitted):</p>
<pre><code>import json

from bs4 import BeautifulSoup

# `s` is assumed to be an already logged-in requests.Session (login code omitted).

def get_results(json_code):
    return json_code['content']['page']['voltron_unified_search_json']['search']['results']

url = "https://www.linkedin.com/vsearch/p?f_CC=2409087"
soup = BeautifulSoup(s.get(url).text)
code = soup.find('code', id="voltron_srp_main-content").contents[0].replace(r'\u002d', '-')
json_code = json.loads(code)
results = get_results(json_code)

# The first page is already loaded, so only the remaining pages are requested.
pages = json_code['content']['page']['voltron_unified_search_json']['search']['baseData']['resultPagination']['pages']
for page in pages[1:]:
    soup = BeautifulSoup(s.get(page['pageURL']).text)
    code = soup.find('code', id="voltron_srp_main-content").contents[0].replace(r'\u002d', '-')
    json_code = json.loads(code)
    results += get_results(json_code)

print(len(results))
</code></pre>
<p>It prints <code>25</code> for <a href="https://www.linkedin.com/vsearch/p?f_CC=2409087" rel="nofollow">https://www.linkedin.com/vsearch/p?f_CC=2409087</a>, which is exactly what you see in the browser.</p>
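<p>The same pagination loop can be sketched with the network access factored out, which makes it possible to try the logic offline. In this sketch <code>collect_results</code>, <code>get_pages</code>, <code>make_payload</code>, and the <code>fetch_json</code> callable are hypothetical names, and <code>fake_site</code> is made-up data standing in for the real responses. In real use, <code>fetch_json</code> would presumably wrap the request and JSON extraction steps from the code above.</p>

```python
def get_results(json_code):
    return json_code['content']['page']['voltron_unified_search_json']['search']['results']


def get_pages(json_code):
    search = json_code['content']['page']['voltron_unified_search_json']['search']
    return search['baseData']['resultPagination']['pages']


def collect_results(first_url, fetch_json):
    """Gather results from the first page and every following page.

    `fetch_json` is a hypothetical callable mapping a URL to the decoded
    JSON dict for that page.
    """
    first = fetch_json(first_url)
    results = list(get_results(first))
    # The first entry in `pages` is the page already loaded, so skip it.
    for page in get_pages(first)[1:]:
        results.extend(get_results(fetch_json(page['pageURL'])))
    return results


def make_payload(results, page_urls):
    """Build a made-up payload with the same nesting as the real one."""
    return {'content': {'page': {'voltron_unified_search_json': {'search': {
        'results': results,
        'baseData': {'resultPagination': {
            'pages': [{'pageURL': u} for u in page_urls]}},
    }}}}}


# Fake two-page "site" held in memory; the URLs are placeholders.
urls = ['page-1', 'page-2']
fake_site = {
    'page-1': make_payload(['a', 'b'], urls),
    'page-2': make_payload(['c'], urls),
}

all_results = collect_results('page-1', fake_site.__getitem__)
print(len(all_results))
```

<p>Passing the fetcher in as an argument keeps the loop independent of the session and parsing details, so the logged-in session only has to be wired up in one place.</p>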