Python: BeautifulSoup find returns None

import requests
from bs4 import BeautifulSoup

JAGS7_result = requests.get("https://agsjournals.onlinelibrary.wiley.com/toc/15325415/2021/69/7")
JAGS7_soup = BeautifulSoup(JAGS7_result.text, "html.parser")
results = JAGS7_soup.find_all("div", {"class": "issue-item"})
print(results)

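3 Answers

User

#1 · edited 2024-06-02 13:12:25

I'd also recommend looking at Scrapy and its spiders for future work. It is a great scraping package, and BeautifulSoup often comes up short on JavaScript-heavy sites.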

User

#2 · edited 2024-06-02 13:12:25

Your HTTP request did not succeed. The server answered with a 403 (Forbidden) response.

Check the status code:

print(JAGS7_result.status_code)

It should be 200. In your case it is 403.

Send request headers to fix this:

h = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36'}
JAGS7_result = requests.get("https://agsjournals.onlinelibrary.wiley.com/toc/15325415/2021/69/7", headers=h)

Now you get the result you want.
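The status check above can be turned into a small guard so that an error page is never handed to the parser. This is only a sketch: `ensure_ok` is a hypothetical helper name, and the two `SimpleNamespace` objects are stand-ins for real responses so the example runs without network access. With a real `requests` response, the built-in `raise_for_status()` method serves a similar purpose.

```python
from types import SimpleNamespace


def ensure_ok(response):
    """Return the response body, or raise if the request did not succeed.

    `response` is anything with `status_code` and `text` attributes,
    such as the object returned by requests.get(). (Hypothetical helper.)
    """
    if response.status_code != 200:
        raise RuntimeError(
            f"request failed with HTTP {response.status_code}; "
            "the body is an error page, not the page you asked for"
        )
    return response.text


# Stand-ins for real responses (assumed shapes, not captured traffic).
blocked = SimpleNamespace(status_code=403, text="<html>Forbidden</html>")
ok = SimpleNamespace(status_code=200, text="<div class='issue-item'></div>")

print(ensure_ok(ok))

try:
    ensure_ok(blocked)
except RuntimeError as err:
    print(err)
```

Failing loudly here makes the cause obvious, instead of leaving you with an empty result from the parser later on.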

User

#3 · edited 2024-06-02 13:12:25

Try setting the User-Agent header on the request:

import requests
from bs4 import BeautifulSoup

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:90.0) Gecko/20100101 Firefox/90.0"
}

JAGS7_result = requests.get(
    "https://agsjournals.onlinelibrary.wiley.com/toc/15325415/2021/69/7",
    headers=headers,
)
JAGS7_soup = BeautifulSoup(JAGS7_result.text, "html.parser")

for title in JAGS7_soup.select("a > h2"):
    print(title.text)

Prints:

Cover
Issue Information
A glimmer of hope for the most vulnerable
Emergency department visits for emergent conditions among older adults during the COVID-19 pandemic
SARS-CoV-2 antibody detection in skilled nursing facility residents
VA home-based primary care interdisciplinary team structure varies with Veterans' needs, aligns with PACE regulation
Emergency visits by older adults decreased during COVID-19 but increased in the oldest old
Teaching geriatrics during the COVID-19 pandemic: Aquifer Geriatrics to the rescue
Changes in medication use among long-stay residents with dementia in Michigan during the pandemic
Reduction in respiratory viral infections among hospitalized older adults during the COVID-19 pandemic

...
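To see offline why the original lookup came back empty, here is a minimal sketch that runs the same queries against two inline snippets. Both snippets are made up for illustration: `SAMPLE_TOC` only loosely imitates a table-of-contents page, and `SAMPLE_BLOCKED` imitates the body of a blocked response. Neither is the real Wiley markup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, loosely modelled on a table-of-contents page.
SAMPLE_TOC = """
<div class="issue-item">
  <a href="/doi/1"><h2>Cover</h2></a>
</div>
<div class="issue-item">
  <a href="/doi/2"><h2>Issue Information</h2></a>
</div>
"""

# Hypothetical body of a blocked (403) response.
SAMPLE_BLOCKED = "<html><body><h1>Access denied</h1></body></html>"

toc_soup = BeautifulSoup(SAMPLE_TOC, "html.parser")
blocked_soup = BeautifulSoup(SAMPLE_BLOCKED, "html.parser")

# "a > h2" matches <h2> elements that are direct children of an <a>.
titles = [h2.get_text(strip=True) for h2 in toc_soup.select("a > h2")]
print(titles)

# On the error page the same lookups simply find nothing.
print(blocked_soup.find_all("div", {"class": "issue-item"}))  # empty list
print(blocked_soup.find("div", {"class": "issue-item"}))      # None
```

An empty list from `find_all` or `None` from `find` therefore does not necessarily mean the selector is wrong; the parser may just have been given a different page than expected.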
