'NoneType' object has no attribute 'find_all' error when using Beautiful Soup

import requests
from bs4 import BeautifulSoup

URL = 'https://www.cvbankas.lt/?miestas=Vilnius&padalinys%5B%5D=&keyw=python'
page = requests.get(URL).text
soup = BeautifulSoup(page, 'html.parser')
results = soup.find(id='ResultsContainer')

# Look for Python jobs
python_jobs = results.find_all("div", string=lambda t: "python" in t.lower())
for p_job in python_jobs:
    link = p_job.find("h3")["href"]
    print(p_job.text.strip())
    print(f"Apply here: {link}\n")

3 Answers

User

Answer 1 · edited 2024-09-27 18:00:41

Check out my code:

import requests
from bs4 import BeautifulSoup
URL = 'https://www.cvbankas.lt/?miestas=Vilnius&padalinys%5B%5D=&keyw=python'
page = requests.get(URL).text
soup = BeautifulSoup(page, 'html.parser')
h3_tags = soup.find_all("h3", {"class": "list_h3"})
for x in h3_tags:
    if "Python" in x.text:
        print(x.text)
        print(x.find_parent('a')['href'])
        print()

Output:

Senior Python Developer
https://www.cvbankas.lt/senior-python-developer-vilniuje/1-6719819

Full Stack Engineer (React + Python)
https://www.cvbankas.lt/full-stack-engineer-react-python-vilniuje/1-6665723

Python programuotojas (Mid-Senior)
https://www.cvbankas.lt/python-programuotojas-mid-senior-vilniuje/1-6693547

Python Developer
https://www.cvbankas.lt/python-developer-vilniuje/1-6604883
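
The lookup above relies on each `<h3 class="list_h3">` sitting inside the `<a>` that carries the job URL. Here is a minimal offline sketch of that pattern. The HTML snippet and the `example.com` URLs are made-up stand-ins (the real page's exact markup is an assumption), `python_jobs` is a hypothetical helper name, and the match is made case-insensitive so that a lowercase "python" title is not missed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real listing page.
SAMPLE_HTML = """
<a href="https://example.com/jobs/1"><div class="list_cell"><h3 class="list_h3">Senior Python Developer</h3></div></a>
<a href="https://example.com/jobs/2"><div class="list_cell"><h3 class="list_h3">Data Analyst</h3></div></a>
<a href="https://example.com/jobs/3"><div class="list_cell"><h3 class="list_h3">python programuotojas</h3></div></a>
"""


def python_jobs(html):
    """Return (title, url) pairs for titles that mention Python."""
    soup = BeautifulSoup(html, "html.parser")
    jobs = []
    for h3 in soup.find_all("h3", class_="list_h3"):
        title = h3.get_text(strip=True)
        link = h3.find_parent("a")  # nearest enclosing <a>, or None
        if "python" in title.lower() and link is not None:
            jobs.append((title, link.get("href")))
    return jobs


jobs = python_jobs(SAMPLE_HTML)
for title, url in jobs:
    print(title)
    print(url)
    print()
```

Checking `link is not None` keeps the loop from failing on a heading that happens not to be wrapped in a link.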

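User

Answer 2 · edited 2024-09-27 18:00:41

Your problem is that the page has no element with the id "ResultsContainer", so soup.find() returns None, and calling find_all on None raises the error.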
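
The failure can be reproduced without touching the network. The sketch below uses a made-up HTML snippet (the `job_list` id is hypothetical, not taken from the real page) and shows one possible guard: fall back to the whole document when the container is missing.

```python
from bs4 import BeautifulSoup

# Hypothetical page that has no element with id="ResultsContainer".
SAMPLE_HTML = '<div id="job_list"><h3 class="list_h3">Python Developer</h3></div>'

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
container = soup.find(id="ResultsContainer")  # nothing matches, so this is None

# Calling a Tag method on None is what produces the error in the question.
try:
    container.find_all("div")
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)

# One possible guard: search the whole document when the container is absent.
scope = container if container is not None else soup
titles = [h3.get_text(strip=True) for h3 in scope.find_all("h3")]
print(titles)
```

Whether falling back to the whole document is appropriate depends on the page; raising a clear error of your own is an equally reasonable choice.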

Given the structure of the page, though, you can use a CSS selector to get all the job titles directly:

import requests
from bs4 import BeautifulSoup

URL = 'https://www.cvbankas.lt/?miestas=Vilnius&padalinys%5B%5D=&keyw=python'
page = requests.get(URL).text
soup = BeautifulSoup(page, 'html.parser')
results = soup.select("div.list_cell > .list_h3")
for i in results:
    print(i.text)

Result:

Data Engineer
Data Analyst
VYRESNYSIS INŽINIERIUS STRATEGIJOS IR TYRIMŲ SKYRIUJE
Senior Python Developer
Full Stack Engineer (React + Python)
DevOps Engineer
Linux Systems Automation Engineer
Big Data Developer
Big Data Devops Engineer
Python programuotojas (Mid-Senior)
DATA SCIENTIST
DEVOPS INŽINIERIAUS (e-commerce platformos produktų optimizavimas užsienio rinkoms)
LINUX Sistemų administratorius (-ė)
QA engineer
Blockchain Developer
Backend Software Engineer
FW/HW Quality Assurance Engineer
Software developer in Test
Python Developer
Senior Backend Engineer
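
The selector above returns every listing title, not only the Python ones. As a rough sketch on made-up markup (the snippet is an assumption, not the real page), the selected titles can be narrowed with a keyword check. It also shows that `select()` gives back an empty list when nothing matches, so iterating over it cannot hit the `None` problem from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the third title is nested one level deeper on purpose.
SAMPLE_HTML = """
<div class="list_cell"><h3 class="list_h3">Data Engineer</h3></div>
<div class="list_cell"><h3 class="list_h3">Senior Python Developer</h3></div>
<div class="list_cell"><span><h3 class="list_h3">Nested Python title</h3></span></div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# ">" matches direct children only, so the title nested inside <span> is skipped.
titles = [tag.get_text(strip=True) for tag in soup.select("div.list_cell > .list_h3")]

# Keep only the titles that mention Python, ignoring case.
python_titles = [t for t in titles if "python" in t.lower()]

# A selector with no match yields an empty list rather than None.
missing = soup.select("#ResultsContainer")

print(titles)
print(python_titles)
print(len(missing))
```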

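User

Answer 3 · edited 2024-09-27 18:00:41

The problem is that there is no tag with id="ResultsContainer". You can search for all <h3> tags whose text contains "Python" and then look up the parent <a> tag for the URL: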

import requests
from bs4 import BeautifulSoup


URL = 'https://www.cvbankas.lt/?miestas=Vilnius&padalinys%5B%5D=&keyw=python'
page = requests.get(URL).text
soup = BeautifulSoup(page, 'html.parser')

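# string= is the current name for the older text= argument; the None check
# covers <h3> tags that have no single string.
results = soup.find_all('h3', string=lambda t: t is not None and 'python' in t.lower())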
for r in results:
    print(r.text)
    print(r.find_parent('a')['href'])
    print('-' * 80)

Prints:

Senior Python Developer
https://www.cvbankas.lt/senior-python-developer-vilniuje/1-6719819
--------------------------------------------------------------------------------
Full Stack Engineer (React + Python)
https://www.cvbankas.lt/full-stack-engineer-react-python-vilniuje/1-6665723
--------------------------------------------------------------------------------
Python programuotojas (Mid-Senior)
https://www.cvbankas.lt/python-programuotojas-mid-senior-vilniuje/1-6693547
--------------------------------------------------------------------------------
Python Developer
https://www.cvbankas.lt/python-developer-vilniuje/1-6604883
--------------------------------------------------------------------------------
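
One caveat with filtering on a tag's string: a tag with more than one child has `.string` equal to `None`, so a string filter will not see its text. The sketch below uses made-up headings to compare a `None`-safe `string=` filter with a `get_text()` check. The markup is an assumption for illustration only.

```python
from bs4 import BeautifulSoup

# Hypothetical headings; the second one mixes text with a nested <span>.
SAMPLE_HTML = """
<h3>Senior Python Developer</h3>
<h3>Python <span>(remote)</span></h3>
<h3>Data Analyst</h3>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# The second <h3> has two children, so its .string is None.
strings = [h3.string for h3 in soup.find_all("h3")]

# Guarding against None keeps the filter from calling .lower() on None.
by_string = soup.find_all(
    "h3", string=lambda t: t is not None and "python" in t.lower()
)

# get_text() joins nested text, so it also catches the second heading.
by_text = [h3 for h3 in soup.find_all("h3") if "python" in h3.get_text().lower()]

print(strings)
print([h3.get_text() for h3 in by_string])
print([h3.get_text() for h3 in by_text])
```

If the listing titles are always plain text, the two approaches give the same result; the `get_text()` variant is simply more tolerant of nested markup.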
