BeautifulSoup: scraping tables with or without an ID in Python

Posted 2024-04-18 19:17:18

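I am trying to scrape several web pages, all of which contain tables. However, the table on the first URL can be selected with .table-translations (strictly a CSS class rather than an id), while the table on the other URL has no such identifier, so it is not scraped.

But if I leave the selector out, nothing is scraped either.

How can I use BeautifulSoup to scrape the data both when the table has an ID and when it does not?

Here is my code: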

import requests
from bs4 import BeautifulSoup


urls = ['http://www.mongols.eu/mongolian-language/mongolian-tale-six-silver-stars', 'http://www.mongols.eu/mongolian-language/mongolian-tale-yanzin-jaal']

for url in urls:
    print(url)
    out_fileName = url.rsplit('/', 1)[-1]
    out_mn = out_fileName + "_mn.txt"
    out_en = out_fileName + "_en.txt"

    soup = BeautifulSoup(requests.get(url).content, 'html.parser')

    all_data = []
    for row in soup.select('.table-translations tr')[1:]:
        mongolian, english = map(lambda t: t.get_text(strip=True), row.select('td')[1:])
        all_data.append((mongolian, english))

    for row in all_data:
        with open(out_mn, "a") as text_file:
            text_file.write(row[0] + "\n")
        with open(out_en, "a") as text_file:
            text_file.write(row[1] + "\n")

1 answer
User
Answer 1 · Posted 2024-04-18 19:17:18

This script fetches all the translations from both URLs. If other pages have a different structure, it will need adjusting:

import requests
from bs4 import BeautifulSoup


urls = ['http://www.mongols.eu/mongolian-language/mongolian-tale-six-silver-stars', 'http://www.mongols.eu/mongolian-language/mongolian-tale-yanzin-jaal']

for url in urls:
    print(url)

    soup = BeautifulSoup(requests.get(url).content, 'html.parser')

    all_data = []
    # Select every row on the page instead of relying on a table class,
    # and skip the first row.
    for row in soup.select('tr')[1:]:
        tds = [td.get_text(strip=True) for td in row.select('td')]
        # Three cells means there is a leading column to drop;
        # otherwise the row is expected to hold just the two translations.
        if len(tds) == 3:
            mongolian, english = tds[1:]
        else:
            mongolian, english = tds

        print(mongolian)
        print(english)
        print('-' * 80)
        all_data.append((mongolian, english))

Prints:

http://www.mongols.eu/mongolian-language/mongolian-tale-six-silver-stars
Зургаан мөнгөн мичид
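Six silver stars
--------------------------------------------------------------------------------
Эрт урьд цагт зургаан өнчин хүүхэд товцог толгой дээр наадан суудаг юм санжээ.
Long ago, there were six orphan brothers playing on the top of a hill.
--------------------------------------------------------------------------------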

... and so on.
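
If more pages need to be handled, the same idea can be wrapped in a function. The following is only a minimal sketch, not the answer's exact method: it assumes the two translations are always the last two td cells of a row, and the names extract_pairs, save_pairs, SAMPLE_WITH_CLASS and SAMPLE_WITHOUT_CLASS are hypothetical. The sample markup is made up to mimic the two layouts and is not copied from the real pages.

```python
from bs4 import BeautifulSoup


def extract_pairs(html, skip_first_row=True):
    """Return (mongolian, english) tuples from every table row in html.

    Assumption: the translations are the last two <td> cells of a row,
    whether or not a leading column precedes them.
    """
    soup = BeautifulSoup(html, "html.parser")
    rows = soup.select("tr")
    if skip_first_row:
        # Mirrors the [1:] slice in the answer: drop the first row found.
        rows = rows[1:]
    pairs = []
    for row in rows:
        cells = [td.get_text(strip=True) for td in row.select("td")]
        if len(cells) < 2:
            # Rows without two data cells cannot hold a translation pair.
            continue
        mongolian, english = cells[-2:]
        pairs.append((mongolian, english))
    return pairs


def save_pairs(pairs, out_mn, out_en):
    """Write each language to its own file, opening each file only once."""
    with open(out_mn, "w", encoding="utf-8") as f_mn, \
            open(out_en, "w", encoding="utf-8") as f_en:
        for mongolian, english in pairs:
            f_mn.write(mongolian + "\n")
            f_en.write(english + "\n")


# Hypothetical markup: one table with the class and a leading column,
# one table without the class and with only two columns.
SAMPLE_WITH_CLASS = """
<table class="table-translations">
<tr><th>No.</th><th>Mongolian</th><th>English</th></tr>
<tr><td>1</td><td>Зургаан мөнгөн мичид</td><td>Six silver stars</td></tr>
</table>
"""

SAMPLE_WITHOUT_CLASS = """
<table>
<tr><td>Mongolian</td><td>English</td></tr>
<tr><td>Зургаан мөнгөн мичид</td><td>Six silver stars</td></tr>
</table>
"""

if __name__ == "__main__":
    for sample in (SAMPLE_WITH_CLASS, SAMPLE_WITHOUT_CLASS):
        print(extract_pairs(sample))
```

For the real pages, extract_pairs(requests.get(url).content) could be called inside the loop over urls, followed by save_pairs(pairs, out_mn, out_en) with the file names from the question. Opening the files in "w" mode once per page avoids reopening them for every row and avoids appending duplicate lines when the script is run again.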
