Parsing a text block in HTML with BeautifulSoup (IndustryAbout)

Commodities: Copper, Nickel, Platinum, Palladium, Gold
Area: Lappi
Type: Copper Concentrator Plant
Annual Production: 17,200 tonnes of Copper (2015), 8,800 tonnes of Nickel (2015), 31,900 tonnes of Platinum, 25,100 ounces of Palladium, 12,800 ounces of Gold (2015)
Owner: Kevitsa Mining Oy
Shareholders: Boliden AB (100%)
Activity since: 2012

import requests
from bs4 import BeautifulSoup
import re

page = requests.get("https://www.industryabout.com/country-territories-3/2199-finland/copper-mining/34519-kevitsa-copper-concentrator-plant")
soup = BeautifulSoup(page.content, 'lxml')

rows = soup.select("strong")
for r in rows:
    print(r)

import requests
from bs4 import BeautifulSoup
import re
import csv

links = ["34519-kevitsa-copper-concentrator-plant", "34520-kevitsa-copper-mine", "34356-glogow-copper-refinery"]

for l in links:
    page = requests.get("https://www.industryabout.com/country-territories-3/2199-finland/copper-mining/"+l)
    soup = BeautifulSoup(page.content, 'lxml')
    rows = soup.select("strong")
    d = {}
    for r in rows:
        name, value, *rest = r.text.split(":")
        if not rest:
            d[name] = value
    print(d)

1 Answer

User

#1 · Posted 2024-10-02 18:17:08

Is this what you want?

import requests
from bs4 import BeautifulSoup

page = requests.get("https://www.industryabout.com/country-territories-3/2199-finland/copper-mining/34519-kevitsa-copper-concentrator-plant")
soup = BeautifulSoup(page.content, 'html.parser')

rows = soup.select("strong")
d = {}
for r in rows:
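    parts = r.text.split(":")
    # Keep only plain "name: value" pairs. Text without ":" would break the
    # unpacking, and links or scripts have more ":" and are probably not
    # interesting for you.
    if len(parts) == 2:
        name, value = parts
        d[name] = value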
print(d)
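
Since the question also imports csv, here is a minimal sketch of one way to turn the extracted texts into a record and then into CSV. It assumes the text of each `<strong>` tag has already been collected into a list of strings (for example with `r.get_text()` in the loop above). The helper names `pairs_from_texts` and `to_csv` are hypothetical, and the sample strings are copied from the field list in the question rather than fetched from the site.

```python
import csv
import io


def pairs_from_texts(texts):
    """Build a dict from strings shaped like 'Name: value' (hypothetical helper)."""
    d = {}
    for text in texts:
        # partition() splits on the first ":" only and never raises,
        # so strings without a colon are simply skipped below.
        name, sep, value = text.partition(":")
        if sep and name.strip() and value.strip():
            d[name.strip()] = value.strip()
    return d


def to_csv(records):
    """Serialise a list of dicts to CSV text, one row per record (hypothetical helper)."""
    # Collect column names in first-seen order across all records.
    fieldnames = []
    for rec in records:
        for key in rec:
            if key not in fieldnames:
                fieldnames.append(key)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


# Sample strings taken from the field list in the question.
sample = [
    "Area: Lappi",
    "Type: Copper Concentrator Plant",
    "Owner: Kevitsa Mining Oy",
    "Activity since: 2012",
    "no colon here",
]
record = pairs_from_texts(sample)
print(record)
print(to_csv([record]))
```

Note that splitting on the first colon keeps entries whose value itself contains a colon, so link or script text would no longer be filtered out the way it is in the loop above; add an explicit check if you want to drop those.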
