BS4 cannot find text

Posted 2024-09-27 19:26:57


I am trying to print the text shown in this screenshot: https://i.imgur.com/SLl1URt.png. I used soup.find_all("p", class_="review") and tried both .getText() and inspecting .contents, but neither worked.

Link to the page: https://m.wuxiaworld.co/Castle-of-Black-Iron/

Here is some debugging output: https://i.imgur.com/0k6NHeD.png

import urllib2
from bs4 import BeautifulSoup

def info(novelname):
    user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7'
    url = "https://m.wuxiaworld.co/"+novelname+"/"
    headers={'User-Agent':user_agent,'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8','Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
       'Accept-Encoding': 'none',
       'Accept-Language': 'en-US,en;q=0.8',
       'Connection': 'keep-alive'}
    request = urllib2.Request(url, headers=headers)
    response = urllib2.urlopen(request)
    soup = BeautifulSoup(response, features="html.parser")
    for textp in soup.find_all("p", class_="review"):
        print textp.contents
        print textp
        print textp.getText()

2 answers

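When you print your soup, you will see only some of the HTML tags in the terminal, not the full page source. I think the site hides part of its data, so I suggest using Selenium. If you have not downloaded ChromeDriver yet, you can get it from: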

https://chromedriver.storage.googleapis.com/index.html?path=2.35/

Full code:

from selenium import webdriver

driver_path = r'your driver path'
browser = webdriver.Chrome(executable_path=driver_path)


browser.get("https://m.wuxiaworld.co/Castle-of-Black-Iron/")

x = browser.find_elements_by_css_selector("p[class='review']") ## Declare which class
for text1 in x:
    print text1.text
browser.close()

Output:

Description After the Catastrophe, every rule in the world was rewritten. In the Age of Black Iron, steel, iron, steam engines and fighting force became the crux in which human beings depended on to survive. A commoner boy by the name Zhang Tie was selected by the gods of fortune and was gifted a small tree which could constantly produce various marvelous fruits. At the same time, Zhang Tie was thrown into the flames of war, a three-hundred-year war between the humans and monsters on the vacant continent. Using crystals to tap into the potentials of the human body, one must cultivate to become stronger. The thrilling legends of mysterious clans, secrets of Oriental fantasies, numerous treasures and legacies in the underground world — All in the Castle of Black Iron! Citadel of Black Iron 黑铁之堡
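
Before switching to a browser, it can help to check whether the text is already in the raw response. The sketch below is only a rough diagnostic under stated assumptions: the function name `diagnose_missing_text`, its return labels, and the sample strings are hypothetical and are not taken from the real page. It is written for Python 3 and uses only the standard library.

```python
import re


def diagnose_missing_text(raw_html, class_name, expected_snippet):
    """Roughly classify why text under a CSS class might not be extracted.

    raw_html: the page source exactly as downloaded, before any parsing.
    class_name: the class being searched for, e.g. "review".
    expected_snippet: a few words that are visible in the browser.
    """
    # Rough check: does any class attribute in the raw source mention the class?
    class_pattern = re.compile(
        r'class\s*=\s*["\'][^"\']*\b' + re.escape(class_name) + r'\b[^"\']*["\']'
    )
    has_class = class_pattern.search(raw_html) is not None
    has_text = expected_snippet in raw_html

    if has_text and has_class:
        return "parser"    # server sent tag and text: suspect the HTML parser
    if has_text:
        return "selector"  # text is present but under a different class
    if has_class:
        return "script"    # tag is present but empty: text is likely filled in later
    return "absent"        # neither tag nor text arrived in the response


# Hypothetical sample inputs, not the real page source.
static_page = '<p class="review">Description<br>After the Catastrophe</p>'
dynamic_page = '<p class="review"></p><script>load()</script>'

print(diagnose_missing_text(static_page, "review", "After the Catastrophe"))
print(diagnose_missing_text(dynamic_page, "review", "After the Catastrophe"))
```

A "script" result would point toward a browser-driven approach such as Selenium, while a "parser" result suggests trying a different HTML parser first.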

import requests
from bs4 import BeautifulSoup
from collections import OrderedDict

def info(novelname):        
    response = requests.get(
        'https://m.wuxiaworld.co/{}/'.format(novelname.replace(' ', '-')),
        headers=OrderedDict(
            (
                ("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7"),
                ("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"),
                ("Accept-Language", "en-US,en;q=0.5"),
                ("Accept-Encoding", "gzip, deflate"),
                ("Connection", "keep-alive"), 
                ("Upgrade-Insecure-Requests", "1")
            )
        )
    )

    if response.status_code == 200:
        soup = BeautifulSoup(response.content, 'html5lib')

        for textp in soup.find_all('p', attrs={'class': 'review'}):
            print textp.text.strip()

info('Castle of Black Iron')

The problem is your HTML parser. Use html5lib instead.

Description

After the Catastrophe, every rule in the world was rewritten.

In the Age of Black Iron, steel, iron, steam engines and fighting force became the crux in which human beings depended on to survive.

A commoner boy by the name Zhang Tie was selected by the gods of fortune and was gifted a small tree which could constantly produce various marvelous fruits. At the same time, Zhang Tie was thrown into the flames of war, a three-hundred-year war between the humans and monsters on the vacant continent. Using crystals to tap into the potentials of the human body, one must cultivate to become stronger.

The thrilling legends of mysterious clans, secrets of Oriental fantasies, numerous treasures and legacies in the underground world — All in the Castle of Black Iron!

Citadel of Black Iron
黑铁之堡
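
To see whether the choice of parser really changes the result for a given page, the same markup can be run through each parser that is installed. This is a minimal sketch, not a statement about what is wrong with this particular page: the helper name `compare_parsers` and the sample markup are hypothetical. It is written for Python 3 and needs the `bs4` package; `lxml` and `html5lib` are optional and are skipped if missing.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def compare_parsers(markup, class_name,
                    parsers=("html.parser", "lxml", "html5lib")):
    """Return {parser_name: [text of each matching <p>]} for installed parsers."""
    results = {}
    for parser in parsers:
        try:
            soup = BeautifulSoup(markup, parser)
        except FeatureNotFound:
            # This parser library is not installed; skip it.
            continue
        results[parser] = [
            p.get_text().strip()
            for p in soup.find_all("p", class_=class_name)
        ]
    return results


# Hypothetical sample markup, not the real page source.
sample = '<p class="review">Hello <b>world</b></p>'
for name, texts in compare_parsers(sample, "review").items():
    print(name, texts)
```

In practice you would pass response.content from the request instead of `sample`. If one parser returns empty strings where another returns the description, the markup is probably malformed in a way the parsers repair differently.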
