How can I scrape YouTube's verified badge with Beautiful Soup?

Posted 2024-09-30 08:20:08


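I have been trying to check, from a YouTube video link, whether the channel/uploader is verified (blue badge). The YouTube API does not seem to offer this, so I have been trying to scrape it with BeautifulSoup. Here is what I tried: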

from bs4 import BeautifulSoup
import requests

video_id = ""  # placeholder: set this to the ID of the video to check
url = "https://www.youtube.com/watch?v=" + video_id
source = requests.get(url).text
bs = BeautifulSoup(source, 'lxml')

# does not work
bs.find_all("div", {"class": "badge badge-style-type-verified style-scope ytd-badge-supported-renderer"})

I tried to trace the hierarchy of HTML elements leading to the ytd-badge class. Inspecting the page, I found:

html -> body -> ytd-app -> #content -> #page-manager -> ytd-watch-flexy -> #columns -> #primary -> div#primary-inner.style-scope.ytd-watch-flexy -> #meta -> #meta-contents -> ytd-video-secondary-info-renderer.style-scope.ytd-watch-flexy -> #container -> div#top-row.style-scope.ytd-video-secondary-info-renderer -> ytd-video-owner-renderer -> div#upload-info.style-scope.ytd-video-owner-renderer -> #channel-name -> ytd-badge-supported-renderer.style-scope.ytd-channel-name

That path is long and unwieldy, so how can I reach the badge element? Is there a simpler way to do this in Python? Thanks.


1 answer
User
Answer 1 · Posted 2024-09-30 08:20:08

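YouTube builds the page with JavaScript, so the badge markup is not in the raw HTML that a plain requests.get() returns. Use Requests-HTML, which can render the page, to scrape it.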

Install it with pip install requests-html

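Because the page lists several videos, any of which may carry a badge, we need to check whether the class that holds the badge (badge badge-style-type-verified style-scope ytd-badge-supported-renderer) appears under the channel info class (style-scope ytd-video-owner-renderer), not just anywhere on the page.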

from requests_html import HTMLSession
from bs4 import BeautifulSoup

video_id = ""  # placeholder: set this to the ID of the video to check
video_url = "https://www.youtube.com/watch?v=" + video_id
# Initialize an HTML Session
session = HTMLSession()
# Get the html content
response = session.get(video_url)
# Execute JavaScript (the first call to render() downloads Chromium)
response.html.render(sleep=3)

soup = BeautifulSoup(response.html.html, "lxml")

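# Find the channel info element (the <ytd-video-owner-renderer> tag)
channel_info = soup.select_one('.style-scope ytd-video-owner-renderer')

# Check whether the verified badge exists inside the channel info element.
# channel_info is None if the element was not found (e.g. the page did not finish rendering),
# and the class selector matches the badge regardless of the order of its class names.
if channel_info is not None and channel_info.select_one('div.badge.badge-style-type-verified'):
    print('Verified')
else:
    print('NOT verified!')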
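
To try the scoping logic without hitting YouTube, the check can be pulled into a small function that takes already-rendered HTML. This is only a sketch: the function name is_channel_verified, the two selector constants, and the sample markup are made up for illustration (the samples imitate the class names quoted above, they are not real YouTube output), and it assumes YouTube still uses those class names.

```python
from bs4 import BeautifulSoup

# Assumed selectors, based on the element and class names quoted in the question.
OWNER_SELECTOR = "ytd-video-owner-renderer"
BADGE_SELECTOR = ".badge-style-type-verified"


def is_channel_verified(rendered_html: str) -> bool:
    """Return True if a verified badge sits inside the video owner element."""
    soup = BeautifulSoup(rendered_html, "html.parser")
    owner = soup.select_one(OWNER_SELECTOR)
    if owner is None:
        # Owner block missing, e.g. the page was not rendered.
        return False
    # Search only inside the owner block, so badges on other videos are ignored.
    return owner.select_one(BADGE_SELECTOR) is not None


# Hypothetical markup: the badge is inside the owner block.
SAMPLE_VERIFIED = """
<ytd-video-owner-renderer class="style-scope ytd-video-secondary-info-renderer">
  <div id="upload-info" class="style-scope ytd-video-owner-renderer">
    <ytd-badge-supported-renderer class="style-scope ytd-channel-name">
      <div class="badge badge-style-type-verified style-scope ytd-badge-supported-renderer"></div>
    </ytd-badge-supported-renderer>
  </div>
</ytd-video-owner-renderer>
"""

# Hypothetical markup: the only badge belongs to some other video on the page.
SAMPLE_BADGE_ELSEWHERE = """
<ytd-video-owner-renderer class="style-scope ytd-video-secondary-info-renderer">
  <div id="upload-info" class="style-scope ytd-video-owner-renderer"></div>
</ytd-video-owner-renderer>
<div id="related">
  <div class="badge badge-style-type-verified style-scope ytd-badge-supported-renderer"></div>
</div>
"""

if __name__ == "__main__":
    print(is_channel_verified(SAMPLE_VERIFIED))
    print(is_channel_verified(SAMPLE_BADGE_ELSEWHERE))
```

With real pages, the string passed in would be response.html.html after render(). The built-in "html.parser" is used here only so the sketch does not need lxml installed.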
