Unable to get a link from HTML content using Python

1 answer

User

#1 · Posted 2024-10-02 14:26:51

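This is not trivial. The k parameter value is "hidden" deep inside a script element within nested iframes. Here is a requests + BeautifulSoup approach for getting the k value: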

import re
from urllib.parse import urljoin
# Python 2: from urlparse import urljoin

import requests
from bs4 import BeautifulSoup

base_url = "http://www.protect-stream.com"
with requests.Session() as session:
    response = session.get("http://www.protect-stream.com/PS_DL_xODN4o5HjLuqzEX5fRNuhtobXnvL9SeiyYcPLcqaqqXayD8YaIvg9Qo80hvgj4vCQkY95XB7iqcL4aF1YC8HRg_i_i")

    # get the top frame url
    soup = BeautifulSoup(response.content, "html.parser")
    src = soup.select_one('iframe[src^="frame.php"]')["src"]
    frame_url = urljoin(base_url, src)

    # get the nested frame url
    response = session.get(frame_url)
    soup = BeautifulSoup(response.content, "html.parser")
    src = soup.select_one('iframe[src^="w.php"]')["src"]
    frame_url = urljoin(base_url, src)

    # get the frame HTML source and extract the "k" value
    response = session.get(frame_url)
    soup = BeautifulSoup(response.content, "html.parser")
    # "string=" is the current name of the older "text=" argument (bs4 4.4.0+)
    script = soup.find("script", string=lambda text: text and "k=" in text).get_text(strip=True)

    k_value = re.search(r'var k="(.*?)";', script).group(1)
    print(k_value)

Running the script prints the extracted k value.
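
If you want to check the frame-following logic without hitting the site, here is a minimal offline sketch of the same three steps using only the standard library (html.parser instead of BeautifulSoup, which is a swap on my part, not what the code above uses). The base URL, the FAKE_PAGES dictionary, the fetch helper, and the function names are all hypothetical stand-ins for the real HTTP responses; the real pages may be structured differently.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class IframeSrcCollector(HTMLParser):
    """Collect the src attribute of every <iframe> tag in a document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_iframe_url(html, base_url, prefix):
    """Return the absolute URL of the first iframe whose src starts with prefix, else None."""
    collector = IframeSrcCollector()
    collector.feed(html)
    for src in collector.sources:
        if src.startswith(prefix):
            return urljoin(base_url, src)
    return None


def extract_k(html):
    """Return the value assigned in `var k="...";`, or None if no such assignment exists."""
    match = re.search(r'var k="(.*?)";', html)
    return match.group(1) if match else None


# Hypothetical pages standing in for the three HTTP responses.
BASE_URL = "http://example.invalid"
FAKE_PAGES = {
    BASE_URL + "/start": '<html><body><iframe src="frame.php?id=1"></iframe></body></html>',
    BASE_URL + "/frame.php?id=1": '<html><body><iframe src="w.php?id=1"></iframe></body></html>',
    BASE_URL + "/w.php?id=1": '<html><script>var k="abc123";</script></html>',
}


def fetch(url):
    """Hypothetical replacement for session.get(url).text."""
    return FAKE_PAGES[url]


# Step 1: top frame; step 2: nested frame; step 3: script containing k.
frame_url = find_iframe_url(fetch(BASE_URL + "/start"), BASE_URL, "frame.php")
nested_url = find_iframe_url(fetch(frame_url), BASE_URL, "w.php")
k_value = extract_k(fetch(nested_url))
print(k_value)  # abc123
```

Returning None instead of raising makes it easier to tell which step failed if the site changes its markup; in the requests version above, a missing iframe or script would surface as a TypeError or AttributeError instead.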
