Selenium returns unknownProtocolFound

Published 2024-10-02 08:28:31


I am trying to download an APK file from [this Google Play direct link generator][1] using Selenium's Firefox WebDriver in Python.

The problem is that when Selenium tries to load the main page, it crashes with the following error message:

/usr/bin/python2.7 /home/ghasemi/PycharmProjects/phorcys_watcher/main.py
http://apps.evozi.com/apk-downloader/?id=com.instagram.android
Traceback (most recent call last):
  File "/home/ghasemi/PycharmProjects/phorcys_watcher/main.py", line 7, in <module>
    content = google_play_download("com.instagram.android")
  File "/home/ghasemi/PycharmProjects/phorcys_watcher/collector.py", line 20, in google_play_download
    browser.get("https://apps.evozi.com/apk-downloader/?id=" + app_page_id)
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 268, in get
    self.execute(Command.GET, {'url': url})
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 256, in execute
    self.error_handler.check_response(response)
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/errorhandler.py", line 194, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: Reached error page: about:neterror?e=unknownProtocolFound&u=httpss%3A//www.adnetworkperformance.com/script/java.php%3Foption%3Drotateur%26r%3D411313%26treqn%3D1025813717%26runauction%3D1%26crr%3D168ce9d76b1a6695b12e%2CwcwHrNzGnshFns2PnM3bbcwGW8xLz-mNycwuvZjurZja3MzJfMxG_9xMX4wYns7a2YxHvshBL9xe3shbjN2J7umN6umNm-mNuN2czNw723956800778f24b2db6%26rtid%3D595b6ecb8ac19%26cbrandom%3D0.7519066097934798%26cbtitle%3DAPK%2520Downloader%2520%255BLatest%255D%2520Download%2520Directly%2520%257C%2520Chrome%2520Extension%2520v3%2520%28Evozi%2520Official%29%26cbiframe%3D0%26cbWidth%3D1280%26cbHeight%3D717%26cbdescription%3DDownload%2520APKs%2520Directly%2520From%2520Google%2520Play%2520To%2520Your%2520Computer%2520With%2520APK%2520Downloader%2520Extension%2520For%2520Google%2520Chrome%26cbkeywords%3D%26cbref%3D&c=&f=regular&d=Firefox%20doesn%E2%80%99t%20know%20how%20to%20open%20this%20address%2C%20because%20one%20of%20the%20following%20protocols%20%28httpss%29%20isn%E2%80%99t%20associated%20with%20any%20program%20or%20is%20not%20allowed%20in%20this%20context.

The exception is raised on this line: browser.get("https://apps.evozi.com/apk-downloader/?id=com.instagram.android")

As you can see above, the error comes from a malformed link inside the page that Selenium is trying to load. I found the frame that causes the error:

[code snippet missing from the source]

As you can see, the page's developer mistakenly wrote httpss instead of https (twice).

How can I handle this?

Update:

My scraper:

import requests
from lxml import html
from pyvirtualdisplay import Display
from selenium import webdriver

def google_play_download(app_page_id):
    browser = webdriver.Firefox()
    browser.get("https://apps.evozi.com/apk-downloader/?id=" + app_page_id)
    browser.find_element_by_css_selector(".btn.btn-primary.btn-lg.btn-block").click()
    apk_link = browser.find_element_by_css_selector(".btn.btn-success.btn-block").get_attribute('href')
    browser.quit()
    for rnd in range(5):
        resp = requests.get(apk_link)
        if resp.headers['Content-Length'] == str(len(resp.content)):
            return resp.content


if __name__ == "__main__":
    content = google_play_download("com.instagram.android")
    f = open('./file', 'wb')
    f.write(content)
    f.close()

  [1]: https://apps.evozi.com/apk-downloader/

3 Answers

You are almost there...

import requests
import time
from selenium import webdriver

def google_play_download(app_page_id):
    browser = webdriver.Chrome()
    browser.get("https://apps.evozi.com/apk-downloader/?id=" + app_page_id)
    browser.find_element_by_css_selector(".btn.btn-primary.btn-lg.btn-block").click()
    time.sleep(10)  # give the site time to generate the download link

    apk_link = browser.find_element_by_css_selector(".btn.btn-success.btn-block").get_attribute('href')
    browser.quit()
    for rnd in range(5):
        resp = requests.get(apk_link)
        if resp.headers['Content-Length'] == str(len(resp.content)):
            return resp.content


if __name__ == "__main__":
    content = google_play_download("com.instagram.android")
    f = open('file.apk', 'wb')
    f.write(content)
    f.close()

One solution is a URL-parsing function that you call on each URL before passing it to driver.get(url):

def url_parser(url):
    if 'httpss' in url:
        url = url.replace('httpss','https')
    return url

Use it like this:

[code snippet missing from the source]
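A slightly stricter variant of the same idea is sketched below. It is only an illustration, not the answerer's code: `fix_scheme` and `SCHEME_FIXES` are hypothetical names, and the sketch assumes Python 3. Instead of replacing the text 'httpss' anywhere in the string, it corrects only the scheme, so a query string that happens to contain 'httpss' is left alone.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical table of misspelled schemes and their intended form.
SCHEME_FIXES = {"httpss": "https"}


def fix_scheme(url):
    """Return url with a known-bad scheme corrected; everything else is kept."""
    parts = urlsplit(url)
    fixed = SCHEME_FIXES.get(parts.scheme)
    if fixed is None:
        # Scheme is not in the table, so hand the URL back untouched.
        return url
    return urlunsplit(parts._replace(scheme=fixed))
```

It would be called as driver.get(fix_scheme(url)). Like url_parser, it only affects URLs that the script itself passes to get(); in the traceback above, the httpss address is requested by the page, not by the script.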

The parser fails to extract the correct URL on this line: apk_link = browser.find_element_by_css_selector(".btn.btn-success.btn-block").get_attribute('href')

When the URL is printed, it shows https://apps.evozi.com/apk-downloader/?id=com.instagram.android

Change the line to

apk_link = browser.find_element_by_css_selector(".btn.btn-success.btn-block")
ele=apk_link.get_attribute('href')

for rnd in range(5):
    resp = requests.get(ele)
    if resp.headers['Content-Length'] == str(len(resp.content)):
        return resp.content

and the code will work.
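The retry loop used in these snippets returns None without any warning if none of the five attempts produces a complete body, and the later f.write(content) then fails. A minimal sketch of a stricter loop follows. It assumes Python 3, and `download_with_retries` and its `fetch` argument are hypothetical names introduced only for this illustration.

```python
def download_with_retries(fetch, attempts=5):
    """Call fetch() up to `attempts` times and return the first complete body.

    `fetch` is a hypothetical callable that returns (headers, body_bytes).
    """
    for _ in range(attempts):
        headers, body = fetch()
        declared = headers.get("Content-Length")
        # Accept the body only when the declared length matches what arrived.
        if declared is not None and int(declared) == len(body):
            return body
    # Fail loudly instead of handing None to the caller.
    raise RuntimeError("no complete download after %d attempts" % attempts)
```

With requests, `fetch` could be a small function that calls requests.get(ele) and returns (resp.headers, resp.content).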
