Make the PhantomJS webdriver stop loading CSS content [python crawler]

from selenium import webdriver

driver = webdriver.PhantomJS()
driver.command_executor._commands['executePhantomScript'] = ('POST', '/session/$sessionId/phantom/execute')
driver.execute('executePhantomScript', {'script': '''
    var page = this;
    page.onResourceRequested = function(requestData, request) {
        if ((/http:\/\/.+?\.css/gi).test(requestData['https://www.whatismyip.com/']) || requestData.headers['Content-Type'] == 'text/css') {
            console.log('The url of the request is matching. Aborting: ' + requestData['https://www.whatismyip.com/']);
            request.abort();
        }
    };
''', 'args': []})
driver.get("https://www.whatismyip.com/")
ipaddress = driver.find_element_by_xpath("//div[@class='ip']").text
print(ipaddress)
driver.quit()

1 answer

User

#1 · Posted 2024-06-23 19:58:45

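You are testing the regex against requestData['https://www.whatismyip.com/'], which I assume is null. This is fixed by using requestData.url, as described in the documentation. Also, the request will not contain a Content-Type, so that condition can be removed.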

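I chose to simplify your regex, because some URLs may be served over SSL or be relative, and those would not match http://. I would use a $ anchor to test for .css at the end of the URL (the g global modifier is not needed, since you are only looking for a single match).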
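To make the difference between the two patterns concrete, here is a small sketch that checks them against a few URLs. It uses Python's `re` module as a stand-in for the JavaScript regex (the two behave the same for patterns this simple), and the sample URLs and the `classify` helper are made up for illustration.

```python
import re

# Pattern from the question: requires an absolute http:// URL and is not anchored.
OLD_PATTERN = re.compile(r"http://.+?\.css", re.IGNORECASE)
# Simplified pattern: anything that ends in .css, regardless of scheme.
NEW_PATTERN = re.compile(r"\.css$", re.IGNORECASE)

# Hypothetical URLs, chosen to cover the cases discussed above.
SAMPLE_URLS = [
    "http://example.com/style.css",      # plain http
    "https://example.com/style.css",     # served over SSL
    "/static/main.CSS",                  # relative URL, upper-case extension
    "http://example.com/style.css.map",  # .css appears, but not at the end
    "https://example.com/app.js",        # not a stylesheet
]


def classify(url):
    """Return (old_matches, new_matches) for a URL."""
    return bool(OLD_PATTERN.search(url)), bool(NEW_PATTERN.search(url))


for url in SAMPLE_URLS:
    old, new = classify(url)
    print("%-36s old=%-5s new=%s" % (url, old, new))
```

The old pattern misses the SSL and relative URLs and also matches `style.css.map`, because it is not anchored at the end of the URL.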

The final .onResourceRequested callback might contain a condition like this:

if(/\.css$/i.test(requestData.url)) {
    request.abort();
}
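Putting the pieces together, a minimal sketch of the Python side could look like the following. It reuses the command registration from the question as-is. The names `ABORT_CSS_SCRIPT` and `block_css` are made up here, and the sketch assumes, as the question's code does, that `this` inside the executed script refers to the current page.

```python
# JavaScript executed inside PhantomJS. A raw string keeps the regex
# backslash intact. Assumption carried over from the question's code:
# `this` is the current page object.
ABORT_CSS_SCRIPT = r"""
var page = this;
page.onResourceRequested = function (requestData, request) {
    // Abort every request whose URL ends in .css (case-insensitive).
    if (/\.css$/i.test(requestData.url)) {
        request.abort();
    }
};
"""


def block_css(driver):
    """Install the CSS-blocking callback on a PhantomJS webdriver.

    `driver` is expected to be a selenium webdriver.PhantomJS instance;
    this helper is hypothetical and not part of Selenium.
    """
    # Register the PhantomJS-specific command, exactly as in the question.
    driver.command_executor._commands['executePhantomScript'] = (
        'POST', '/session/$sessionId/phantom/execute')
    # Send the script; it takes no arguments.
    driver.execute('executePhantomScript',
                   {'script': ABORT_CSS_SCRIPT, 'args': []})
```

Call `block_css(driver)` right after `driver = webdriver.PhantomJS()` and before `driver.get(...)`, so the callback is in place before the page starts requesting resources.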
