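<p>You have to use Selenium so that the content the page loads through JavaScript actually ends up in the HTML.
Then use Selenium's scrolling code to keep loading more of the page:</p>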
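<pre><code>import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
import pandas as pd
import time

# Selenium 4 style: the chromedriver path is passed through a Service object
driver = webdriver.Chrome(service=Service('/home/sush/Downloads/Compressed/chromedriver_linux64/chromedriver'))
driver.get('https://indianrecipes.com/new_and_popular')

heights = []
for i in range(1, 300):
    # find_element_by_css_selector was removed in Selenium 4.3; use find_element(By...)
    bg = driver.find_element(By.CSS_SELECTOR, 'body')
    time.sleep(0.1)
    bg.send_keys(Keys.END)  # jump to the bottom so the page loads more items
    heights.append(driver.execute_script("return document.body.scrollHeight"))
    # Every 16 scrolls, compare the first and last heights recorded in this batch;
    # if they are equal, nothing new is loading and we can stop.
    if i % 16 == 0:
        bottom = heights[i - 16]
        new_bottom = heights[i - 1]
        if bottom == new_bottom:
            break
</code></pre>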
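<p>If you want to reuse the scrolling logic, one option is to pull it into a small helper. This is only a sketch, and it is a variation on the loop above rather than the same algorithm: it stops after the height has stayed the same for a number of consecutive scrolls. The name <code>scroll_until_stable</code> and its parameters are made up for illustration, and the scroll and height lookups are passed in as callables so the helper does not depend on a running browser.</p>

```python
import time
from typing import Callable


def scroll_until_stable(
    scroll: Callable[[], None],
    get_height: Callable[[], int],
    max_scrolls: int = 300,
    stable_checks: int = 16,
    pause: float = 0.1,
) -> int:
    """Scroll until the page height stops changing.

    Returns the number of scrolls performed. Hypothetical helper, not part
    of Selenium: `scroll` triggers one scroll, `get_height` reads the
    current page height.
    """
    unchanged = 0
    last_height = get_height()
    for count in range(1, max_scrolls + 1):
        scroll()
        time.sleep(pause)  # give the page a moment to load new content
        height = get_height()
        if height == last_height:
            unchanged += 1
            if unchanged >= stable_checks:
                return count  # height was stable long enough, stop
        else:
            unchanged = 0  # page grew, so reset the counter
            last_height = height
    return max_scrolls


# Possible usage with Selenium 4 (assumes `driver` is already open on the page):
# body = driver.find_element(By.CSS_SELECTOR, "body")
# scroll_until_stable(
#     scroll=lambda: body.send_keys(Keys.END),
#     get_height=lambda: driver.execute_script("return document.body.scrollHeight"),
# )
```

<p>Resetting the counter whenever the page grows means a slow page is less likely to be cut off early, at the cost of a few extra scrolls at the end.</p>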
<p>Then use BeautifulSoup to parse the HTML from <code>driver.page_source</code>:</p>
<p><code>soup = BeautifulSoup(driver.page_source, 'lxml')</code></p>
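<p>From there you pull out whatever you need with the usual BeautifulSoup selectors. A rough sketch follows. The sample markup, the <code>div.recipe</code> selector and the recipe names are all invented for illustration, since the real page structure is not shown here. It uses the built-in <code>html.parser</code> so it runs without <code>lxml</code>, and a literal string stands in for <code>driver.page_source</code>.</p>

```python
from bs4 import BeautifulSoup

# Stand-in for driver.page_source; this markup and its class names are made up.
page_source = """
<html><body>
  <div class="recipe"><a href="/recipes/1">Dal Tadka</a></div>
  <div class="recipe"><a href="/recipes/2">Palak Paneer</a></div>
</body></html>
"""

soup = BeautifulSoup(page_source, "html.parser")

# Collect the link text and href of every (hypothetical) recipe card.
recipes = [
    {"title": link.get_text(strip=True), "url": link["href"]}
    for link in soup.select("div.recipe a")
]
print(recipes)
```

<p>On the real site you would inspect the page in the browser's developer tools and swap in the selectors that match its markup.</p>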