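<p>The easiest way is to go through the API, but you can also get the data from the <code>&lt;script&gt;</code> tags. Note that not every item has a <code>full_screen</code> link:</p>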
<p><strong>With the <code>&lt;script&gt;</code> tags:</strong></p>
<pre><code>import json

import requests
from bs4 import BeautifulSoup


def getResponse(url):
    # Keep retrying until the page downloads and parses.
    while True:
        try:
            page = requests.get(url)
            soup = BeautifulSoup(page.text, 'html.parser')
            return soup
        except requests.RequestException:
            print("retrying...")


url = "https://www.propertyfinder.ae/en/rent/apartment-for-rent-dubai-dubai-marina-botanica-tower-7469382.html"
soup = getResponse(url)
script = soup.find_all("script")

# The JSON payload sat in the eighth script tag; this index depends on the page layout.
jsonStr = script[7].text.split('payload: ')[-1].split(';')[0].rsplit('}', 1)[0]
val = json.loads(jsonStr)

properties = val['included']
for prop in properties:
    # Only some of the included items carry a full_screen link.
    if 'links' in prop and 'full_screen' in prop['links']:
        print(prop['links']['full_screen'])
</code></pre>
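<p>The <code>split(';')</code> step above cuts the text at the first semicolon, which would truncate the JSON if any string value happened to contain one. As a hedged alternative, here is a minimal sketch that lets the standard library's <code>json.JSONDecoder.raw_decode</code> find the end of the object instead. The helper name <code>extract_payload</code> and the sample script text are made up for illustration; the real page's script body may look different.</p>

```python
import json


def extract_payload(script_text, marker="payload:"):
    """Return the JSON value that follows `marker`, or None if it is absent or invalid."""
    start = script_text.find(marker)
    if start == -1:
        return None
    start += len(marker)
    # raw_decode does not skip leading whitespace, so step past it first.
    while start < len(script_text) and script_text[start].isspace():
        start += 1
    try:
        # Parses one JSON value starting at `start` and ignores whatever follows it.
        obj, _end = json.JSONDecoder().raw_decode(script_text, start)
    except json.JSONDecodeError:
        return None
    return obj


# Hypothetical script body (not copied from the site); note the ';' inside a string value.
sample = 'window.app = { payload: {"included": [{"links": {"full_screen": "a;b.jpg"}}]}, other: 1 };'
payload = extract_payload(sample)
print(payload["included"][0]["links"]["full_screen"])
```

<p>With the page from the example above, one could call this helper on the <code>.text</code> of each tag returned by <code>soup.find_all("script")</code> and keep the first result that is not <code>None</code>, which would also avoid relying on the tag's position.</p>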
<p><strong>Using the API:</strong></p>
<pre><code>import requests

url = 'https://www.propertyfinder.ae/en/api/search'

# Query parameters for the search endpoint.
payload = {
    'filter[category_id]': '2',
    'filter[furnished]': '0',
    'filter[locations_ids][]': '3037',
    'filter[price_type]': 'y',
    'filter[property_type_id]': '1',
    'page[limit]': '9999',
    'page[number]': '1',
    'sort': 'mr',
    'include': 'properties,properties.property_type,properties.property_images,properties.location_tree,properties.agent,properties.broker,smart_ads,smart_ads.agent,smart_ads.broker,smart_ads.property_type,smart_ads.property_images,smart_ads.location_tree,direct_from_developer,direct_from_developer.property_type,direct_from_developer.property_images,direct_from_developer.location_tree,direct_from_developer.agent,direct_from_developer.broker,cts,cts.agent,cts.broker,cts.property_type,cts.property_images,cts.location_tree,similar_properties,similar_properties.agent,similar_properties.broker,similar_properties.property_type,similar_properties.property_images,similar_properties.location_tree,agent_smart_ads,agent_smart_ads.broker,agent_smart_ads.languages,agent_properties_smart_ads,agent_properties_smart_ads.agent,agent_properties_smart_ads.broker,agent_properties_smart_ads.location_tree,agent_properties_smart_ads.property_type',
}

val = requests.get(url, params=payload).json()

properties = val['included']
for prop in properties:
    # Only some of the included items carry a full_screen link.
    if 'links' in prop and 'full_screen' in prop['links']:
        print(prop['links']['full_screen'])
</code></pre>
<p><strong>Output:</strong></p>
<pre><code>https://www.propertyfinder.ae/property/eaefddc999df314f589016fbb9df0c1e/1312/894/MODE/62d644/7468078-5a70co.jpg
https://www.propertyfinder.ae/property/9dae7d23cc50000baa36f55cde632fec/1312/894/MODE/b59dc8/7468078-d15a9o.jpg
https://www.propertyfinder.ae/property/2aaaa29b083099436f8dea3d018ba0f0/1312/894/MODE/94e390/7468078-84666o.jpg
https://www.propertyfinder.ae/property/a6ab186660629c4b6d494c2e66bd2b71/1312/894/MODE/97a052/7468078-eb879o.jpg
....
</code></pre>
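<p>Both versions end with the same loop over <code>val['included']</code>, so that filtering step could be pulled into one small function. The following is only a sketch: the function name <code>full_screen_links</code> and the sample items (including their <code>type</code> values and example.com URLs) are invented for illustration and are not taken from the real response.</p>

```python
def full_screen_links(included):
    """Collect the full_screen URL from every item that has one."""
    links = []
    for item in included:
        # Items without a 'links' key, or without 'full_screen' inside it, are skipped.
        url = (item.get("links") or {}).get("full_screen")
        if url:
            links.append(url)
    return links


# Hypothetical data shaped like the 'included' list used above.
sample_included = [
    {"type": "property_image", "links": {"full_screen": "https://example.com/1.jpg"}},
    {"type": "agent", "links": {"self": "https://example.com/agent"}},
    {"type": "location"},
]
print(full_screen_links(sample_included))
```

<p>Returning a list instead of printing inside the loop makes it easier to reuse the URLs afterwards, for example to count them or write them to a file.</p>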