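<p>It looks like the attribute you are reading is <code>href</code>, but the <code>&lt;img&gt;</code> tag you are trying to scrape has no <code>href</code> attribute. It has a <code>src</code> attribute, and that is where the link lives. By the way, pass the long HTML you posted as the <code>html</code> argument:</p>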
<pre><code>from bs4 import BeautifulSoup

def queryNewBalance(html):
    # html is the long HTML string from your question
    #r = requests.get('https://www.newbalance.com/men/shoes/basketball/?prefn1=color&prefv1=Black%7CBlue&srule=null')
    soup = BeautifulSoup(html, 'html.parser')
    result = soup.find_all('div', class_='product w-100')
    for res in result:
        print("*******************************")
        # The image link is in the src attribute, not href
        print(res.find('img', class_='tile-image ls-is-cached lazyloaded')['src'])  # Picture
        print("*******************************")
    print(f"\nFound total shoes: {len(result)}")

queryNewBalance(html)
</code></pre>
<p>Output:</p>
<pre><code>*******************************
https://nb.scene7.com/is/image/NB/bbomnlwb_nb_02_i?$pdpflexf2$&wid=440&hei=440
*******************************
Found total shoes: 1
[Finished in 0.7s]
</code></pre>
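<p>To make the <code>href</code> versus <code>src</code> point concrete, here is a minimal sketch on a made-up one-tag document (the <code>example.com</code> URL and the <code>sample_html</code> name are placeholders, not taken from the New Balance page). Indexing a tag with an attribute it does not have raises <code>KeyError</code>, while <code>.get()</code> returns <code>None</code>:</p>

```python
from bs4 import BeautifulSoup

# Hypothetical one-tag document, loosely modelled on the tile image above.
sample_html = '<img class="tile-image" src="https://example.com/shoe.jpg">'

img = BeautifulSoup(sample_html, "html.parser").find("img")

print(img["src"])       # the attribute exists, so indexing works
print(img.get("href"))  # .get() returns None for a missing attribute

try:
    img["href"]         # indexing a missing attribute raises KeyError
    href_failed = False
except KeyError as exc:
    href_failed = True
    print("KeyError:", exc)
```

<p>Using <code>.get()</code> is handy when you are not sure every tile carries the attribute you want.</p>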
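<p><strong>Fetching from the URL:</strong> when the page is downloaded with <code>requests</code>, the link is read from <code>data-src</code> and the class is matched as just <code>tile-image</code>.</p>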
<pre><code>from bs4 import BeautifulSoup
import requests

def queryNewBalance():
    r = requests.get('https://www.newbalance.com/men/shoes/basketball/?prefn1=color&prefv1=Black%7CBlue&srule=null')
    soup = BeautifulSoup(r.content, 'html.parser')
    result = soup.find_all('div', class_='product w-100')
    for res in result:
        print("*******************************")
        print(res.find('img', class_='tile-image')["data-src"])  # Picture
        print("*******************************")
    print(f"\nFound total shoes: {len(result)}")

queryNewBalance()
</code></pre>
<p>Output:</p>
<pre><code>*******************************
https://nb.scene7.com/is/image/NB/bbomnxbb_nb_02_i?$pdpflexf2$&wid=440&hei=440
*******************************
*******************************
https://nb.scene7.com/is/image/NB/bbomnlpl_nb_02_i?$pdpflexf2$&wid=440&hei=440
*******************************
*******************************
https://nb.scene7.com/is/image/NB/bbomnlwb_nb_02_i?$pdpflexf2$&wid=440&hei=440
*******************************
*******************************
https://nb.scene7.com/is/image/NB/bbomnlbr_nb_02_i_5a34b3da900d437a9a88?$pdpflexf2$&wid=440&hei=440
*******************************
*******************************
https://nb.scene7.com/is/image/NB/bbomnlfc_nb_02_i?$pdpflexf2$&wid=440&hei=440
*******************************
*******************************
https://nb.scene7.com/is/image/NB/bbomnlwt_nb_02_i?$pdpflexf2$&wid=440&hei=440
*******************************
Found total shoes: 6
[Finished in 2.9s]
</code></pre>
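<p>The two snippets above differ in both the class string and the attribute. As a rough sketch of one way to cover both cases, assuming the difference comes from a lazy-loading script that copies <code>data-src</code> into <code>src</code> in the browser, you could match on the single class <code>tile-image</code> and fall back from <code>src</code> to <code>data-src</code>. The helper <code>image_url</code> and the sample markup are made up for illustration:</p>

```python
from bs4 import BeautifulSoup

def image_url(img):
    """Return the image link from src, falling back to data-src."""
    return img.get("src") or img.get("data-src")

# Hypothetical markup: one tile as a browser might show it after lazy
# loading, and one as it might appear in the raw response.
sample_html = """
<div class="product w-100">
  <img class="tile-image ls-is-cached lazyloaded" src="https://example.com/a.jpg">
</div>
<div class="product w-100">
  <img class="tile-image" data-src="https://example.com/b.jpg">
</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")
urls = []
for product in soup.find_all("div", class_="product w-100"):
    # A single class name matches any tag that carries that class.
    img = product.find("img", class_="tile-image")
    if img is not None:
        urls.append(image_url(img))

print(urls)
```

<p>Passing the full string <code>'tile-image ls-is-cached lazyloaded'</code> only matches when the <code>class</code> attribute is exactly that string, which is why the shorter form is more forgiving.</p>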
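<p>P.S.
If you get more involved in web scraping and end up scraping lots of sites, especially large ones, I'd suggest switching the parser to <code>html5lib</code> (<code>pip install html5lib</code>). In my experience it is the more reliable parser: I have had trouble with <code>html.parser</code>, where it somehow did not pick up certain parts of a site even though I checked where they should have been in the soup object. Anyway, your call, and good luck.</p>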
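<p>If you want to try <code>html5lib</code> without breaking the script on machines where it is not installed, something like the following might work. It is only a sketch: <code>make_soup</code> is a hypothetical helper, and the markup is a made-up stand-in for a product tile.</p>

```python
from bs4 import BeautifulSoup, FeatureNotFound

def make_soup(markup, preferred="html5lib"):
    """Parse with the preferred parser, else fall back to html.parser."""
    try:
        return BeautifulSoup(markup, preferred)
    except FeatureNotFound:
        # Raised when the requested parser library is not installed.
        return BeautifulSoup(markup, "html.parser")

# Hypothetical markup standing in for one product tile.
soup = make_soup('<div class="product w-100"><img class="tile-image" src="a.jpg"></div>')
print(soup.find("img", class_="tile-image")["src"])
```

<p>Bear in mind that <code>html5lib</code> is noticeably slower than <code>html.parser</code>, so the trade-off is speed against leniency.</p>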