Python requests returns everything twice

Posted 2024-05-03 08:01:15


I am trying to write a program that fetches Spotify profile pictures. I can get the image URLs, but the problem is that each URL appears twice.

import requests
from bs4 import BeautifulSoup


# Spotify profile pages to scrape (renamed from `list`, which shadows the built-in)
profile_urls = ["https://open.spotify.com/user/0n7zzdkxmt0ldpo1kqugwca67",
                "https://open.spotify.com/user/1l23d3k5yq2v9ey191zp8uqxr",
]

for profile_url in profile_urls:
    response = requests.get(profile_url)

    html_content = response.content

    soup = BeautifulSoup(html_content, "html.parser")
    # Print the data-src attribute of every matching div
    for div in soup.find_all("div", {"class": "bg lazy-image"}):
        print(div.get("data-src"))

This is the result:

https://i.scdn.co/image/ab6775700000ee85202880a205b627a7e6f25659
https://i.scdn.co/image/ab6775700000ee85202880a205b627a7e6f25659
https://i.scdn.co/image/ab6775700000ee85da40dde3363ed185d5e48a0a
https://i.scdn.co/image/ab6775700000ee85da40dde3363ed185d5e48a0a

Process finished with exit code 0

My question is: if the URLs are identical, how can I print only one of them?


3 answers

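In this case, just convert the iterable to a set:

    for i in set(soup.find_all("div", {"class": "bg lazy-image"})):
        print(i.get("data-src"))

This removes all duplicates from the iterable. Note that a set has no guaranteed iteration order, so the URLs may not print in page order.

I strongly recommend reading about Python's data structures:

https://docs.python.org/3/tutorial/datastructures.html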

I would convert them to a set to remove the duplicates:

divs = soup.find_all("div", {"class": "bg lazy-image"})
urls = {d.get("data-src") for d in divs}
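
A set does not keep the order in which the URLs appeared on the page. If the order matters, one possible sketch is `dict.fromkeys`, which keeps the first occurrence of each key in insertion order (guaranteed since Python 3.7). Here `unique_in_order` is a hypothetical helper name, and `data_src_values` is a stand-in for the values returned by `d.get("data-src")`, filled with the URLs from the question's output so that the example runs without a network request.

```python
def unique_in_order(values):
    """Return the distinct values, keeping the order of first appearance."""
    # dict keys are unique and remember insertion order (Python 3.7+)
    return list(dict.fromkeys(values))


# Stand-in for [d.get("data-src") for d in divs], taken from the question's output
data_src_values = [
    "https://i.scdn.co/image/ab6775700000ee85202880a205b627a7e6f25659",
    "https://i.scdn.co/image/ab6775700000ee85202880a205b627a7e6f25659",
    "https://i.scdn.co/image/ab6775700000ee85da40dde3363ed185d5e48a0a",
    "https://i.scdn.co/image/ab6775700000ee85da40dde3363ed185d5e48a0a",
]

for url in unique_in_order(data_src_values):
    print(url)
```

With the sample data this prints each of the two image URLs once, in the order they first appeared.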

A simple solution is to check whether the URL is equal to the last URL:

import requests
from bs4 import BeautifulSoup


profile_urls = ["https://open.spotify.com/user/0n7zzdkxmt0ldpo1kqugwca67",
                "https://open.spotify.com/user/1l23d3k5yq2v9ey191zp8uqxr",
]

for profile_url in profile_urls:
    response = requests.get(profile_url)
    html_content = response.content

    last_url = None  # most recently printed URL for this profile
    soup = BeautifulSoup(html_content, "html.parser")
    for div in soup.find_all("div", {"class": "bg lazy-image"}):
        url = div.get("data-src")
        # Only consecutive duplicates are skipped
        if url != last_url:
            last_url = url
            print(url)
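
Comparing against the last URL only removes duplicates that sit next to each other. If the same URL could come back later on the page, a `seen` set is one way to catch that as well. The sketch below contrasts the two approaches; the function names and the `sample` values are made up for illustration and stand in for the `data-src` values.

```python
def drop_adjacent_duplicates(values):
    """Mimic the 'compare with the last URL' check from the answer above."""
    last = None
    for value in values:
        if value != last:
            last = value
            yield value


def drop_all_duplicates(values):
    """Skip every value that has been seen before, adjacent or not."""
    seen = set()
    for value in values:
        if value not in seen:
            seen.add(value)
            yield value


# Made-up sample: "img-a" comes back after "img-b"
sample = ["img-a", "img-a", "img-b", "img-a"]

print(list(drop_adjacent_duplicates(sample)))  # ['img-a', 'img-b', 'img-a']
print(list(drop_all_duplicates(sample)))       # ['img-a', 'img-b']
```

For the output shown in the question, where the duplicates are always adjacent, both approaches give the same result.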
