通过URL从excel下载web图像并保存到Python中的文件夹中

2024-06-26 14:26:54 发布

您现在位置:Python中文网/ 问答频道 /正文

我有一个excel文件,如下所示:

import pandas as pd
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)
pd.options.display.max_colwidth

df = pd.read_excel("./test.xlsx")
print(df)

输出:

^{pr2}$

我需要通过读取和迭代列imgUrl来下载图像,并将图像保存到路径combinebycolumnscity, buildingName, buildingId, imgType.

最终输出文件夹和子文件夹的结构如下所示,它们将保存在名为output的文件夹中:

├── bj
│   └── LG tower_123456
│       ├── inside
│       │   └── 827780_144001014_2.jpg
│       └── outside
│           └── 20140321160157-391052318.jpg
├── gz
│   └── GM_123458
│       └── inside
│           └── 2634566_104534032717_2.jpg
├── sh
│   └── LXD_123457
│       └── inside
│           └── 20120720_fb680a57416b8d16bad2kO1kOUIzkNxO.jpg

我怎么能用Python做到这一点呢?提前谢谢你的帮助。在

我尝试下载一张图片:

import requests

r = requests.get("http://pic1.to8to.com/case/day_120720/20120720_fb680a57416b8d16bad2kO1kOUIzkNxO.jpg")
if r.status_code == 200:
    with open("test.jpg", "wb") as f:
        f.write(r.content)

Tags: test图像import文件夹dfasdisplayexcel
3条回答
import pandas as pd
import requests


def download_urls(csv_path):
    df = pd.read_csv(csv_path,encoding='utf-8',error_bad_lines=False)
    for index, row in df.iterrows():
        folder  = row[0]
        sub_folder = row[1]
        url = row[3]
        r = requests.get(url)
        if r.status_code == 200:
            with open("/{0}/{1}/{2}".format(folder, sub_folder, url.split("/")[-1]), "wb") as f:
                f.write(r.content)

path = r"C:\path\your_csv_path"
download_urls(path)

假设您有csv文件作为输入,那么没有优雅的方式用pandas迭代行,所以您可以使用csv库

import pandas as pd
import requests
import os

def download_urls(csv_path):
    df = pd.read_csv(csv_path,encoding='utf-8',error_bad_lines=False)
    for index, row in df.iterrows():
        folder  = row[0]
        sub_folder = row[1]
        url = row[3]
        r = requests.get(url)
        if r.status_code == 200:
            if not os.path.exists(folder):
                os.makedirs(folder)
                if not os.path.exists(sub_folder):
                    os.makedirs(sub_folder)

            with open("/{0}/{1}/{2}".format(folder, sub_folder, url.split("/")[-1]), "wb") as f:
                f.write(r.content)

path = r"C:\path\your_csv_path"
download_urls(path)

如果不存在打开文件夹,请尝试此操作(只运行“打开目录”)

假设加载了数据帧,可以执行类似的操作。在

    import requests
    from os.path import join
    for index, row in df.iterrows():
        url = row['url']
        file_name = url.split('/')[-1]
        r = requests.get(url)
        abs_file_name = join(row['city'],row['buildingName']+str(row['buildingId']),row['imgType'],file_name)
        if r.status_code == 200:
            with open(abs_file_name, "wb") as f:
                f.write(r.content)

编辑代码:

^{pr2}$

相关问题 更多 >