How to programmatically download specific files from Google Drive using Python

Posted 2024-10-01 09:28:26


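I have about 100,000 files spread across different folders on Google Drive, and I want to download specific files from among them. The Google Drive paths of those files are listed in a CSV.

But how can I get the IDs of the files? I tried the following: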

import pandas as pd
from apiclient import errors
#from pygdrive3 import service


def retrieve_all_files(service):
  """Retrieve a list of File resources.

  Args:
    service: Drive API service instance.
  Returns:
    List of File resources.
  """
  result = []
  page_token = None
  while True:
    try:
      param = {}
      if page_token:
        param['pageToken'] = page_token
      files = service.files().list(**param).execute()

      #result.extend(files['items'])
      idval = files.get('id')
      if not idval:
        break
    except errors.HttpError.error:
      print ('An error occurred: %s' % error)
      break
  return idval


df = pd.read_csv("/home/ram/Downloads/Data_Science/Kaggle Competition/BBox_List_2017_path_colab.csv",header=None)
print(df.head())
for i in df[0]:
    request = drive_service.files()
    result = retrieve_all_files(request)
    fh = io.BytesIO()
    downloader = MediaIoBaseDownload(fh, request)
    done = False
    while done is False:
        status, done = downloader.next_chunk()
        print ("Download %d%%." % int(status.progress() * 100))

But I get the error: drive_service is not defined. Below is my CSV:

                                                   0           1  ...           4            5
0  /content/drive/My Drive/nihxray/images_001/ima...  225.084746  ...   79.186441  Atelectasis
1  /content/drive/My Drive/nihxray/images_001/ima...  686.101695  ...  313.491525  Atelectasis
2  /content/drive/My Drive/nihxray/images_001/ima...  221.830508  ...  216.949153  Atelectasis
3  /content/drive/My Drive/nihxray/images_001/ima...  726.237288  ...   55.322034  Atelectasis
4  /content/drive/My Drive/nihxray/images_001/ima...  660.067797  ...   78.101695  Atelectasis

I want to download only the files listed in the CSV above. How can I do this with Python? Any help would be appreciated.


2 Answers

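There is an easier way to approach this. Once Python and GAM are installed, you can run a script that exports every document in a list, using the Google Drive file IDs stored in a CSV file. The script needs a few additional Python modules to work; when you run it, you can search for any error messages to find out what still needs to be installed. You will also need to create an API credential service account and substitute that account name in the two places in the script where it appears. Open cmd as administrator and run the script (here named script.py) with the following command:

C:\Users\dcahoon\AppData\Local\Programs\Python\Python38\Python.exe C:\GAM\SCRIPT.PY

Script start: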

import os
import subprocess

from csv import writer
from csv import reader

# path to googleidlist.csv
csvfile = 'c:\\GAM\\googleidlist.csv'
destination = 'c:\\GAM\\OUTPUT\\'      #Destination for downloaded documents


# Open the input_file in read mode and output_file in write mode
with open(csvfile, 'r') as read_obj, \
        open('output_1.txt', 'w', newline='') as write_obj:
    # Create a csv.reader object from the input file object
    csv_reader = reader(read_obj)
    # Create a csv.writer object from the output file object
    csv_writer = writer(write_obj)
    # Read each row of the input csv file as list
    for row in csv_reader:
        if not row:
            continue  # skip blank lines in the CSV
        file_id = row[0]
        # Download the file with GAM and wait for the command to finish
        outcome = subprocess.run(['gam', 'user', 'googleserviceaccountname', 'get', 'drivefile', 'id', file_id, 'targetfolder', destination], capture_output=True, text=True)
        # Look up the file name with GAM
        filename = subprocess.run(['gam', 'user', 'googleserviceaccountname', 'show', 'fileinfo', file_id, 'name'], capture_output=True, text=True)
        # Keep the first line of each command's output as text (not bytes)
        output = outcome.stdout.split('\n', 1)[0]
        file_name = filename.stdout.split('\n', 1)[0]
        print(output)
        # Append the GAM output, the file name, and the file ID to the row
        row.append(output)
        row.append(file_name)
        row.append(file_id)

        # Add the updated row / list to the output file
        csv_writer.writerow(row)
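
Note that the script above expects the CSV to contain file IDs, while the CSV in the question contains Colab-style paths. The following is only a rough sketch of one way to turn such a path into an ID with the Drive API v3, by looking up each path component inside its parent folder. It assumes `service` is a Drive v3 service object (for example from `googleapiclient.discovery.build("drive", "v3", credentials=creds)`) and that the paths start with `/content/drive/My Drive/` as in the question. The helper names `split_drive_path`, `build_query`, `resolve_file_id`, and `download_file` are made up for this illustration. If several items share a name in one folder, this sketch simply takes the first match.

```python
import io

# Prefix that Colab adds when Drive is mounted; assumed from the question's CSV.
COLAB_PREFIX = "/content/drive/My Drive/"


def split_drive_path(path, prefix=COLAB_PREFIX):
    """Return the folder/file names below "My Drive" for a mounted-Drive path."""
    if path.startswith(prefix):
        path = path[len(prefix):]
    return [part for part in path.split("/") if part]


def build_query(name, parent_id):
    """Build a Drive v3 search query for a child called `name` inside `parent_id`."""
    # Backslashes and single quotes must be escaped inside the query string.
    escaped = name.replace("\\", "\\\\").replace("'", "\\'")
    return f"name = '{escaped}' and '{parent_id}' in parents and trashed = false"


def resolve_file_id(service, path):
    """Walk the path one level at a time and return the ID of the last component.

    Returns None if any component cannot be found.
    """
    parent_id = "root"  # alias for the top level of "My Drive"
    for name in split_drive_path(path):
        response = service.files().list(
            q=build_query(name, parent_id),
            fields="files(id, name)",
            pageSize=1,
        ).execute()
        matches = response.get("files", [])
        if not matches:
            return None
        parent_id = matches[0]["id"]
    return parent_id


def download_file(service, file_id):
    """Download one file and return its bytes (needs google-api-python-client)."""
    from googleapiclient.http import MediaIoBaseDownload

    request = service.files().get_media(fileId=file_id)
    buffer = io.BytesIO()
    downloader = MediaIoBaseDownload(buffer, request)
    done = False
    while not done:
        _status, done = downloader.next_chunk()
    return buffer.getvalue()
```

With 100,000 files, one lookup per path component adds up, so caching the IDs of folders that have already been resolved would probably be worthwhile.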

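Here are two snippets from an asynchronous Google API client, which may suit you better because it lets you download multiple files at the same time:

Listing files (by ID): https://github.com/omarryhan/aiogoogle/blob/master/examples/list_drive_files.py

Downloading a file: https://github.com/omarryhan/aiogoogle/blob/master/examples/download_drive_file.py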
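
To illustrate the "multiple files at the same time" part, here is a minimal sketch of bounded concurrency with plain `asyncio`. It does not use the aiogoogle API itself: `fetch` is a hypothetical coroutine function that you would supply (for instance a wrapper around the download request from the linked example), and `fake_fetch`, `download_all`, and `max_concurrent` are names invented for this example.

```python
import asyncio


async def download_all(file_ids, fetch, max_concurrent=8):
    """Run `fetch(file_id)` for every ID, at most `max_concurrent` at a time.

    `fetch` is a hypothetical coroutine function supplied by the caller.
    Returns a dict mapping each file ID to its result, or to the exception
    raised while fetching it.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(file_id):
        async with semaphore:
            try:
                return file_id, await fetch(file_id)
            except Exception as exc:  # keep going if one download fails
                return file_id, exc

    pairs = await asyncio.gather(*(guarded(fid) for fid in file_ids))
    return dict(pairs)


async def fake_fetch(file_id):
    """Stand-in for a real download, used only to demonstrate the pattern."""
    await asyncio.sleep(0.01)
    if file_id == "bad":
        raise ValueError("not found")
    return f"contents of {file_id}"


if __name__ == "__main__":
    results = asyncio.run(download_all(["a", "b", "bad"], fake_fetch, max_concurrent=2))
    for file_id, outcome in results.items():
        print(file_id, outcome)
```

The semaphore caps how many requests are in flight at once, which matters with around 100,000 files because the Drive API enforces request quotas.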
