Answers to: How to get a list of folders in a given bucket using the Google Cloud API



1 Answer

Anonymous, 1 day ago


<p>I also need to simply list the contents of a bucket. Ideally, I would like something similar to what tf.gfile provides: tf.gfile can tell whether an entry is a file or a directory.</p>
<p>I tried the various links provided by @jterrace above, but my results were not optimal. That said, the results are worth showing.</p>
<p>Given a bucket that holds a mix of "directories" and "files", it is hard to navigate the "filesystem" to find items of interest. I have added comments in the code on how the code referenced above works.</p>
<p>In both cases I am using a datalab notebook, with credentials supplied by the notebook. Given the results, I will need to use string parsing to determine which files are in a particular directory. If anyone knows how to extend these methods, or an alternative method that parses directories the way tf.gfile does, please reply.</p>
<h2>Method One</h2>
<pre><code>import json

import googleapiclient.discovery

BUCKET = 'bucket-sounds'


def create_service():
    return googleapiclient.discovery.build('storage', 'v1')


def list_bucket(bucket):
    """Returns a list of metadata of the objects within the given bucket."""
    service = create_service()

    # Create a request to objects.list to retrieve a list of objects.
    fields_to_return = 'nextPageToken,items(name,size,contentType,metadata(my-key))'

    # Variants tried, with what each one returned:
    # returns everything
    #req = service.objects().list(bucket=bucket, fields=fields_to_return)
    # returns everything. UrbanSound is top dir in bucket
    #req = service.objects().list(bucket=bucket, fields=fields_to_return, prefix='UrbanSound')
    # returns the file FREESOUNDCREDITS.TXT
    #req = service.objects().list(bucket=bucket, fields=fields_to_return, prefix='UrbanSound/FREE')
    # same as above
    #req = service.objects().list(bucket=bucket, fields=fields_to_return, prefix='UrbanSound/FREESOUNDCREDITS.txt', delimiter='/')
    # returns nothing (no trailing slash on the prefix)
    #req = service.objects().list(bucket=bucket, fields=fields_to_return, prefix='UrbanSound/data/dog_bark', delimiter='/')
    # returns files in dog_bark dir
    req = service.objects().list(bucket=bucket, fields=fields_to_return, prefix='UrbanSound/data/dog_bark/', delimiter='/')

    all_objects = []
    # If you have too many items to list in one request, list_next() will
    # automatically handle paging with the pageToken.
    while req:
        resp = req.execute()
        all_objects.extend(resp.get('items', []))
        req = service.objects().list_next(req, resp)
    return all_objects


# usage
print(json.dumps(list_bucket(BUCKET), indent=2))
</code></pre>
<p>This produces results like this:</p>
<pre><code>[
  {
    "contentType": "text/csv",
    "name": "UrbanSound/data/dog_bark/100032.csv",
    "size": "29"
  },
  {
    "contentType": "application/json",
    "name": "UrbanSound/data/dog_bark/100032.json",
    "size": "1858"
  }
stuff snipped]
</code></pre>
<h2>Method Two</h2>
<pre><code>import sys

from google.cloud import storage

BUCKET = 'bucket-sounds'

# Create a Cloud Storage client.
gcs = storage.Client()


def my_list_bucket(bucket_name, limit=sys.maxsize):
    """Prints the names of up to `limit` objects in the given bucket."""
    # lookup_bucket() returns None if the bucket does not exist.
    a_bucket = gcs.lookup_bucket(bucket_name)
    if a_bucket is None:
        return
    bucket_iterator = a_bucket.list_blobs()
    for resource in bucket_iterator:
        print(resource.name)
        limit = limit - 1
        if limit <= 0:
            break


my_list_bucket(BUCKET, limit=5)
</code></pre>
<p>This generates output like this:</p>
<pre><code>UrbanSound/FREESOUNDCREDITS.txt
UrbanSound/UrbanSound_README.txt
UrbanSound/data/air_conditioner/100852.csv
UrbanSound/data/air_conditioner/100852.json
UrbanSound/data/air_conditioner/100852.mp3
</code></pre>
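The string parsing mentioned in the answer could look something like the following. This is only a minimal sketch: `split_listing` is a hypothetical helper name, it assumes `/` is the only separator, and it works purely on object names such as those printed by either method, without calling the API itself.

```python
def split_listing(names, prefix=""):
    """Split object names under `prefix` into immediate "directories" and "files".

    Hypothetical helper: it only parses strings and makes no API calls.
    """
    # Treat a non-empty prefix as a directory path ending in "/".
    if prefix and not prefix.endswith("/"):
        prefix += "/"

    dirs, files = set(), []
    for name in names:
        if not name.startswith(prefix):
            continue  # object lives outside the requested "directory"
        rest = name[len(prefix):]
        if not rest:
            continue  # placeholder object for the directory itself
        head, sep, _ = rest.partition("/")
        if sep:
            dirs.add(prefix + head + "/")  # something is nested one level down
        else:
            files.append(name)  # object sits directly inside the prefix
    return sorted(dirs), sorted(files)


# Sample names taken from the outputs shown in the answer.
names = [
    "UrbanSound/FREESOUNDCREDITS.txt",
    "UrbanSound/UrbanSound_README.txt",
    "UrbanSound/data/air_conditioner/100852.csv",
    "UrbanSound/data/air_conditioner/100852.json",
    "UrbanSound/data/dog_bark/100032.csv",
]
dirs, files = split_listing(names, "UrbanSound")
print(dirs)   # immediate "subdirectories" of UrbanSound/
print(files)  # objects directly inside UrbanSound/
```

Feeding it the names collected by Method Two (for example `[blob.name for blob in a_bucket.list_blobs()]`) gives a directory-style view at any prefix. For large buckets this lists every object first, so it is best seen as a convenience for small listings rather than a replacement for server-side filtering with `prefix` and `delimiter`.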
