GCP Cloud Function for a Python script that outputs a data file

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import tweepy
import datetime
import csv

def fetchTweets(username):
    # credentials from https://apps.twitter.com/
    consumerKey = "" # hidden for security reasons
    consumerSecret = "" # hidden for security reasons
    accessToken = "" # hidden for security reasons
    accessTokenSecret = "" # hidden for security reasons

    auth = tweepy.OAuthHandler(consumerKey, consumerSecret)
    auth.set_access_token(accessToken, accessTokenSecret)

    api = tweepy.API(auth)

    startDate = datetime.datetime(2019, 1, 1, 0, 0, 0)
    endDate = datetime.datetime.now()
    print (endDate)

    tweets = []
    tmpTweets = api.user_timeline(username)
    for tweet in tmpTweets:
        if tweet.created_at < endDate and tweet.created_at > startDate:
            tweets.append(tweet)

    lastid = ""
    while (tmpTweets[-1].created_at > startDate and tmpTweets[-1].id != lastid):
        print("Last Tweet @", tmpTweets[-1].created_at, " - fetching some more")
        lastid = tmpTweets[-1].id
        tmpTweets = api.user_timeline(username, max_id = tmpTweets[-1].id)
        for tweet in tmpTweets:
            if tweet.created_at < endDate and tweet.created_at > startDate:
                tweets.append(tweet)

    # # for CSV
    #transform the tweepy tweets into a 2D array that will populate the csv
    outtweets = [[tweet.id_str, tweet.created_at, tweet.text.encode("utf-8")] for tweet in tweets]

    #write the csv
    with open('%s_tweets.csv' % username, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(["id","created","text"])
        writer.writerows(outtweets)
    pass

    f = open('%s_tweets.csv' % username, "r")
    contents = f.read()
    return contents

fetchTweets('usernameofusertoretrieve') # this will be set manually in production

1 answer

User

#1 · Posted 2024-10-17 06:21:53

Without more details, your question is hard to answer. However, I will try to offer some insights.

Is GCP Cloud Functions the correct tool for the job? Or will this require something more extensive and therefore a GCP VM instance?

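It depends. With 1 CPU, does your processing take less than 9 minutes? Does your process use less than 2 GB of memory (application memory footprint + file size + array size)?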

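Why count the file size? Because only the /tmp directory is writable, and it is an in-memory file system, so anything written there consumes the function's memory.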
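As a rough illustration of that constraint, here is a minimal sketch of the CSV step writing under the temp directory instead of the working directory. The helper name `write_tweets_csv` and the `rows` argument are my own, not part of the original script, and I am assuming `tempfile.gettempdir()` resolves to /tmp in the function's runtime, as it usually does on Linux.

```python
import csv
import os
import tempfile


def write_tweets_csv(username, rows):
    """Write rows to <tmpdir>/<username>_tweets.csv and return the file's text."""
    # Assumption: the temp directory is /tmp in the Cloud Functions runtime.
    path = os.path.join(tempfile.gettempdir(), "%s_tweets.csv" % username)

    # Write the header and the data rows.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "created", "text"])
        writer.writerows(rows)

    # Read the file back; the context manager closes it for us.
    with open(path, "r", newline="", encoding="utf-8") as f:
        return f.read()
```

Since the file lives in memory, it may be worth deleting it with `os.remove(path)` once the contents have been returned or uploaded.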

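If you need a timeout of up to 15 minutes, you can look at Cloud Run, which is very similar to Cloud Functions and which I personally prefer. Cloud Functions and Cloud Run have the same CPU and memory limits (but this should change in 2020, with more CPU and memory).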

What would need to be changed in the code to make it run on GCP?

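Start by writing to and reading from the /tmp directory. Then, if you want the file to be available all day, store it in Cloud Storage (https://cloud.google.com/storage/docs) and retrieve it at the start of the function. If it does not exist, generate it for the current day; otherwise, fetch the existing one.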
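The "generate once per day, otherwise reuse" idea could look roughly like the sketch below, using the `google-cloud-storage` client library. The object naming scheme, the function names, and the `generate` callback (standing in for the tweet-fetching code) are assumptions for illustration; the bucket must already exist and the function's service account needs read and write access to it.

```python
import datetime


def daily_object_name(username, day=None):
    """Build an object name that changes once per day."""
    day = day or datetime.date.today()
    return "tweets/%s/%s.csv" % (username, day.isoformat())


def get_or_create_daily_csv(bucket_name, username, generate):
    """Return today's CSV from Cloud Storage, generating and uploading it if missing."""
    # Imported lazily so daily_object_name works without the client library installed.
    from google.cloud import storage

    blob = storage.Client().bucket(bucket_name).blob(daily_object_name(username))
    if blob.exists():
        # Already generated today: reuse the stored copy.
        return blob.download_as_text()

    # Not there yet: build it (generate is a hypothetical callback) and store it.
    contents = generate(username)
    blob.upload_from_string(contents, content_type="text/csv")
    return contents
```

Uploading the string directly also avoids keeping a second copy of the file in /tmp.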

Then replace the function signature def fetchTweets(username): with def fetchTweets(request): and read the username from the request parameters.
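For an HTTP-triggered Python function, the `request` argument is a Flask request object, so the change might look like this sketch. `build_csv` is a hypothetical stand-in for the body of the original `fetchTweets`, and the parameter name `username` is my assumption.

```python
def build_csv(username):
    """Hypothetical stand-in for the original tweet-fetching and CSV-building code."""
    return "id,created,text\r\n"


def fetch_tweets_http(request):
    """HTTP entry point: read 'username' from the query string or the JSON body."""
    # Query string first, e.g. ?username=someuser
    username = request.args.get("username")
    if not username:
        # Fall back to a JSON body such as {"username": "someuser"}
        body = request.get_json(silent=True) or {}
        username = body.get("username")
    if not username:
        # A (body, status) tuple is a valid Flask response.
        return ("Missing 'username' parameter", 400)
    return build_csv(username)
```

The function name given at deploy time as the entry point would then be `fetch_tweets_http` rather than `fetchTweets`.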

Finally, if you want the file generated every day, set up Cloud Scheduler to call the function on a daily schedule.

You didn't mention security. I recommend deploying your function in private mode.
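Once the function requires authentication, callers have to send a Google-signed ID token in the `Authorization` header. A minimal sketch of a Python caller, assuming the `google-auth` and `requests` packages are installed and that the caller's credentials are allowed to invoke the function; the helper names are mine:

```python
import urllib.request


def build_authorized_request(url, token):
    """Attach the ID token as a bearer token to a request for the function URL."""
    return urllib.request.Request(url, headers={"Authorization": "Bearer %s" % token})


def call_private_function(url):
    """Fetch an ID token for the function URL and call the function with it."""
    # Imported lazily so build_authorized_request works without google-auth installed.
    import google.auth.transport.requests
    import google.oauth2.id_token

    auth_request = google.auth.transport.requests.Request()
    # The audience of the token is the URL of the function being called.
    token = google.oauth2.id_token.fetch_id_token(auth_request, url)

    with urllib.request.urlopen(build_authorized_request(url, token)) as response:
        return response.read().decode("utf-8")
```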

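So, there are a lot of GCP serverless concepts in this answer, and I don't know how familiar you are with GCP. If you want more detail on any part, don't hesitate to ask.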
