I've been retrieving real-time Twitter data and want to extract certain fields of each tweet into a CSV, one row per tweet: the tweet ID and the tweet text. Everything works fine until the tweet text is appended to the CSV, at which point multiple tweets suddenly end up inside a single tweet-text cell.
When I print the tweet ID, counter, and text, all 12,000 records print out one per line, exactly as expected.
In the CSV file, however, I lose about 200 records to this problem. I added a counter to identify and track where those 200 records go missing. I've been stuck for hours. Can anyone help me figure out what's wrong?
Here is my code:
import csv
import dateutil.parser

with client:
    with open('data.csv', 'w', newline='', encoding='utf-8', errors='ignore') as file:
        fw = csv.writer(file, delimiter=',', quotechar='|', quoting=csv.QUOTE_MINIMAL)
        fw.writerow(['CollectionId', 'Counter', 'Text'])
        print('starting to append data to csv...')
        counter = 0
        print('appending Streaming data...')
        for stream in Streaming:
            streamTime = stream["created_at"]
            parseTime = dateutil.parser.parse(streamTime)
            # CollectionId
            if stream["id_str"]:
                collectionId = "'" + stream["id_str"] + "'"
                counter = counter + 1
            # cleanup text - to display in a single line
            streamText = stream["text"]
            streamText = streamText.split('\n')  # remove new lines in text
            streamText = " ".join(streamText)
            streamText = streamText.replace(',', ' ')  # replace commas in text
            # print("{}, {} - {}".format(collectionId, counter, streamText))
            fw.writerow([collectionId, counter, streamText])

print("Streaming Data has been exported to 'data.csv'")
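As an aside on the cleanup step above: `split('\n')` removes line feeds but leaves carriage returns (`\r`) in the text, and a stray `\r` can also break CSV rows; `str.splitlines()` strips both. A minimal sketch of the difference, using an in-memory buffer instead of a file (the sample text here is a made-up stand-in for a tweet):

```python
import csv
import io

raw = "first line\r\nsecond line"  # text containing a Windows-style newline

cleaned_split = " ".join(raw.split('\n'))   # '\r' survives this cleanup
cleaned_lines = " ".join(raw.splitlines())  # both '\r' and '\n' are removed

buf = io.StringIO()
fw = csv.writer(buf, delimiter=',', quotechar='|', quoting=csv.QUOTE_MINIMAL)
fw.writerow(['id1', 1, cleaned_lines])
print(buf.getvalue())  # one clean row: id1,1,first line second line
```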