Writing function output to a text/CSV file?

import urllib2,sys,os,csv
from bs4 import BeautifulSoup,NavigableString
from string import punctuation as p
from multiprocessing import Pool
import re, nltk
import requests
import math, functools
import summarize
reload(sys)

def processURL_short(l):
    open_url = urllib2.urlopen(l).read()
    item_soup = BeautifulSoup(open_url)
    item_div = item_soup.find('div',{'id':'transcript'},{'class':'displaytext'})
    item_str = item_div.text.lower()
    return item_str

every_link_test = ['http://www.millercenter.org/president/obama/speeches/speech-4427',
                   'http://www.millercenter.org/president/obama/speeches/speech-4424',
                   'http://www.millercenter.org/president/obama/speeches/speech-4453',
                   'http://www.millercenter.org/president/obama/speeches/speech-4612',
                   'http://www.millercenter.org/president/obama/speeches/speech-5502']

data = {}
count = 0
for l in every_link_test:
    content_1 = processURL_short(l)
    for word in content_1.split():
        word = word.strip(p)
        if word in contractions:
            count = count + 1
        splitlink = l.split("/")
        president = splitlink[4]
        speech_num = splitlink[-1]
        filename = "{0}_{1}".format(president,speech_num)
    data[filename] = count
    print count, filename

    with open('contraction_counts.csv','w',newline='') as fp:
        a = csv.writer(fp,delimiter = ',')
        a.writerows(data)

2 answers

User

Answer #1 · edited 2024-09-30 00:40:04

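Your problem is that you open the output file in w mode inside the loop, which means it is erased on every iteration. You can easily fix this in two ways:

Move the open outside the loop (the normal way). You open the file only once, add one row per iteration, and the file is closed when you leave the with block: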

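with open('contraction_counts.csv', 'w', newline='') as fp:
    a = csv.writer(fp, delimiter=',')
    for l in every_link_test:
        content_1 = processURL_short(l)
        for word in content_1.split():
            word = word.strip(p)
            if word in contractions:
                count = count + 1  # count is set to 0 before the loop, so this is a running total
        # Build the row label from the URL, e.g. "obama_speech-4427"
        splitlink = l.split("/")
        president = splitlink[4]
        speech_num = splitlink[-1]
        filename = "{0}_{1}".format(president, speech_num)
        data[filename] = count
        print(count, filename)
        # One row per speech; writerows(data) would split each dict key into characters
        a.writerow([filename, count])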

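Open the file in a (append) mode. On each iteration you reopen the file and write at the end instead of erasing it. Because of the repeated open/close this uses more I/O, so it is only worth using if the program may be interrupted and you want to be sure that everything written before a crash has actually been saved to disk: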

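for l in every_link_test:
    content_1 = processURL_short(l)
    for word in content_1.split():
        word = word.strip(p)
        if word in contractions:
            count = count + 1  # count is set to 0 before the loop, so this is a running total
    splitlink = l.split("/")
    president = splitlink[4]
    speech_num = splitlink[-1]
    filename = "{0}_{1}".format(president, speech_num)
    data[filename] = count
    print(count, filename)

    # Reopen in append mode so rows from earlier iterations are kept
    with open('contraction_counts.csv', 'a', newline='') as fp:
        a = csv.writer(fp, delimiter=',')
        # One row per speech; writerows(data) would split each dict key into characters
        a.writerow([filename, count])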
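
The effect of the file mode can be checked without any network access. The following is a minimal sketch, not the asker's program: the `rows` list and the helper names `write_truncating`, `write_once`, `write_appending`, and `read_rows` are made up for illustration, and a temporary directory stands in for the real output path.

```python
import csv
import os
import tempfile

# Hypothetical stand-in for the (filename, count) pairs built in the loop.
rows = [("obama_speech-4427", 3), ("obama_speech-4424", 7), ("obama_speech-4453", 2)]


def write_truncating(path):
    """Problem pattern: 'w' inside the loop truncates the file every time."""
    for row in rows:
        with open(path, "w", newline="") as fp:
            csv.writer(fp).writerow(row)


def write_once(path):
    """Fix 1: open once, write one row per iteration."""
    with open(path, "w", newline="") as fp:
        writer = csv.writer(fp)
        for row in rows:
            writer.writerow(row)


def write_appending(path):
    """Fix 2: reopen in 'a' mode on each iteration; earlier rows are kept."""
    for row in rows:
        with open(path, "a", newline="") as fp:
            csv.writer(fp).writerow(row)


def read_rows(path):
    """Read the CSV back as a list of rows (all fields come back as strings)."""
    with open(path, newline="") as fp:
        return list(csv.reader(fp))


with tempfile.TemporaryDirectory() as tmp:
    results = {}
    for name, func in [("truncating", write_truncating),
                       ("once", write_once),
                       ("appending", write_appending)]:
        path = os.path.join(tmp, name + ".csv")
        func(path)
        results[name] = read_rows(path)

print(len(results["truncating"]))  # 1: only the last row survives
print(len(results["once"]))        # 3
print(len(results["appending"]))   # 3
```

One design point to keep in mind: append mode also keeps rows from earlier runs of the script, so the file keeps growing unless it is removed before each run.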

User

Answer #2 · edited 2024-09-30 00:40:04

You can try something like this. It is a generic approach that you can modify as needed:

import csv

# Python 3: open in text mode with newline=''; on Python 2 use mode 'wb' instead
with open('somepath/file.txt', 'w', newline='') as outfile:
    w = csv.writer(outfile)
    w.writerow(['header1', 'header2'])
    # your_data_structure is a placeholder, e.g. a list of 2-item sequences
    for i in your_data_structure:
        w.writerow([i[0], i[1]])

Or, if it is a dictionary, the same pattern applies to its key/value pairs.
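
For the dictionary case, here is a minimal sketch under stated assumptions: the contents of `data`, the header names, and the temporary output path are all invented for illustration. Passing the dict itself to `writerows` would iterate over its keys and split each key string into single-character fields, so `data.items()` is used instead.

```python
import csv
import os
import tempfile

# Hypothetical dict in the shape the question builds: label -> count.
data = {"obama_speech-4427": 3, "obama_speech-4424": 7}

# A temporary directory stands in for the real output location.
out_dir = tempfile.mkdtemp()
out_path = os.path.join(out_dir, "contraction_counts.csv")

with open(out_path, "w", newline="") as outfile:
    w = csv.writer(outfile)
    w.writerow(["speech", "count"])  # header names are an assumption
    w.writerows(data.items())        # each (key, value) pair becomes one row

# Read the file back to inspect what was written.
with open(out_path, newline="") as infile:
    written = list(csv.reader(infile))

print(written)
```

The csv module writes the counts as text, so they come back as strings when the file is read again.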
