I want to write this output to a CSV file:
['https://www.lendingclub.com/loans/personal-loans' '6.16% to 35.89%']
['https://www.lendingclub.com/loans/personal-loans' '1% to 6%']
['https://www.marcus.com/us/en/personal-loans' '6.99% to 24.99%']
['https://www.marcus.com/us/en/personal-loans' '6.99% to 24.99%']
['https://www.marcus.com/us/en/personal-loans' '6.99% to 24.99%']
['https://www.marcus.com/us/en/personal-loans' '6.99% to 24.99%']
['https://www.marcus.com/us/en/personal-loans' '6.99% to 24.99%']
['https://www.discover.com/personal-loans/' '6.99% to 24.99%']
However, when I run the code that writes the output to a CSV, only the last line ends up in the CSV file:
['https://www.discover.com/personal-loans/' '6.99% to 24.99%']
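Could it be because my printed output isn't comma-separated? I tried to avoid putting commas in it by using a space as the delimiter. Let me know what you think. I'd appreciate some help with this, because I'm having the hardest time reshaping this collected data.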
# Imports inferred from the aliases used below (r, bs, np)
import csv
import datetime
import re

import numpy as np
import requests as r
from bs4 import BeautifulSoup as bs

plcompetitors = ['https://www.lendingclub.com/loans/personal-loans',
                 'https://www.marcus.com/us/en/personal-loans',
                 'https://www.discover.com/personal-loans/']

# Cycle through the links until an APR range (fixed or variable) is found using regex
for link in plcompetitors:
    cdate = datetime.date.today()
    l = r.get(link)
    l.encoding = 'utf-8'
    data = l.text
    soup = bs(data, 'html.parser')
    # Captures Discover's rate perfectly but catches too much for lightstream/prosper
    paragraph = soup.find_all(text=re.compile('[0-9]%'))
    for n in paragraph:
        matches = re.findall(r'(?i)\d+(?:\.\d+)?%\s*(?:to|-)\s*\d+(?:\.\d+)?%', n.string)
        try:
            irate = str(matches[0])
            array = np.asarray(irate)
            array2 = np.append(link, irate)
            array2 = np.asarray(array2)
            print(array2)
            #with open('test.csv', "w") as csv_file:
            #    writer = csv.writer(csv_file, delimiter=' ')
            #    for line in test:
            #        writer.writerow(line)
        except IndexError:
            pass
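pandas is convenient when working with CSV files. I stored each link and its irate value in a DataFrame, df2, and concatenated it onto a parent DataFrame, df. Finally, I wrote the parent DataFrame df to a CSV file.

I think the problem is that you are opening the file in write mode (the "w" in open('test.csv', "w")), which means Python overwrites whatever has already been written to the file each time it is opened. I think you are looking for append mode ("a"). Let me know if that doesn't work.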