How do I write a new column to a CSV file while web scraping?

import os, requests, csv
from bs4 import BeautifulSoup

# Get URL of the page
URL = ('https://www.tripadvisor.com/Attraction_Review-g294265-d2149128-Reviews-Gardens_by_the_Bay-Singapore.html')

with open('GardensbytheBay.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)

    # Loop over the first 3 pages of reviews
    for pagecounter in range(3):

        # Request the current page
        res = requests.get(URL)
        res.raise_for_status()

        # Parse the HTML of the current page
        soup = BeautifulSoup(res.text, "html.parser")

        # Match it to the specific tag for all 5 ratings
        reviewElems = soup.findAll('img', {'class': ['sprite-rating_s_fill rating_s_fill s50', 'sprite-rating_s_fill rating_s_fill s40', 'sprite-rating_s_fill rating_s_fill s30', 'sprite-rating_s_fill rating_s_fill s20', 'sprite-rating_s_fill rating_s_fill s10']})
        reviewWritten = soup.findAll('p', {'class':'partial_entry'})

        if reviewElems:
            for row, rows in zip(reviewElems, reviewWritten):
                review_text = row.attrs['alt'][0]
                review2_text = rows.get_text(strip=True).encode('utf8', 'ignore').decode('latin-1')
                writer.writerow([review_text])
                writer.writerow([review2_text])
            print('Writing page', pagecounter + 1)
        else:
            print('Could not find clue.')

        # Find URL of next page and update URL
        if pagecounter == 0:
            nextLink = soup.select('a[data-offset]')[0]
        elif pagecounter != 0:
            nextLink = soup.select('a[data-offset]')[1]
        URL = 'http://www.tripadvisor.com' + nextLink.get('href')

print('Download complete')

2 Answers

User

Answer #1 · Edited 2024-10-03 06:31:27

You can use a pandas DataFrame:

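import pandas as pd

csv_file = pd.read_csv('GardensbytheBay.csv')
# idx, colname and value are placeholders: the column position,
# the new column's header, and its contents (a scalar or one entry per row)
csv_file.insert(idx, colname, value)
csv_file.to_csv('output.csv', index=False)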
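
As a rough illustration of the `insert` call above, here is a minimal sketch that works on in-memory data instead of the real file. The sample ratings and review texts, the column names `rating` and `review`, and the helper `add_column` are all made up for the example. It also assumes the CSV has no header row (the scraper in the question never writes one), so the column name is supplied explicitly when reading.

```python
import io

import pandas as pd


def add_column(csv_text, idx, colname, values):
    """Read header-less CSV text and insert a new column at position idx."""
    # header=None: the first line is data, not column names (assumption
    # based on the scraper in the question, which writes no header row).
    df = pd.read_csv(io.StringIO(csv_text), header=None, names=['rating'])
    # DataFrame.insert modifies the frame in place and returns None.
    df.insert(idx, colname, values)
    return df


# Hypothetical sample data standing in for GardensbytheBay.csv
sample = "5\n4\n"
df = add_column(sample, 1, 'review', ['Lovely gardens', 'Crowded but fun'])
csv_lines = df.to_csv(index=False).splitlines()
```

Note that `values` must have exactly one entry per existing row (or be a single scalar), otherwise pandas raises an error. That requirement is one reason it may be simpler to write both columns at scrape time rather than add one afterwards.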

User

Answer #2 · Edited 2024-10-03 06:31:27

You can put the review score and the review text on the same row, in separate columns:

writer.writerow([review_text, review2_text])

Your original approach writes each item as its own row, one after another, which is not what you want.
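
To make the difference concrete, here is a small self-contained sketch of the one-row-per-review layout. It writes to an in-memory buffer rather than `GardensbytheBay.csv`. The helper `write_reviews`, the header names, and the sample reviews are assumptions added for illustration and are not part of the original script.

```python
import csv
import io


def write_reviews(fileobj, ratings, texts):
    """Write one row per review: rating in column 1, text in column 2."""
    writer = csv.writer(fileobj)
    # Optional header row (hypothetical column names)
    writer.writerow(['rating', 'review'])
    for rating, text in zip(ratings, texts):
        # A single writerow call with a two-item list yields two columns
        writer.writerow([rating, text])


# Hypothetical sample data in place of the scraped values
buffer = io.StringIO(newline='')
write_reviews(buffer, ['5', '4'], ['Lovely gardens', 'Crowded, but fun'])

# Read the result back to confirm the layout
rows = list(csv.reader(io.StringIO(buffer.getvalue(), newline='')))
```

Each list passed to `writerow` becomes one line of the file, and each item in that list becomes one cell. The `csv` module quotes the second review automatically because it contains a comma.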
