FeedParser: removing special characters and writing to CSV

Posted 2024-09-23 04:21:56



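I'm learning Python. I set myself a small goal: build an RSS scraper. I'm collecting the author, link, and title of each post, and from there I want to write a CSV.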

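I've run into a few problems. I've been searching for an answer since last night but can't seem to find a solution. I suspect I'm missing some knowledge about what feedparser returns and how to move that into a CSV, but I don't yet have the vocabulary to know what to Google.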

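How do I remove special characters such as "[" and "'"?
When creating the new file, how do I write the author, link, and title of each post on a new row?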

1) Special characters

import feedparser

rssurls = 'http://feeds.feedburner.com/TechCrunch/'

techart = feedparser.parse(rssurls)
# feeds = []

# for url in rssurls:
#     feedparser.parse(url)
# for feed in feeds:
#     for post in feed.entries:
#         print(post.title)

# print(feed.entries)

techdeets = [post.author + " , " + post.title + " , " + post.link  for post in techart.entries]
techdeets = [y.strip() for y in techdeets]
techdeets

Output: I get the information I need, but the .strip() call doesn't strip anything.

['Darrell Etherington , Spin launches first city-sanctioned dockless bike sharing in Bay Area , http://feedproxy.google.com/~r/Techcrunch/~3/BF74UZWBinI/', 'Ryan Lawler , With $5.3 million in funding, CarDash wants to change how you get your car serviced , http://feedproxy.google.com/~r/Techcrunch/~3/pkamfdPAhhY/', 'Ron Miller , AlienVault plug-in searches for stolen passwords on Dark Web , http://feedproxy.google.com/~r/Techcrunch/~3/VbmdS0ODoSo/', 'Lucas Matney , Firefox for Windows gets native WebVR support, performance bumps in latest update , http://feedproxy.google.com/~r/Techcrunch/~3/j91jQJm-f2E/',...]
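A note on why `.strip()` appears to do nothing here: with no arguments it only trims leading and trailing whitespace, and the `[`, `]`, and `'` characters in the output are not in the strings at all. They are how Python displays a list of strings. The sketch below uses hypothetical sample data (two entries copied from the output above, one padded with spaces) instead of a live feed.

```python
# Hypothetical sample data standing in for the real feed output.
techdeets = [
    "Darrell Etherington , Spin launches first city-sanctioned dockless bike sharing in Bay Area , http://feedproxy.google.com/~r/Techcrunch/~3/BF74UZWBinI/",
    "  Ron Miller , AlienVault plug-in searches for stolen passwords on Dark Web , http://feedproxy.google.com/~r/Techcrunch/~3/VbmdS0ODoSo/  ",
]

# str.strip() with no arguments only trims leading/trailing whitespace.
stripped = [row.strip() for row in techdeets]

# The repr of the list contains brackets and quotes; the strings do not.
as_list = repr(stripped)

# Joining the strings yields plain text without any list punctuation.
as_text = "\n".join(stripped)
print(as_text)
```

Printing or writing the individual strings, not the list object, is therefore enough to get rid of the "special characters".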

2) Writing to CSV

[code block not preserved in the source]

Output: the result is a dataframe with only 1 row and many columns.
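A single row with many columns is what `csv.writer` produces when the whole list is passed to one `writerow` call. Whether that is what happened here is a guess, since the asker's code is not available. A minimal sketch of the one-row-per-post layout, using the standard `csv` module and hypothetical tuples in place of `techart.entries`:

```python
import csv
import io

# Hypothetical stand-ins for techart.entries: (author, title, link).
posts = [
    ("Ryan Lawler",
     "With $5.3 million in funding, CarDash wants to change how you get your car serviced",
     "http://feedproxy.google.com/~r/Techcrunch/~3/pkamfdPAhhY/"),
    ("Ron Miller",
     "AlienVault plug-in searches for stolen passwords on Dark Web",
     "http://feedproxy.google.com/~r/Techcrunch/~3/VbmdS0ODoSo/"),
]

# An in-memory buffer keeps the sketch self-contained. For a real file, use
# open("techart.csv", "w", newline="", encoding="utf-8") (filename assumed).
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(["author", "title", "link"])  # header row
writer.writerows(posts)                       # one CSV row per post

csv_text = buffer.getvalue()
print(csv_text)
```

Passing the fields as separate items also lets the writer quote titles that contain commas, which joining with " , " by hand does not.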


1 answer

User

#1 · posted 2024-09-23 04:21:56

You're almost there :-)

Use pandas to build a dataframe first and then save it, something like this (continuing from your code):

import pandas as pd

# One row per feed entry, continuing from techart above
df = pd.DataFrame(columns=['author', 'title', 'link'])
for i, post in enumerate(techart.entries):
    df.loc[i] = post.author, post.title, post.link

Then you can save it:

[code block not preserved in the source]
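The save step can be sketched with pandas' `DataFrame.to_csv`. This is an assumption about what the answer intended, not a recovery of the original snippet, and the one-row dataframe and the filename in the final comment are hypothetical.

```python
import pandas as pd

# Hypothetical row standing in for the dataframe built from techart.entries.
df = pd.DataFrame(
    [
        ("Ron Miller",
         "AlienVault plug-in searches for stolen passwords on Dark Web",
         "http://feedproxy.google.com/~r/Techcrunch/~3/VbmdS0ODoSo/"),
    ],
    columns=["author", "title", "link"],
)

# With no path argument, to_csv returns the CSV text as a string;
# index=False leaves out the numeric row index.
csv_text = df.to_csv(index=False)
print(csv_text)

# To write a file instead (the filename is an assumption):
# df.to_csv("techart.csv", index=False, encoding="utf-8")
```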

Or you can build the dataframe directly from the feedparser entries:

>>> import feedparser
>>> import pandas as pd
>>>
>>> rssurls = 'http://feeds.feedburner.com/TechCrunch/'
>>> techart = feedparser.parse(rssurls)
>>>
>>> df = pd.DataFrame()
>>>
>>> df['author'] = [post.author for post in techart.entries]
>>> df['title'] = [post.title for post in techart.entries]
>>> df['link'] = [post.link for post in techart.entries]
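The three list comprehensions above each loop over the entries once. An equivalent single-pass version is sketched below with plain dicts as hypothetical stand-ins for `techart.entries`. Feedparser entries are dict-like, so `.get()` with a default should work on them too. The empty-string default for a missing field is an assumption of this sketch, not something the answer specifies.

```python
import pandas as pd

# Hypothetical entries; the second one deliberately lacks an author.
entries = [
    {"author": "Lucas Matney",
     "title": "Firefox for Windows gets native WebVR support, performance bumps in latest update",
     "link": "http://feedproxy.google.com/~r/Techcrunch/~3/j91jQJm-f2E/"},
    {"title": "Entry without an author",
     "link": "http://example.com/no-author"},
]

columns = ["author", "title", "link"]

# One pass over the entries; a missing field becomes an empty string.
rows = [[post.get(col, "") for col in columns] for post in entries]
df = pd.DataFrame(rows, columns=columns)
print(df)
```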
