Replacing specific words

Posted 2024-09-27 18:02:54


This script grabs headlines from several news sites and counts how many times each word appears in them.

I'm getting words like "to" and "for", plus a few similar ones, that I don't want the script to pick up.

I tried str.translate(None, "to") to remove the word "to", but it deletes "greedily": it takes chunks out of "Washington", when I only want it to remove the word "to".

import pprint
import feedparser
from collections import Counter

def feedGrabber(feed):
    parsed = feedparser.parse(feed)
    feed1 = []
    feed1.append(parsed.entries[0].title)
    feed1.append(parsed.entries[1].title)
    feed1.append(parsed.entries[3].title)
    feed1.append(parsed.entries[4].title)
    feed1.append(parsed.entries[5].title)
    feed1.append(parsed.entries[6].title)
    feed1.append(parsed.entries[7].title)
    feed1.append(parsed.entries[8].title)
    feed1.append(parsed.entries[9].title)
    feed1 = str(feed1)
    feedsplit = feed1
    feedsplit = feedsplit.translate(None, '\'')
    feedsplit = feedsplit.translate(None, 'u')
    feedsplit = feedsplit.translate(None, '[')
    feedsplit = feedsplit.translate(None, ']')
    feedsplit = str.lower(feedsplit)
    feedsplit = str.split(feedsplit)
    return(feedsplit)

reddit = feedGrabber("https://www.reddit.com/r/news/.rss")
cnn = feedGrabber('http://rss.cnn.com/rss/cnn_topstories.rss')
nyt = feedGrabber('http://rss.nytimes.com/services/xml/rss/nyt/HomePage.xml')

one = Counter(reddit)
two = Counter(cnn)
three = Counter(nyt)
pprint.pprint(one + two + three)

1 answer
User
#1 · Posted 2024-09-27 18:02:54

Here is a list of common words; you can remove them from the text with a list comprehension:

def isCommon(word):
    # Common English words to leave out of the counts
    commonWords = ["the", "be", "and", "of", "a", "in", "to", "have", "it",
        "i", "that", "for", "you", "he", "with", "on", "do", "say", "this",
        "they", "is", "an", "at", "but","we", "his", "from", "that", "not",
        "by", "she", "or", "as", "what", "go", "their","can", "who", "get",
        "if", "would", "her", "all", "my", "make", "about", "know", "will",
        "as", "up", "one", "time", "has", "been", "there", "year", "so",
        "think", "when", "which", "them", "some", "me", "people", "take",
        "out", "into", "just", "see", "him", "your", "come", "could", "now",
        "than", "like", "other", "how", "then", "its", "our", "two", "more",
        "these", "want", "way", "look", "first", "also", "new", "because",
        "day", "more", "use", "no", "man", "find", "here", "thing", "give",
        "many", "well"]

    # True only when the whole word matches an entry in the list
    return word in commonWords

# text is your list of words, e.g. the list returned by feedGrabber
text = [x for x in text if not isCommon(x)]
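
For context on why this works where str.translate did not: in Python 2, str.translate(None, "to") deletes every "t" and every "o" character, so it also strips letters out of "Washington". Filtering after split() compares whole words instead. Below is a minimal sketch of how the filter might be combined with Counter, assuming Python 3 and titles already collected in a list. The names count_words and COMMON_WORDS and the sample headlines are made up for illustration and are not part of the original script; the set is only a short subset of the list above.

```python
from collections import Counter

# Hypothetical stop-word set: a short subset of the list above, for illustration.
# A set gives fast membership tests compared with scanning a list.
COMMON_WORDS = {"the", "to", "for", "in", "of", "a", "and", "on"}

def count_words(titles, common_words=COMMON_WORDS):
    """Count words across titles, skipping whole words found in common_words."""
    counts = Counter()
    for title in titles:
        # Split on whitespace first, so the filter compares whole words
        # rather than individual characters.
        words = title.lower().split()
        counts.update(w for w in words if w not in common_words)
    return counts

# Made-up headlines standing in for parsed.entries[i].title
titles = ["Senate to vote in Washington", "Storm heads to Washington"]
counts = count_words(titles)
print(counts["washington"])  # 2
print(counts["to"])          # 0
```

Because the comparison is on whole words, "washington" stays intact even though it contains the letters "to". Punctuation attached to a word (for example "washington,") is not handled here and would need stripping separately.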
