Comparing Reddit post titles is not working

Posted 2024-09-29 21:37:55


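I'm making a Reddit bot using Python's PRAW library. The bot checks whether a post in one subreddit contains specific words in its title and, if so, crossposts it to another subreddit. However, once that happens, the process repeats and creates duplicate crossposts. I tried to counter this with a title comparison so that posts that already exist are filtered out.

I'm trying to compare the title strings of two separate Reddit submissions: if the two titles match, don't post; if they don't match, post.

The code below is inside another for loop that checks the contents of the other subreddit. That code works fine; the for loop below is the one giving me problems.

The variable names (e.g. realtitle and realtitle1) store the original title of the submission from the first for loop.

Apologies for the bad code and variable naming scheme.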

import praw
from PyDictionary import PyDictionary
import enchant
from twisted.internet import task, reactor

timeout = 1800.0;

def FunctionName():
    reddit = praw.Reddit(senstiveredditinfohere)
    subreddit = reddit.subreddit("sub1");
    source = reddit.subreddit("sub2");

    for submission in source.new(limit=50):
        realsubmission = submission; 
        title = submission.title.lower();
        realtitle = submission.title;
        realrealtitle = realtitle + " (by X)"
        title1 = submission.title + " (by X)";

        for submission1 in subreddit.new(limit=200):

            if submission1.title == realrealtitle:
                continue;
            elif submission1.title != realrealtitle:
                if "string1" in title:
                    realsubmission.crosspost(subreddit, title=realtitle + " (by X)");
                    title="";

                    realtitle = "";
                    continue;
                elif "string2" in title:
                    realsubmission.crosspost(subreddit, title=realtitle + " (by X)");
                    title="";

                    realtitle = "";
                    continue;
                elif "string3" in title:
                    realsubmission.crosspost(subreddit, title=realtitle + " (by X)");
                    title="";

                    realtitle = "";
                    continue;
                elif "string4" in title:
                    realsubmission.crosspost(subreddit, title=realtitle + " (by X)");
                    title="";

                    realtitle = "";
                    continue;
                elif "string5" in title:
                    realsubmission.crosspost(subreddit, title=realtitle + " (by X)");
                    title="";

                    realtitle = "";
                    continue;
                else:

                    break;
            else:
                break;


FunctionName()

l = task.LoopingCall(FunctionName)
l.start(timeout)

reactor.run()

2 answers

In your if statement, when the titles do not match you end up breaking out of the for loop. Use a continue statement instead:

if submission1.title == realrealtitle:
        continue; # try the next submission in subreddit.new(limit=200)

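If the titles match, you want to try the next one.

Two more things: the last break is unreachable, and you are storing the lower-case version of submission.title in title, so make sure "string1", "string2", etc. are lower-case as well.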

I think the logic of your two nested loops is fundamentally wrong. At the moment it works like this:

for x in first_iterable:
    for y in second_iterable:
        if x != y:
            do_something(x)

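Try it with a couple of ranges as your iterables, and I think you'll see that it doesn't do what you want. The do_something(x) call happens every time an x, y pair doesn't match, which can be many times for each x.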
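That experiment can be sketched as follows. The two ranges and the calls list are placeholders chosen for illustration; appending to the list stands in for do_something(x):

```python
# Stand-ins for the two listings; the ranges are arbitrary illustration values.
first_iterable = range(3)
second_iterable = range(3)

calls = []  # records every x the nested loop acts on

for x in first_iterable:
    for y in second_iterable:
        if x != y:
            calls.append(x)  # plays the role of do_something(x)

print(calls)  # [0, 0, 1, 1, 2, 2]
```

Every x here also appears in second_iterable, so ideally nothing would be acted on at all, yet each x is handled twice.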

For your use case, you only want it to run once per x, and only if x never matches any y. For that, you probably want code more like this:

for x in first_iterable:
    if x not in second_iterable:
        do_something(x)

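Another way to write the not in test is all(x != y for y in second_iterable), which is more convenient for tests more complicated than simple inequality. This code checks all the values in second_iterable before running do_something(x), rather than running it repeatedly whenever there are many non-matching pairs.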
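As a small check that the two spellings select the same values, here is a sketch using ranges as stand-in iterables (the ranges and the list names are illustrative only):

```python
# Stand-in iterables: 0..4 are candidates, 0..2 already exist.
first_iterable = range(5)
second_iterable = range(3)

# Membership test: keep x only if it matches no y.
via_not_in = [x for x in first_iterable if x not in second_iterable]

# Equivalent formulation with all(), which generalises to other comparisons.
via_all = [x for x in first_iterable if all(x != y for y in second_iterable)]

print(via_not_in)  # [3, 4]
print(via_all)     # [3, 4]
```

Each x is selected at most once, and only when no y matches it.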

Putting that into your current code (and cutting the redundant title-related variables down to one, with a better name):

for submission in source.new(limit=50):
    crosspost_title = submission.title + " (by X)"
    if all(crosspost_title != other.title for other in subreddit.new(limit=200)):
        ...

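Note that you can probably make the code more efficient by moving the keyword check (e.g. "string1") ahead of the check for whether the post has already been crossposted. I'd suggest using any for that test rather than a whole series of if tests with identical bodies: if any(keyword in title for keyword in ["string1", "string2", ...]):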
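The combined logic can be sketched without touching the Reddit API by working on plain title strings. The helper titles_to_crosspost, the KEYWORDS list and the sample titles are all hypothetical names and values made up for this illustration; in the real bot the titles would come from submission.title on the two listings, and the selected posts would be passed to crosspost:

```python
KEYWORDS = ["string1", "string2", "string3"]  # placeholder keywords, kept lower-case
SUFFIX = " (by X)"


def titles_to_crosspost(source_titles, existing_titles, keywords=KEYWORDS):
    """Return the crosspost titles that match a keyword and do not exist yet.

    Hypothetical helper for illustration; it only handles strings.
    """
    existing = set(existing_titles)
    selected = []
    for title in source_titles:
        # Cheap keyword test first, compared against the lower-cased title.
        if not any(keyword in title.lower() for keyword in keywords):
            continue
        crosspost_title = title + SUFFIX
        # Only select the title if no existing post already uses it.
        if crosspost_title not in existing:
            selected.append(crosspost_title)
            existing.add(crosspost_title)  # guards against duplicates within one run
    return selected


source_titles = ["A String1 story", "Unrelated news", "More about string2"]
existing_titles = ["A String1 story (by X)"]
print(titles_to_crosspost(source_titles, existing_titles))
# ['More about string2 (by X)']
```

Collecting the existing titles into a set once means the destination listing is read a single time rather than once per source post, which is an optional design choice on top of what the answer describes.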
