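<p>I am working with the Reddit API and, for reasons unrelated to this case, I want to avoid using a Reddit wrapper here. The code is actually quite simple: it extracts the comments and first-level replies from a specific post in a subreddit.</p>
<p>This is the code of the function:</p>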
<pre><code>import requests
import pandas as pd

def getcommentsforpost(subredditname, postid):
    # here we make the request to reddit, and create a python dictionary
    # from the resulting json code
    reditpath = '/r/' + subredditname + '/comments/' + postid
    redditusual = 'https://www.reddit.com'
    parameters = '.json?'
    totalpath = redditusual + reditpath + parameters
    p = requests.get(totalpath, headers = {'User-agent' : 'Chrome'})
    result = p.json()
    # we are going to be looping a lot through dictionaries, to extract
    # the comments and their replies, thus, a list where we will insert
    # them.
    totallist = []
    # the result object is a list with two dictionaries, one with info
    # on the post, and the second one with all the info regarding the
    # comments and their respective replies, because of this, we first
    # process the post's info located in result[0]
    a = result[0]["data"]["children"][0]["data"]
    abody = a["selftext"]
    aauthor = a["author"]
    ascore = a["score"]
    adictionary = {"commentauthor" : aauthor , "comment" : abody , "Type" : "Post",
                   "commentscore" : ascore}
    totallist.append(adictionary)
    # and now, we start processing the comments, located in result[1]
    for i in result[1]["data"]["children"]:
        ibody = i["data"]["body"]
        iauthor = i["data"]["author"]
        iscore = i["data"]["score"]
        idictionary = {"commentauthor" : iauthor , "comment" : ibody , "Type" : "post_comment",
                       "commentscore" : iscore}
        totallist.append(idictionary)
        # to clarify, until here, the code works perfectly. No problem
        # whatsoever, it's exactly in the following section where the
        # error happens.
        # we create a new object, called replylists,
        # that contains a list of dictionaries in every iteration of
        # the loop.
        replylists = i["data"]["replies"]["data"]["children"]
        # we are going to loop through them, in every comment we extract
        for j in replylists:
            jauthor = j["data"]["author"]
            jbody = j["data"]["body"]
            jscore = j["data"]["score"]
            jdictionary = {"commentauthor" : jauthor , "comment" : jbody , "Type" : "comment_reply" ,
                           "commentscore" : jscore }
            totallist.append(jdictionary)
            # just like we did with the post info and the normal comments,
            # we extract and put it in totallist.
    finaldf = pd.DataFrame(totallist)
    return finaldf

getcommentsforpost("Python","a7zss0")
</code></pre>
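<p>But the code fails when it runs the loop over the replies. It raises the error "string indices must be integers" and points at the variable <code>replylists</code>, yet when I run code like this outside the loop:</p>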
<pre><code>result[1]["data"]["children"][4]["data"]["replies"]["data"]["children"][0]
</code></pre>
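<p>it works fine, and it should do exactly the same thing. I believe it is treating <code>replylists</code> as a string rather than as a list (which is its class).</p>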
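<p>One way to test that suspicion is to look at the type of <code>replies</code> for every top-level comment instead of a single index. The sketch below runs on a hand-written sample that mimics the shape of <code>result[1]</code>. The sample data and the helper name <code>reply_types</code> are hypothetical, and it assumes (this is not confirmed by anything above) that Reddit sends an empty string in <code>replies</code> for a comment that has no replies.</p>

```python
# Hypothetical sample mimicking the shape of result[1] from the comments endpoint.
# Assumption: a comment without replies carries "replies": "" (an empty string).
sample_comments = {
    "data": {
        "children": [
            {"kind": "t1", "data": {
                "author": "alice", "body": "first", "score": 3,
                "replies": {"data": {"children": [
                    {"kind": "t1", "data": {"author": "bob", "body": "re",
                                            "score": 1, "replies": ""}},
                ]}},
            }},
            {"kind": "t1", "data": {
                "author": "carol", "body": "second", "score": 2,
                "replies": "",
            }},
        ]
    }
}


def reply_types(listing):
    """Return the type name of the 'replies' field of each top-level comment."""
    return [type(child["data"].get("replies")).__name__
            for child in listing["data"]["children"]]


print(reply_types(sample_comments))  # ['dict', 'str']
```

<p>If the real response shows a <code>str</code> anywhere in that list, indexing it with <code>["data"]</code> would produce exactly the error quoted above, even though the same expression works for a comment that does have replies.</p>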
<p>Things I have tried:</p>
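<p>I checked the class of <code>replylists</code> with the <code>type()</code> function. It returned "list" for five iterations of the loop, and then the loop failed with the same error.</p>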
<p>I tried looping over the list with <code>for ja in range(0,len(replylists))</code> and then defining the <code>j</code> variable as <code>replylists[ja]</code>. It returned the same error.</p>
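<p>I have been debugging this for two hours. Without that snippet of code the function works fine (of course, it does not return the replies in the final DataFrame, but it works). Why is this happening? <code>replylists</code> is a list of dictionaries, not a string, yet it gives this strange error.</p>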
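<p>A minimal sketch of a guard around the reply loop, assuming that <code>replies</code> can be an empty string for a comment without replies and that a listing can contain placeholder children whose <code>kind</code> is not <code>"t1"</code> and which have no <code>body</code>. Both assumptions should be checked against the actual response. The helper name <code>extract_replies</code> and the sample data are hypothetical.</p>

```python
def extract_replies(comment_data):
    """Collect the first-level replies of one comment as dictionaries.

    Assumption: comment_data["replies"] is either a listing dictionary or
    an empty string when the comment has no replies.
    """
    replies = comment_data.get("replies")
    if not isinstance(replies, dict):
        # Nothing to loop over: return no rows instead of indexing a string.
        return []
    rows = []
    for j in replies["data"]["children"]:
        # Assumption: only children of kind "t1" are real comments with a body.
        if j.get("kind") != "t1":
            continue
        rows.append({"commentauthor": j["data"]["author"],
                     "comment": j["data"]["body"],
                     "Type": "comment_reply",
                     "commentscore": j["data"]["score"]})
    return rows


# Hypothetical sample data: one comment with replies, one without.
with_replies = {
    "author": "alice", "body": "first", "score": 3,
    "replies": {"data": {"children": [
        {"kind": "t1", "data": {"author": "bob", "body": "re",
                                "score": 1, "replies": ""}},
        {"kind": "more", "data": {"count": 3, "children": ["abc"]}},
    ]}},
}
no_replies = {"author": "carol", "body": "second", "score": 2, "replies": ""}

print(extract_replies(with_replies))
print(extract_replies(no_replies))
```

<p>Inside the function from the question, the rows returned for <code>i["data"]</code> could be added with <code>totallist.extend(...)</code> in place of the inner loop.</p>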
<p>Here is the Reddit documentation for the endpoint we are using:
<a href="https://www.reddit.com/dev/api#GET_comments_" rel="nofollow noreferrer">https://www.reddit.com/dev/api#GET_comments_</a>{article}</p>
<p>Libraries to import:
requests,
pandas as pd,
json</p>
<p>I repeat, recommending a wrapper is not a solution. I want to solve this with JSON and REST.</p>
<p>I am working with:
'Python version 3.6.5 | Anaconda version 5.2.0, Jupyter Notebook 5.5.0'</p>
<p>Thanks in advance. I hope you find it interesting. I will keep working on it from here.</p>