Retrieving all Yahoo Answers questions that contain a certain word: rate limit issue

Posted 2024-10-02 12:26:16


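I am trying to compile a database of every Yahoo Answers question that contains a certain word. I am currently doing this with the following script, which uses the Pynswers wrapper class to call the Yahoo API: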

import xlwt
from Answers import Answers

app = Answers()
wbk = xlwt.Workbook()

sheet = wbk.add_sheet('sheet 1')

app.appid = '...'
questions = app.questionSearch({'query': 'tornado'})

# Write all column headings (row 0)
sheet.write(0, 0, 'Question')
sheet.write(0, 1, 'Answer')
sheet.write(0, 2, 'Date')
sheet.write(0, 3, 'Number of Answers')

for i, value in enumerate(questions):
    content = value['Content'].strip()
    chosenAnswer = value['ChosenAnswer'].strip()
    date = value['Date'].strip()
    numAnswers = value['NumAnswers'].strip()

    # Write values into their respective columns: (row, column, value)
    sheet.write(i + 1, 0, content)
    sheet.write(i + 1, 1, chosenAnswer)
    sheet.write(i + 1, 2, date)
    sheet.write(i + 1, 3, numAnswers)

wbk.save('C://test.xls')

The problem is that this query only returns about 10 questions, and I can't find a way to extend the range of questions I get back. Any ideas?


1 Answer

Answer #1 · Posted 2024-10-02 12:26:16

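Pynswers appears to be a very thin wrapper around the Yahoo API itself. The API documentation shows that a request can include 'start' and 'results' fields.

So perhaps you can do something like the following: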

first_50 = app.questionSearch({'query':'tornado', 'start' : 0, 'results' : 50})
next_50 = app.questionSearch({'query':'tornado', 'start' : 50, 'results' : 50})
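
The two calls above generalize to a loop that keeps advancing 'start' until the service stops returning results. This is only a minimal sketch: `fetch_all` is a hypothetical helper, not part of Pynswers, and it assumes that the search call (for example `app.questionSearch`) accepts the 'start' and 'results' keys, returns a list-like page, and signals the end of the data with a short or empty page. The `max_results` cap is also an assumption, added so the loop cannot run unbounded.

```python
def fetch_all(search, query, page_size=50, max_results=1000):
    """Collect results page by page.

    `search` is any callable that takes a parameter dict and returns a
    list of results, e.g. app.questionSearch (assumed, not verified).
    """
    collected = []
    start = 0
    while len(collected) < max_results:
        page = search({'query': query, 'start': start, 'results': page_size})
        if not page:
            break  # empty page: nothing more to fetch
        collected.extend(page)
        if len(page) < page_size:
            break  # short page: assumed to be the last one
        start += page_size
    return collected[:max_results]
```

It would be called as `questions = fetch_all(app.questionSearch, 'tornado')`, after which the existing spreadsheet loop can run over `questions` unchanged.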

Edit

Also, regarding rate limits, Yahoo states the following in regard to their API (this excerpt was captured on March 7, 2013):

How many times can I call YQL in a minute/hour/day?

Rate limits in YQL are based on your authentication. If you use IP-based authentication, then you are limited to 2,000 calls/hour/IP to the public YQL Web service URL (/v1/public/) or 20,000 calls/hour/IP to the private YQL Web service URL (/v1/yql/) that requires OAuth authorization. See the YQL Web Service URLs for the public and private URLs. Applications (identified by an Access Key) are limited to 100,000 calls/day/key*. However, in order to make sure the service is available for everyone we ask that you don't call YQL more than 0.2 times/second or 1,000 times/hour for IP authenticated users and 2.7 times/second or 10,000 times/hour.

*Please don't create multiple keys to 'avoid' rate limits. If you would like us to increase your limit please contact us with details of your project and we'll do our best to accommodate you.

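Clearly, you need to write your code carefully so that you get the information you need without exceeding the rate limit. Getting "all" of the answers may therefore not be practical.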
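
One simple way to stay under a per-hour budget is to enforce a minimum delay between consecutive calls. The sketch below is an illustration only: `Throttle` is a hypothetical helper, and whether the YQL figures quoted above also apply to the Answers API used by Pynswers is an assumption. The 1,000 calls/hour figure in the usage note is the limit Yahoo requests for IP-authenticated users in that excerpt.

```python
import time


class Throttle:
    """Block so that calls are spaced at least 3600 / calls_per_hour seconds apart."""

    def __init__(self, calls_per_hour, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 3600.0 / calls_per_hour
        self._clock = clock    # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None      # time of the previous call, None before the first

    def wait(self):
        """Sleep if the previous call was too recent, then record this call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

With `throttle = Throttle(1000)`, calling `throttle.wait()` immediately before each `app.questionSearch(...)` request spaces the requests about 3.6 seconds apart.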
