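I'm trying to scrape an API with Scrapy. I tried several approaches without success, but I did manage to get the API to return JSON from Anaconda! I just can't figure out how to parse it.
The JSON looks like this: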
[
  {
    "id": "6975526564553428229",
    "secretID": "6975526564553428229",
    "text": "I did a thing 👀 Pre-order link in my bio!! #WillTheBook",
    "createTime": 1624116353,
    "authorMeta": {
      "id": "6727327145951183878",
      "secUid": "MS4wLjABAAAA8ezUaW4ecJX222ObGXxt07F9BIh4QH3-g1P1DHyChT2LLi2cn-vAE2R53-H672ZO",
      "name": "willsmith",
      "nickName": "Will Smith",
      "verified": true,
      "signature": "Same kid from West Philly.",
      "avatar": "https://p16-sign-va.tiktokcdn.com/musically-maliva-obj/1646315618666501~c5_1080x1080.jpeg?x-expires=1624215600&x-signature=JWCnkyJ1Lq7G6K3W32nSB4NKc%2Fk%3D",
      "following": 24,
      "fans": 55900000,
      "heart": 340900000,
      "video": 81,
      "digg": 88
    },
After reading a few tutorials, this is what I have:
# -*- coding: utf-8 -*-
import json

import requests
import scrapy
from scrapy.crawler import CrawlerProcess

from ..items import TiktokscrapyItem


def send_request():
    # Call the ScrapingBee TikTok user-feed endpoint directly with requests.
    response = requests.get(
        url="https://app.scrapingbee.com/api/v1/store/tiktok/user-feed",
        params={
            "api_key": "api key hidden",
            "username": "willsmith",
        },
    )
    print('Response HTTP Status Code: ', response.status_code)
    print('Response HTTP Response Body: ', response.content)


send_request()


class tiktokSpider(scrapy.Spider):
    name = 'tiktok'
    allowed_domains = ['app.scrapingbee.com']
    # Both settings live in one dict; a second custom_settings
    # assignment would overwrite the first.
    custom_settings = {
        'CONCURRENT_REQUESTS_PER_DOMAIN': 10,
        'FEEDS': {'poststoday.csv': {'format': 'csv'}},
    }

    def parse(self, response):
        # response.text replaces the deprecated body_as_unicode().
        authorMeta = json.loads(response.text)
        print(authorMeta)


# main driver
if __name__ == "__main__":
    process = CrawlerProcess()
    process.crawl(tiktokSpider)
    process.start()
I don't know what to do next. I only want to scrape the text, createTime, and nickName, but I can't figure out how! Any suggestions?
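One possible way to pull out those three fields is sketched below, using only the standard library. It assumes the response body is a JSON list of post objects shaped like the sample in the question, with `nickName` nested under `authorMeta`. The function name `extract_posts` and the extra `createdAt` field are hypothetical, introduced here only for illustration.

```python
import json
from datetime import datetime, timezone


def extract_posts(raw_json):
    """Return text, createTime and nickName for each post in the payload."""
    posts = json.loads(raw_json)  # assumed to be a list of post objects
    rows = []
    for post in posts:
        author = post.get("authorMeta") or {}  # nickName is nested here
        created = post.get("createTime")
        rows.append({
            "text": post.get("text"),
            "createTime": created,
            # Hypothetical extra: the Unix timestamp as an ISO 8601 UTC string.
            "createdAt": (
                datetime.fromtimestamp(created, tz=timezone.utc).isoformat()
                if created is not None else None
            ),
            "nickName": author.get("nickName"),
        })
    return rows


# Trimmed-down sample with the same shape as the JSON in the question.
sample = json.dumps([{
    "text": "I did a thing 👀 Pre-order link in my bio!! #WillTheBook",
    "createTime": 1624116353,
    "authorMeta": {"name": "willsmith", "nickName": "Will Smith"},
}])

rows = extract_posts(sample)
for row in rows:
    print(row["nickName"], row["createTime"], row["createdAt"])
```

With `requests`, the same function could be called as `extract_posts(response.text)`. Inside a Scrapy spider, `parse` could do `yield from extract_posts(response.text)` so that the rows reach the configured feed export. Note that Scrapy only calls `parse` for requests the spider actually issues, for example through `start_urls`, which the spider in the question does not define.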