<p><strong>Working solution</strong>
Disclaimer for anyone looking for an answer: this method does not work for any website other than RARBG.</p>
<p>I posted the same question on reddit's r/learnpython, where someone found a great answer that met all my requirements. You can find the original comment <a href="https://www.reddit.com/r/learnpython/comments/fpw6dy/how_would_i_scrape_a_website_which_uses/flnry8y?utm_source=share&utm_medium=web2x/" rel="nofollow noreferrer">here</a>.</p>
<p>He found that rarbg gets its information from <a href="https://torrentapi.org/pubapi_v2.php?mode=search&search_string=QUERY&token=lnjzy73ucv&format=json_extended&app_id=lol" rel="nofollow noreferrer">here</a>.</p>
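<p>You can change what is searched for by replacing "QUERY" in the link. That page contains all the information for every torrent, so I collected it all using requests and bs4.</p>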
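<p>Because the search term goes straight into the URL, a query containing spaces or <code>&</code> could produce a malformed link. A minimal sketch of building the same link with the standard library's <code>urllib.parse.urlencode</code>; the function name <code>build_search_url</code> is hypothetical, and the <code>token</code> and <code>app_id</code> defaults are simply copied from the link above.</p>

```python
from urllib.parse import urlencode

API_URL = "https://torrentapi.org/pubapi_v2.php"


def build_search_url(query, token="lnjzy73ucv", app_id="lol"):
    """Return the search URL with every parameter percent-encoded.

    build_search_url is a hypothetical helper; the parameter names and
    default values are taken from the link quoted in the answer.
    """
    params = {
        "mode": "search",
        "search_string": query,  # this replaces QUERY in the link
        "token": token,
        "format": "json_extended",
        "app_id": app_id,
    }
    return API_URL + "?" + urlencode(params)


print(build_search_url("big buck bunny"))
```

<p>The returned string can be passed to <code>requests.get</code> in place of a hand-concatenated link.</p>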
<p>Here is the working code:</p>
<pre><code>import sys

import requests
from bs4 import BeautifulSoup

query = input("Input a search: ")
rarbg_link = 'https://torrentapi.org/pubapi_v2.php?mode=search&search_string=' + query + '&token=lnjzy73ucv&format=json_extended&app_id=lol'
try:
    request = requests.get(rarbg_link, headers={'User-Agent': 'Mozilla/5.0'})
except requests.RequestException:
    # Without a response there is nothing to parse, so stop here
    print("ERROR")
    sys.exit(1)
source = request.text
# bs4 wraps the raw response text in html/body/p tags; strip that prefix
soup = str(BeautifulSoup(source, 'lxml'))
soup = soup.replace('<html><body><p>{"torrent_results":[', '')
# Crude field split: this breaks if a value itself contains a comma
soup = soup.split(',')
titles = []
magnets = []
for field in soup:
    if field.startswith('{"title":'):
        titles.append(field.replace('{"title":"', '').replace('"', ''))
    elif field.startswith('"download":'):
        magnets.append(field.replace('"download":"', '').replace('"', ''))
</code></pre>
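<p>Since the link asks for <code>format=json_extended</code>, the response body is JSON, so it can probably be parsed with the standard <code>json</code> module instead of string replacement. A rough sketch on a hand-written sample payload: the sample values and the helper name <code>extract_titles_and_magnets</code> are hypothetical, and only the <code>torrent_results</code>, <code>title</code> and <code>download</code> keys are taken from the code above.</p>

```python
import json

# Hypothetical sample body; a real response would carry more fields per torrent.
SAMPLE_BODY = (
    '{"torrent_results":['
    '{"title":"Example.One","download":"magnet:?xt=urn:btih:aaa"},'
    '{"title":"Example, Two","download":"magnet:?xt=urn:btih:bbb"}'
    ']}'
)


def extract_titles_and_magnets(body):
    """Parse a JSON response body and return (titles, magnets) as two lists."""
    data = json.loads(body)
    # Fall back to an empty list if the key is missing (e.g. an error reply)
    results = data.get("torrent_results", [])
    titles = [item["title"] for item in results]
    magnets = [item["download"] for item in results]
    return titles, magnets


titles, magnets = extract_titles_and_magnets(SAMPLE_BODY)
for title, magnet in zip(titles, magnets):
    print(title, magnet)
```

<p>With a live response, <code>request.json()</code> from requests returns the same kind of dictionary that <code>json.loads(request.text)</code> does. Parsing the structure this way also copes with titles that contain commas, which the comma split would cut apart.</p>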