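I am using PyCharm Community Edition 2020.3.2, scholarly 1.0.2, and Tor 1.0.0. I am trying to get the citation counts for 700 papers. Google Scholar has blocked me from using search_pubs (a function of scholarly). However, another scholarly function, search_author, still works fine. At first, search_pubs worked normally. I tried this code: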
from scholarly import scholarly
scholarly.search_pubs('Large Batch Optimization for Deep Learning: Training BERT in 76 minutes')
After a few attempts, it raised the following error:
Traceback (most recent call last):
  File "C:\Users\binhd\anaconda3\envs\t2\lib\site-packages\IPython\core\interactiveshell.py", line 3343, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-9-3bbcfb742cb5>", line 1, in <module>
    scholarly.search_pubs('Large Batch Optimization for Deep Learning: Training BERT in 76 minutes')
  File "C:\Users\binhd\anaconda3\envs\t2\lib\site-packages\scholarly\_scholarly.py", line 121, in search_pubs
    return self.__nav.search_publications(url)
  File "C:\Users\binhd\anaconda3\envs\t2\lib\site-packages\scholarly\_navigator.py", line 256, in search_publications
    return _SearchScholarIterator(self, url)
  File "C:\Users\binhd\anaconda3\envs\t2\lib\site-packages\scholarly\publication_parser.py", line 53, in __init__
    self._load_url(url)
  File "C:\Users\binhd\anaconda3\envs\t2\lib\site-packages\scholarly\publication_parser.py", line 58, in _load_url
    self._soup = self._nav._get_soup(url)
  File "C:\Users\binhd\anaconda3\envs\t2\lib\site-packages\scholarly\_navigator.py", line 200, in _get_soup
    html = self._get_page('https://scholar.google.com{0}'.format(url))
  File "C:\Users\binhd\anaconda3\envs\t2\lib\site-packages\scholarly\_navigator.py", line 152, in _get_page
    raise Exception("Cannot fetch the page from Google Scholar.")
Exception: Cannot fetch the page from Google Scholar.
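Then I found out the cause: I need to pass Google's CAPTCHA before I can keep fetching information from Google Scholar. Many people suggested that I need to use a proxy because my IP has been blocked by Google. I tried changing the proxy with FreeProxies():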
from scholarly import scholarly, ProxyGenerator
pg = ProxyGenerator()
pg.FreeProxies()
scholarly.use_proxy(pg)
scholarly.search_pubs('Large Batch Optimization for Deep Learning: Training BERT in 76 minutes')
It did not work, and PyCharm froze for a long time. Then I installed Tor (pip install Tor) and tried again:
from scholarly import scholarly, ProxyGenerator
pg = ProxyGenerator()
pg.Tor_External(tor_sock_port=9050, tor_control_port=9051, tor_password="scholarly_password")
scholarly.use_proxy(pg)
scholarly.search_pubs('Large Batch Optimization for Deep Learning: Training BERT in 76 minutes')
That did not work. Then I tried SingleProxy():
from scholarly import scholarly, ProxyGenerator
pg = ProxyGenerator()
pg.SingleProxy(https='socks5://127.0.0.1:9050',http='socks5://127.0.0.1:9050')
scholarly.use_proxy(pg)
scholarly.search_pubs('Large Batch Optimization for Deep Learning: Training BERT in 76 minutes')
That did not work either. I have never tried Luminati because I am not familiar with it. If anyone knows a solution, please help.
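For reference, here is a minimal sketch of how the 700 lookups could be spaced out and retried instead of being sent back to back. It is an illustration under stated assumptions, not a confirmed fix: the helper name citation_counts, the delay_s and max_retries parameters, and the 'num_citations' key on each result are assumptions, and nothing here guarantees that Google Scholar will stop showing a CAPTCHA. The search function is passed in as an argument so the helper can be tried without network access.

```python
import time
from typing import Callable, Dict, Iterable, Optional


def citation_counts(
    titles: Iterable[str],
    search: Callable[[str], Iterable[dict]],
    delay_s: float = 10.0,   # hypothetical pause between queries
    max_retries: int = 3,    # hypothetical number of attempts per title
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Optional[int]]:
    """Look up each title, pausing between queries and backing off on errors."""
    counts: Dict[str, Optional[int]] = {}
    for title in titles:
        counts[title] = None  # stays None if every attempt fails or nothing is found
        for attempt in range(max_retries):
            try:
                # Take only the first hit returned for this title.
                first = next(iter(search(title)), None)
            except Exception:
                # e.g. "Cannot fetch the page from Google Scholar."
                sleep(delay_s * 2 ** attempt)  # wait longer after each failure
                continue
            if first is not None:
                # Assumed key for the citation count in a scholarly 1.x result.
                counts[title] = first.get("num_citations")
            break
        sleep(delay_s)  # pause before moving on to the next title
    return counts
```

With scholarly installed, this would be called as citation_counts(titles, scholarly.search_pubs), where titles is the list of 700 paper titles. Titles that still fail after all retries are left as None so they can be retried later.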