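Python package for scraping recipes from all over the internet
Detailed description of the bvodola-recipe-scrapers Python project
A simple web scraping tool for recipe sites.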
pip install bvodola-recipe-scrapers
Then:
Note: scraper.links() returns a list of dictionaries containing all of the <a> tag attributes. The attribute names are the dictionary keys.
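To make the shape of that return value concrete, here is a minimal sketch that builds the same kind of list using only the standard library. It is not the package's implementation: `AnchorCollector`, `collect_links`, and the sample markup are hypothetical names introduced for illustration.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect the attributes of every <a> tag as one dict per tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; dict() keys them by attribute name.
        if tag == "a":
            self.links.append(dict(attrs))


def collect_links(html):
    """Return a list of dictionaries, one per <a> tag found in the HTML string."""
    parser = AnchorCollector()
    parser.feed(html)
    return parser.links


# Made-up sample markup, only to show the resulting structure.
sample = '<p><a href="/recipe/1" class="card">Soup</a> <a href="/about">About</a></p>'
links = collect_links(sample)
for link in links:
    print(link)
```

Each dictionary can then be read by attribute name, for example `link.get("href")`.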
Scrapers are available for:
- https://www.acouplecooks.com
- https://allrecipes.com/
- https://archanaskitchen.com/
- https://averiecooks.com/
- https://bbc.com/
- https://bbc.co.uk/
- https://bbcgoodfood.com/
- https://bettycrocker.com/
- https://bonappetit.com/
- https://bowlofdelicious.com/
- https://budgetbytes.com/
- https://closetcooking.com/
- https://cookieandkate.com/
- https://cookpad.com/
- https://cookstr.com/
- https://copykat.com/
- https://countryliving.com/
- https://cybercook.com.br/
- https://delish.com/
- https://epicurious.com/
- https://fifteenspatulas.com/
- https://finedininglovers.com/
- https://fitmencook.com/
- https://food.com/
- https://foodnetwork.com/
- https://foodrepublic.com/
- https://geniuskitchen.com/
- https://giallozafferano.it/
- https://gimmesomeoven.com/
- https://gonnawantseconds.com/
- https://gousto.co.uk/
- https://greatbritishchefs.com/
- https://halfbakedharvest.com/
- https://heinzbrasil.com.br/
- https://hellofresh.com/
- https://hellofresh.co.uk/
- https://hostthetoast.com/
- https://101cookbooks.com/
- https://receitas.ig.com.br/
- https://inspiralized.com/
- https://jamieoliver.com/
- https://justbento.com/
- https://kennymcgovern.com/
- https://kochbar.de/
- https://lovingitvegan.com/
- https://lecremedelacrumb.com/
- https://marmiton.org/
- https://matprat.no/
- http://mindmegette.hu/
- https://minimalistbaker.com/
- https://misya.info/
- https://momswithcrockpots.com/
- http://motherthyme.com/
- https://mybakingaddiction.com/
- https://myrecipes.com/
- https://healthyeating.nhlbi.nih.gov/
- https://cooking.nytimes.com/
- https://ohsheglows.com/
- https://www.panelinha.com.br/
- https://paninihappy.com/
- https://przepisy.pl/
- https://realsimple.com/
- https://seriouseats.com/
- https://simplyquinoa.com/
- https://simplyrecipes.com/
- https://skinnytaste.com/
- https://southernliving.com/
- https://spendwithpennies.com/
- https://steamykitchen.com/
- https://tastesoflizzyt.com
- https://tasteofhome.com
- https://tastykitchen.com/
- https://thehappyfoodie.co.uk/
- https://thekitchn.com/
- https://thepioneerwoman.com/
- https://thespruceeats.com/
- https://thevintagemixer.com/
- https://thewoksoflife.com/
- https://tine.no/
- https://tudogostoso.com.br/
- https://twopeasandtheirpod.com/
- https://vegolosi.it/
- https://watchwhatueat.com/
- https://whatsgabycooking.com/
- https://en.wikibooks.org/
- https://yummly.com/
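Contribute
Part of the reason I want this open sourced is that when a website makes a design change, its scraper has to be modified.
If you spot a design change (or anything else) that stops the scraper from working for a given website, please open an issue as soon as possible.
If you are a programmer, PRs with fixes are warmly welcomed and acknowledged with a virtual beer.
If you want a scraper for a new site added:
- Open an Issue providing the site name and a link to a recipe on it.
- You are a developer and want to write the scraper yourself:
  - If Schema is available on the site - you can do this
  - Otherwise, scrape the HTML - like this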
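For the HTML route, the general idea can be sketched with the standard library alone. This is an assumption-laden illustration, not the project's scraper base class: `RecipeHTMLParser`, `parse_recipe`, and the class names `recipe-title` and `ingredient` are hypothetical, and a real site will need its own selectors.

```python
from html.parser import HTMLParser


class RecipeHTMLParser(HTMLParser):
    """Pull a title and ingredients out of hypothetical recipe markup."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.ingredients = []
        self._target = None  # which field the current text belongs to

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "h1" and "recipe-title" in classes:
            self._target = "title"
        elif tag == "li" and "ingredient" in classes:
            self._target = "ingredient"
            self.ingredients.append("")

    def handle_data(self, data):
        # Text is appended to whichever field is currently open, if any.
        if self._target == "title":
            self.title += data
        elif self._target == "ingredient":
            self.ingredients[-1] += data

    def handle_endtag(self, tag):
        # Closing the heading or list item ends the current field.
        if tag in ("h1", "li"):
            self._target = None


def parse_recipe(html):
    """Return the title and ingredient list found in the HTML string."""
    parser = RecipeHTMLParser()
    parser.feed(html)
    return {
        "title": parser.title.strip(),
        "ingredients": [item.strip() for item in parser.ingredients],
    }


# Made-up page fragment using the assumed class names.
page = """
<h1 class="recipe-title">Tomato Soup</h1>
<ul>
  <li class="ingredient">4 tomatoes</li>
  <li class="ingredient">1 <b>large</b> onion</li>
</ul>
"""
recipe = parse_recipe(page)
print(recipe)
```

Because this approach depends on the page's markup, it is exactly the kind of scraper that breaks when a site changes its design.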
For Devs / Contribute
Assuming you have python3 installed, navigate to the directory where you want this project to live and run these lines:
git clone git@github.com:hhursev/recipe-scrapers.git && cd recipe-scrapers && python3 -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt && pre-commit install && python -m coverage run -m unittest && python -m coverage report
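FAQ
- How do I know if a website has a Recipe Schema?
  - Go to a recipe on the website you want supported.
  - Press the keyboard shortcut that opens the page source.
  - Search (Ctrl - f) for application/ld+json. It should be inside a script tag.
  - If you find it, it is highly likely that the website supports recipe schemas. Otherwise, you will need to parse the HTML.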
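The manual check above can also be approximated in code. The following is a rough sketch using only the standard library, under the assumption that the recipe data sits at the top level of a JSON-LD block (it does not look inside nested structures such as `@graph`). `JsonLdCollector`, `find_recipe_schema`, and the sample page are hypothetical and not part of the package.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._in_json_ld = False

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self.blocks.append("")

    def handle_data(self, data):
        if self._in_json_ld:
            self.blocks[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_json_ld = False


def find_recipe_schema(html):
    """Return the first top-level JSON-LD object whose @type is Recipe, else None."""
    collector = JsonLdCollector()
    collector.feed(html)
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip blocks that are not valid JSON
        candidates = data if isinstance(data, list) else [data]
        for item in candidates:
            if not isinstance(item, dict):
                continue
            types = item.get("@type")
            types = types if isinstance(types, list) else [types]
            if "Recipe" in types:
                return item
    return None


# Made-up page, only to exercise the function.
page = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Recipe", "name": "Tomato Soup",
 "recipeIngredient": ["4 tomatoes", "1 onion"]}
</script>
</head><body></body></html>
"""
schema = find_recipe_schema(page)
print(schema["name"] if schema else "no Recipe schema found; parse the HTML instead")
```

A `None` result corresponds to the last step of the FAQ: the site probably needs an HTML-based scraper.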
Special thanks to:
All the contributors that helped improve the package. You are awesome!