How do I find script src links? (Beautiful Soup)

Posted 2024-09-30 18:15:58


tags = [{tag.name: tag.text.strip()} for tag in soup.find_all('h2')]

This returns:

[{'h2':'My'},{'h2':'hey'}] # Returns all the h2 elements with their content.

Now I want all the links in <script src=''> tags returned in the same format.

Suppose I have this HTML:

<script src="https://apis.google.com/_/scs/abc-static/_/js/k=gapi.gapi.en.hvE_rrhCzPE.O/m=gapi_iframes,googleapis_client/rt=j/sv=1/d=1/ed=1/rs=AHpOoo-98F2Gk-siNaIBZOtcWfXQWKdTpQ/cb=gapi.loaded_0" nonce="" async=""></script>

The result should be:

#Both Acceptable

[{'script':'https://apis.google.com/_/scs/abc-static/_/js/k=gapi.gapi.en.hvE_rrhCzPE.O/m=gapi_iframes,googleapis_client/rt=j/sv=1/d=1/ed=1/rs=AHpOoo-98F2Gk-siNaIBZOtcWfXQWKdTpQ/cb=gapi.loaded_0'}]

OR

[{'script src':'https://apis.google.com/_/scs/abc-static/_/js/k=gapi.gapi.en.hvE_rrhCzPE.O/m=gapi_iframes,googleapis_client/rt=j/sv=1/d=1/ed=1/rs=AHpOoo-98F2Gk-siNaIBZOtcWfXQWKdTpQ/cb=gapi.loaded_0'}]

1 answer
Anonymous user
Answer #1 · posted 2024-09-30 18:15:58

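You just need to change the tag you are searching for from h2 to script. Then, instead of using tag.text to get the element's text, read the attribute value with the tag['attribute name'] syntax. For example: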

tags = [{tag.name: tag['src']} for tag in soup.find_all('script', src=True)]  # src=True skips <script> tags that have no src attribute
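
For reference, here is a minimal, self-contained sketch of that approach. The HTML sample and the example.com URLs are made up for illustration; they are not from the question. One thing to watch for: tag['src'] raises KeyError on a <script> tag that has no src attribute (an inline script, for instance), so the sketch passes src=True to find_all to keep only tags that carry the attribute. It also shows tag.get('src') as an alternative that returns None instead of raising.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page: two external scripts and one inline script.
html = """
<h2>My</h2>
<script src="https://example.com/a.js" async=""></script>
<script>console.log("inline");</script>
<script src="/static/b.js"></script>
"""

soup = BeautifulSoup(html, "html.parser")

# src=True matches only <script> tags that actually have a src attribute,
# so tag["src"] cannot raise KeyError here.
script_links = [{tag.name: tag["src"]} for tag in soup.find_all("script", src=True)]

# Alternative: visit every <script> tag and use .get(), which returns None
# for tags without the attribute instead of raising.
all_srcs = [tag.get("src") for tag in soup.find_all("script")]

print(script_links)
print(all_srcs)
```

Whether to filter with src=True or to keep the None entries depends on whether you also need to know that inline scripts exist on the page.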
