tags = [{tag.name: tag.text.strip()} for tag in soup.find_all('h2')]
This returns:
[{'h2':'My'},{'h2':'hey'}] # Returns all the h2 elements with their content.
Now I want all the links in <script src=''> returned in the same format.
For example, given the HTML code
<script src="https://apis.google.com/_/scs/abc-static/_/js/k=gapi.gapi.en.hvE_rrhCzPE.O/m=gapi_iframes,googleapis_client/rt=j/sv=1/d=1/ed=1/rs=AHpOoo-98F2Gk-siNaIBZOtcWfXQWKdTpQ/cb=gapi.loaded_0" nonce="" async=""></script>
the result should be
#Both Acceptable
[{'script':'https://apis.google.com/_/scs/abc-static/_/js/k=gapi.gapi.en.hvE_rrhCzPE.O/m=gapi_iframes,googleapis_client/rt=j/sv=1/d=1/ed=1/rs=AHpOoo-98F2Gk-siNaIBZOtcWfXQWKdTpQ/cb=gapi.loaded_0'}]
OR
[{'script src':'https://apis.google.com/_/scs/abc-static/_/js/k=gapi.gapi.en.hvE_rrhCzPE.O/m=gapi_iframes,googleapis_client/rt=j/sv=1/d=1/ed=1/rs=AHpOoo-98F2Gk-siNaIBZOtcWfXQWKdTpQ/cb=gapi.loaded_0'}]
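You only need to change the tag you are searching for from h2 to script. Then, instead of using tag.text to get the element's text, read the attribute value with the tag['attribute name'] syntax; here that is tag['src'].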
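The change described above can be sketched as follows. This is a minimal illustration rather than the answerer's original snippet: the sample HTML and the example.com URL are made up for the demonstration, and the src=True filter is an extra assumption added so that inline scripts without a src attribute are skipped instead of raising KeyError.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document; the URL is a placeholder, not from the question.
html = """
<h2>My</h2>
<script src="https://example.com/static/app.js" async=""></script>
<script>console.log("inline script, no src");</script>
"""

soup = BeautifulSoup(html, "html.parser")

# src=True keeps only <script> tags that actually carry a src attribute,
# so tag["src"] is safe to read for every tag in the result.
scripts = [{tag.name: tag["src"]} for tag in soup.find_all("script", src=True)]

print(scripts)
# → [{'script': 'https://example.com/static/app.js'}]
```

If the 'script src' key from the question is preferred, the dictionary key can be built as f"{tag.name} src" instead of tag.name.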