Python strip

Posted 2024-10-05 11:35:00


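I am trying to use strip() to remove the end of a piece of HTML. The idea is to eventually build this into a loop, but for now I am just trying to figure out how to get it working: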

httpKey=("<a href=\"http://www.uaf.edu/academics/degreeprograms/index.html>Degree Programs</a>")
httpKeyEnd=">"

#the number in the httpKey that the httpKey end is at
stripNumber=(httpKey.find(httpKeyEnd))
#This is where I am trying to strip the rest of the information that I do not need. 
httpKey.strip(httpKey.find(httpKeyEnd))
print (httpKey)

The end result should be httpKey printed to the screen as just:

a href="http://www.uaf.edu/academics/degreeprograms/index.html


2 Answers

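find returns the index (a number) at which the substring was found, and strip removes characters from the ends of a string; it does not remove "everything from that point onward". strip also expects a string of characters rather than a number, and it returns a new string instead of modifying httpKey in place.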
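To make the difference concrete, here is a minimal sketch of what find and strip do with the string from the question. The variable names (http_key, end, trimmed) are only illustrative and are not from the original post.

```python
http_key = '<a href="http://www.uaf.edu/academics/degreeprograms/index.html>Degree Programs</a>'

# find() returns an integer position, or -1 if the substring is absent.
end = http_key.find(">")

# strip() expects a string of characters to trim, so passing the integer
# returned by find() raises TypeError.
try:
    http_key.strip(end)
    error = None
except TypeError as exc:
    error = str(exc)

# With a string argument, strip() only trims leading and trailing characters
# from that set and returns a new string; http_key itself is unchanged.
trimmed = http_key.strip("<>")
print(trimmed)
```

Note that the inner ">" after index.html is still present in trimmed, because strip never touches characters in the middle of a string.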

You want to use string slicing instead:

>>> s = 'hello there: world!'
>>> s.index(':')
11
>>> s[s.index(':')+1:]
' world!'

If you just want to know what the link is, use a library like BeautifulSoup:

>>> from bs4 import BeautifulSoup as bs
>>> doc = bs('<a href="http://www.uaf.edu/academics/degreeprograms/index.html">Degree Programs</a>', 'html.parser')
>>> for link in doc.find_all('a'):
...     print(link.get('href'))
...
http://www.uaf.edu/academics/degreeprograms/index.html

For your case, this will work:

>>> httpKey=("<a href=\"http://www.uaf.edu/academics/degreeprograms/index.html>Degree Programs</a>")
>>> httpKey[1:httpKey.index('>')]
'a href="http://www.uaf.edu/academics/degreeprograms/index.html'
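Since the question mentions eventually running this in a loop, the slice above could be wrapped in a small helper. This is only a sketch under assumptions: the function name text_before, its parameters, and the second entry in links are hypothetical and not part of the original post.

```python
def text_before(s, marker=">", start=1):
    """Return s[start:i], where i is the first position of marker.

    If marker is absent, return s[start:] rather than slicing with -1.
    """
    end = s.find(marker)
    if end == -1:
        return s[start:]
    return s[start:end]


links = [
    '<a href="http://www.uaf.edu/academics/degreeprograms/index.html>Degree Programs</a>',
    '<a href="http://www.uaf.edu/>Home</a>',  # hypothetical second entry
]

for link in links:
    print(text_before(link))
```

The explicit check for -1 matters because find does not raise when the marker is missing; slicing with s[1:-1] would silently drop the last character. Using index instead, as in the answer above, raises ValueError in that situation.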
