<p>You can use BeautifulSoup to get all the links:</p>
<pre><code>from bs4 import BeautifulSoup
import requests
import pandas as pd
url = 'https://www.nhc.noaa.gov/gis/'
res = requests.get(url)
soup = BeautifulSoup(res.text, "html.parser")
table = soup.find("table")
for anchor in table.find_all("a"):
    print("Text - {}, Link - {}".format(anchor.get_text(strip=True), anchor["href"]))
</code></pre>
<p>Output:</p>
<pre><code>Text - Irma Example, Link - /gis/examples/al112017_5day_020.zip
Text - Cone, Link - /gis/examples/AL112017_020adv_CONE.kmz
Text - Track, Link - /gis/examples/AL112017_020adv_TRACK.kmz
Text - Warnings, Link - /gis/examples/AL112017_020adv_WW.kmz
Text - shp, Link - forecast/archive/al092020_5day_latest.zip
Text - Cone, Link - /storm_graphics/api/AL092020_CONE_latest.kmz
Text - Track, Link - /storm_graphics/api/AL092020_TRACK_latest.kmz
Text - Warnings, Link - /storm_graphics/api/AL092020_WW_latest.kmz
</code></pre>
<p>If you also want a DataFrame, don't make a second network call through <code>read_html</code>; reuse the response object you already have. Note that <code>read_html</code> returns a <em>list</em> of DataFrames, one per table on the page:</p>
<pre><code>tables = pd.read_html(res.text)  # list of DataFrames, one per &lt;table&gt;
df = tables[0]
</code></pre>
<pre><code>
</code></pre>
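<p>A minimal self-contained sketch of that behavior (the HTML snippet here is made up for illustration):</p>
<pre><code>from io import StringIO

import pandas as pd

# read_html parses every <table> in the markup and returns a list of DataFrames
html = """<table>
  <tr><th>Name</th><th>Link</th></tr>
  <tr><td>Cone</td><td>AL112017_020adv_CONE.kmz</td></tr>
</table>"""

tables = pd.read_html(StringIO(html))
df = tables[0]  # index into the list to get the table you want
print(df.shape)  # one data row, two columns: (1, 2)
</code></pre>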
<p>To get full links, prepend the site root to each href:</p>
<pre><code>https://www.nhc.noaa.gov
</code></pre>
<p>Code:</p>
<pre><code>prefix = 'https://www.nhc.noaa.gov'
for anchor in table.find_all("a"):
    print("Text - {}, Link - {}".format(anchor.get_text(strip=True), prefix + anchor["href"]))
</code></pre>
<p>Output:</p>
<pre><code>Text - Irma Example, Link - https://www.nhc.noaa.gov/gis/examples/al112017_5day_020.zip
Text - Cone, Link - https://www.nhc.noaa.gov/gis/examples/AL112017_020adv_CONE.kmz
Text - Track, Link - https://www.nhc.noaa.gov/gis/examples/AL112017_020adv_TRACK.kmz
Text - Warnings, Link - https://www.nhc.noaa.gov/gis/examples/AL112017_020adv_WW.kmz
Text - shp, Link - https://www.nhc.noaa.govforecast/archive/al092020_5day_latest.zip
Text - Cone, Link - https://www.nhc.noaa.gov/storm_graphics/api/AL092020_CONE_latest.kmz
Text - Track, Link - https://www.nhc.noaa.gov/storm_graphics/api/AL092020_TRACK_latest.kmz
Text - Warnings, Link - https://www.nhc.noaa.gov/storm_graphics/api/AL092020_WW_latest.kmz
</code></pre>
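<p>Notice the <code>shp</code> row in the output above: its href (<code>forecast/archive/...</code>) has no leading slash, so plain concatenation produces the broken <code>...govforecast/...</code> URL. <code>urllib.parse.urljoin</code> resolves both absolute-path and relative hrefs correctly against the page URL; a sketch (not part of the original answer):</p>
<pre><code>from urllib.parse import urljoin

base = 'https://www.nhc.noaa.gov/gis/'  # the page the links were scraped from

# Absolute-path href: resolved against the site root
print(urljoin(base, '/gis/examples/AL112017_020adv_CONE.kmz'))
# https://www.nhc.noaa.gov/gis/examples/AL112017_020adv_CONE.kmz

# Relative href without a leading slash: resolved against the page's directory
print(urljoin(base, 'forecast/archive/al092020_5day_latest.zip'))
# https://www.nhc.noaa.gov/gis/forecast/archive/al092020_5day_latest.zip
</code></pre>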
<p>To download a file, use <code>requests</code> again and write the response body to disk.</p>
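<p>A minimal sketch of such a download helper (the function name and the choice of deriving the local filename from the URL's basename are my own; adjust as needed):</p>
<pre><code>import os

import requests

def download(url, dest_dir="."):
    """Download url and save it under its basename; return the local path."""
    local_path = os.path.join(dest_dir, os.path.basename(url))
    res = requests.get(url)
    res.raise_for_status()           # fail loudly on 404s etc.
    with open(local_path, "wb") as f:
        f.write(res.content)         # binary write: these are zip/kmz files
    return local_path

# Combined with the loop above, e.g.:
# for anchor in table.find_all("a"):
#     download(prefix + anchor["href"])
</code></pre>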