<p>You would load the HTML like this:</p>
<pre><code>import requests
url = "https://index.minfin.com.ua/ua/economy/index/svg.php?indType=1&fromYear=2010&acc=1"
resp = requests.get(url)
data = resp.text
</code></pre>
<p>Then you would create a BeautifulSoup object from that response text:</p>
<pre><code>from bs4 import BeautifulSoup
soup = BeautifulSoup(data, features="html.parser")
</code></pre>
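<p>After that, how you parse out what you want is usually fairly subjective, and candidate solutions can vary a lot. Here is how I did it:</p>
<p>Using BeautifulSoup, I found all the <code>rect</code> elements and checked whether each one has an <code>onmouseover</code> attribute:</p>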
<pre><code>rects = soup.svg.find_all("rect")
yx_points = []
for rect in rects:
    # Only rects with a tooltip handler carry a data point
    if rect.has_attr("onmouseover"):
        text = rect["onmouseover"]
        # The point is the text between the first pair of single quotes
        x_start_index = text.index("'") + 1
        y_finish_index = text[x_start_index:].index("'") + x_start_index
        yx = text[x_start_index:y_finish_index].split()
        print(text[x_start_index:y_finish_index])
        yx_points.append(yx)
</code></pre>
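<p>The quote-index arithmetic above can also be written as a regular expression, which some people find easier to read. This is only a sketch: the helper name <code>extract_point</code> and the sample attribute string are hypothetical, and it assumes (as the loop above does) that the first single-quoted string in the attribute holds the date and the value separated by whitespace.</p>

```python
import re

# Matches the first single-quoted substring in the attribute text.
_QUOTED = re.compile(r"'([^']*)'")

def extract_point(onmouseover):
    """Return [date, value] from the first quoted string, or None if absent."""
    match = _QUOTED.search(onmouseover)
    if match is None:
        return None
    return match.group(1).split()

# Hypothetical attribute value, made up for illustration only.
sample = "showHint(evt, '02.2015 155,1')"
print(extract_point(sample))
```

<p>One difference from the index-based version: when the attribute contains no quoted string, this helper returns <code>None</code>, whereas <code>text.index("'")</code> raises a <code>ValueError</code>.</p>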
<p>As you can see in the image below, I scraped the <code>onmouseover=</code> attribute and pulled out the parts that look like <code>02.2015 155,1</code>.</p>
<p>Here is what <code>yx_points</code> looks like now:</p>
<p><code>[['12.2009', '100,0'], ['01.2010', '101,8'], ['02.2010', '103,7'], ...]</code></p>
<p><a href="https://i.stack.imgur.com/gYax1.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/gYax1.png" alt="enter image description here"/></a></p>
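<p>The entries in <code>yx_points</code> are still strings, and the values use a decimal comma. If you need numbers (for plotting, say), a conversion along these lines may help. It is a minimal sketch, not part of the scraping itself: <code>to_typed</code> is a hypothetical helper, and it assumes every entry follows the <code>MM.YYYY</code> and decimal-comma format seen in the sample output above.</p>

```python
from datetime import date

def to_typed(yx_points):
    """Convert [['MM.YYYY', '123,4'], ...] into [(date, float), ...]."""
    typed = []
    for month_year, value in yx_points:
        month, year = month_year.split(".")
        # Use the first day of the month, since the source has no day part.
        day = date(int(year), int(month), 1)
        # Swap the decimal comma for a dot so float() can parse the value.
        typed.append((day, float(value.replace(",", "."))))
    return typed

points = to_typed([["12.2009", "100,0"], ["01.2010", "101,8"]])
print(points)
```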