擅长:python、mysql、java
<p>下面我写的代码适用于网页中的嵌入式javascript。在</p>
<pre><code>from lxml import html
from json import dump
import re
dumped_data = []
class theAddress:
latude = ""
URL = "http://hdfc.com/branch-locator"
var_lat = re.compile('(?<="latitude":").+?(?=")')
main_page = html.parse(URL).getroot()
residue = main_page.xpath("//script[@type='text/javascript']/text()")[1]
all_latude = re.findall(var_lat,residue)
for i in range(len(all_latude)):
obj = theAddress()
obj.latude = all_latude[i]
dumped_data.append(obj.__dict__)
f = open('hdfc_add.json','w')
dump(dumped_data, f, indent = 1)
</code></pre>
<p>它还使用json模块以适当的格式存储刮取的数据。在</p>