<p>Here you go!</p>
<p>This code finds every <em>link</em> whose text contains the string "GIS". I had to add <code>&in_iframe=1</code> to the first URL to get it to work.</p>
<pre><code># Python 2 (urllib2 and the print statement do not exist in Python 3)
import urllib2
from bs4 import BeautifulSoup

urls = ['https://jobs-challp.icims.com/jobs/search?ss=1&searchKeyword=gis&searchCategory=&searchLocation=&latitude=&longitude=&searchZip=&searchRadius=20&in_iframe=1',
        'https://www.smartrecruiters.com/SpectraForce1/']

for url in urls:
    # Download the page and parse it
    soup = BeautifulSoup(urllib2.urlopen(url))
    print 'Scraping {}'.format(url)
    for link in soup.find_all('a'):
        # Case-sensitive match against the link's visible text
        if 'GIS' in link.text:
            print ' > TEXT: ' + link.text.strip()
            # .get() avoids a KeyError for anchors without an href
            print ' > URL: ' + link.get('href', '')
            print ''
</code></pre>
<p>Output:</p>
^{pr2}$
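<p>If you are on Python 3, the same idea can be sketched with only the standard library. This is a rough sketch rather than a drop-in replacement for the BeautifulSoup version: <code>LinkCollector</code>, <code>find_links</code> and <code>add_query_param</code> are hypothetical helper names made up for this example, and the filter works on an HTML string so it can be tried without hitting the network.</p>

```python
from html.parser import HTMLParser
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


class LinkCollector(HTMLParser):
    """Collect (text, href) pairs for every <a> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []        # finished (text, href) pairs
        self._in_link = False  # True while inside an open <a>
        self._href = ""        # href of the <a> currently open
        self._text = []        # text fragments seen inside the open <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_link = True
            # Anchors without an href get an empty string instead of None
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._in_link:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            self.links.append(("".join(self._text).strip(), self._href))
            self._in_link = False


def find_links(html, keyword):
    """Return (text, href) pairs whose link text contains keyword (case-sensitive)."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return [(text, href) for text, href in parser.links if keyword in text]


def add_query_param(url, key, value):
    """Return url with key=value appended to its existing query string."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

<p>To use it on a live page, you would fetch the HTML yourself (for example with <code>urllib.request.urlopen(url).read()</code>, decoded to text) and pass it to <code>find_links(html, 'GIS')</code>. <code>add_query_param(url, 'in_iframe', '1')</code> is one way to append the extra parameter mentioned above without editing the URL by hand.</p>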