<p>If you dig one step further, you will find the actual data here: <a href="https://euraxess.ec.europa.eu/sites/default/files/exports/msca.xml" rel="nofollow noreferrer">https://euraxess.ec.europa.eu/sites/default/files/exports/msca.xml</a>.
Below is an example that uses SimplifiedDoc.</p>
<pre><code>from simplified_scrapy.request import req
from simplified_scrapy.simplified_doc import SimplifiedDoc

# Download the XML export and wrap it for querying
html = req.get('https://euraxess.ec.europa.eu/sites/default/files/exports/msca.xml')
doc = SimplifiedDoc(html)

# Each job posting is a job-opportunity element
jobs = doc.selects('job-opportunity')
for job in jobs:
    # Print the ID and title of every job
    print(job.select('job-id>text()'), job.select('job-title>text()'))
</code></pre>
<p>Result:</p>
<pre><code>367020 Early-Stage Researcher (ESR) 3-year PhD position - "Efficient intra-cavity and extra-cavity generation of beams with radial and azimuthal polarization in high-power thin-disk lasers" - Project: GREAT
377512 8 Short-term Early Stage Researcher positions available through the EvoCELL ITN (single cell genomics, evo-devo and science outreach)
383978 ESR (early stage researcher) for intelligent quality control cycles in Industry 4.0 process chains enabled by machine learning
......
</code></pre>
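<p>If you would rather not install an extra package, the same extraction can be sketched with only the Python standard library. This is a minimal sketch, not a tested scraper for the live feed: the element names <code>job-opportunity</code>, <code>job-id</code> and <code>job-title</code> come from the example above, while the helper names <code>fetch_feed</code> and <code>extract_jobs</code>, the <code>job-opportunities</code> root element and the sample data are made up for illustration. It also assumes the feed does not use XML namespaces.</p>

```python
import urllib.request
import xml.etree.ElementTree as ET

FEED_URL = "https://euraxess.ec.europa.eu/sites/default/files/exports/msca.xml"


def fetch_feed(url=FEED_URL, timeout=30):
    """Download the XML export and return the raw bytes (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def extract_jobs(xml_data):
    """Return a list of (job_id, job_title) tuples (hypothetical helper).

    Assumes every job-opportunity element contains a job-id and a
    job-title element somewhere below it, and that no namespaces are used.
    """
    root = ET.fromstring(xml_data)
    jobs = []
    # iter() walks the whole tree, so the nesting depth does not matter
    for job in root.iter("job-opportunity"):
        job_id = (job.findtext(".//job-id") or "").strip()
        title = (job.findtext(".//job-title") or "").strip()
        jobs.append((job_id, title))
    return jobs


# Made-up sample that only mimics the assumed structure of the feed
SAMPLE = """<job-opportunities>
  <job-opportunity>
    <job-id>1</job-id>
    <job-title>Example position A</job-title>
  </job-opportunity>
  <job-opportunity>
    <job-id>2</job-id>
    <job-title>Example position B</job-title>
  </job-opportunity>
</job-opportunities>"""

for job_id, title in extract_jobs(SAMPLE):
    print(job_id, title)
```

<p>To run it against the real export, pass the downloaded bytes instead of the sample: <code>extract_jobs(fetch_feed())</code>. If the feed turns out to declare a namespace, the tag names would need the matching <code>{namespace}</code> prefix.</p>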