<p>Try bs4 (Beautiful Soup):</p>
<pre><code>from bs4 import BeautifulSoup
page = '''
<camera>
<maker>Fujifilm</maker>
<model>GFX 50S</model>
<mount>Fujifilm G</mount>
<cropfactor>0.79</cropfactor>
</camera>
'''
soup = BeautifulSoup(page, 'lxml')
make = soup.find('maker')
model = soup.find('model')
print(f'Make: {make.text}\nModel: {model.text}')
</code></pre>
<p>For multiple entries, just use find_all() and loop over the results:</p>
<pre><code>from bs4 import BeautifulSoup
page = '''
<camera>
<maker>Fujifilm</maker>
<model>GFX 50S</model>
<mount>Fujifilm G</mount>
<cropfactor>0.79</cropfactor>
</camera>
<camera>
<maker>thing1</maker>
<model>thing2</model>
<mount>Fujifilm G</mount>
<cropfactor>0.79</cropfactor>
</camera>
<camera>
<maker>thing3</maker>
<model>thing4</model>
<mount>Fujifilm G</mount>
<cropfactor>0.79</cropfactor>
</camera>
<camera>
<maker>thing5</maker>
<model>thing6</model>
<mount>Fujifilm G</mount>
<cropfactor>0.79</cropfactor>
</camera>
'''
soup = BeautifulSoup(page, 'lxml')
make = soup.find_all('maker')
model = soup.find_all('model')
for x, y in zip(make, model):
    print(f'Make: {x.text}\nModel: {y.text}')
</code></pre>
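<p>Pairing two separate find_all() lists with zip() assumes every camera has both tags. As a hedged alternative, the sketch below loops over each camera element and looks up its children, so a missing tag cannot shift the pairing. The helper name tag_text, the 'unknown' default, and the trimmed sample data are illustrative assumptions, and it uses the built-in 'html.parser' rather than 'lxml' only so that it runs without an extra install.</p>

```python
from bs4 import BeautifulSoup

page = '''
<camera>
<maker>Fujifilm</maker>
<model>GFX 50S</model>
</camera>
<camera>
<maker>thing1</maker>
</camera>
'''

# 'html.parser' ships with Python; swap in 'lxml' if it is installed.
soup = BeautifulSoup(page, 'html.parser')


def tag_text(parent, name, default='unknown'):
    """Return the text of the first <name> tag under parent, or a default.

    Hypothetical helper, not part of bs4.
    """
    found = parent.find(name)
    return found.text if found is not None else default


# One (make, model) pair per <camera>, even when a child tag is missing.
cameras = [
    (tag_text(camera, 'maker'), tag_text(camera, 'model'))
    for camera in soup.find_all('camera')
]

for make, model in cameras:
    print(f'Make: {make}\nModel: {model}')
```

<p>Here the second camera has no model tag, so it is reported with the default value instead of borrowing a model from a later entry.</p>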
<p>Getting the data from a file:</p>
<pre><code>from bs4 import BeautifulSoup
with open('path/to/your/file') as file:
    page = file.read()
soup = BeautifulSoup(page, 'lxml')
make = soup.find_all('maker')
model = soup.find_all('model')
for x, y in zip(make, model):
    print(f'Make: {x.text}\nModel: {y.text}')
</code></pre>
<p>Without importing any modules:</p>
<pre><code>with open('/PATH/TO/YOUR/FILE') as file:
    for line in file:
        for each in line.split():
            if "maker" in each:
                each = each.replace("<maker>", "")
                print(each.replace("</maker>", ""))
</code></pre>
<p>This only handles the 'maker' tag; it may be worth splitting the tags out into separate definitions and iterating over them.</p>
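<p>One possible way to generalize the no-import approach to several tags is sketched below. The function name extract_tags and the inline sample data are assumptions made for illustration, and the sketch assumes each element sits on a single line, as in the sample above. It matches on the whole line rather than splitting on whitespace, so values that contain spaces, such as 'GFX 50S', stay intact.</p>

```python
def extract_tags(lines, tags):
    """Collect the text between <tag> and </tag> for each requested tag.

    Hypothetical helper; assumes one complete element per line.
    """
    found = {tag: [] for tag in tags}
    for line in lines:
        line = line.strip()
        for tag in tags:
            open_tag, close_tag = f'<{tag}>', f'</{tag}>'
            if line.startswith(open_tag) and line.endswith(close_tag):
                # Keep only the text between the opening and closing tags.
                found[tag].append(line[len(open_tag):-len(close_tag)])
    return found


# Inline sample standing in for the lines of a file.
sample = '''
<camera>
<maker>Fujifilm</maker>
<model>GFX 50S</model>
<mount>Fujifilm G</mount>
</camera>
'''.splitlines()

result = extract_tags(sample, ['maker', 'model', 'mount'])
for tag, values in result.items():
    print(tag, values)
```

<p>To read from a file instead, pass the open file object as the first argument, since iterating over it yields lines in the same way.</p>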