Writing from one file to another in Python

<div class='entry qual-5 used-demoman slot-head bestprice custom' data-price='3280000' data-name="Kill-a-Watt Allbrero" data-quality="5" data-australium="normal" data-class="demoman" data-particle_effect="56" data-paint="" data-slot="cosmetic" data-consignment="consignment">

2 answers

User

#1 · edited 2024-09-30 19:27:21

Let's try to find the value using ordinary string methods:

>>> line = '''<div class='entry qual-5 used-demoman slot-head bestprice custom' data-price='3280000' data-name="Kill-a-Watt Allbrero" data-quality="5" data-australium="normal" data-class="demoman" data-particle_effect="56" data-paint="" data-slot="cosmetic" data-consignment="consignment">'''

We can use str.index to find the position of a substring within a string:

>>> line.index('data-name')
87

Now we know that the attribute we are interested in starts at index 87:

>>> line[87:]
'data-name="Kill-a-Watt Allbrero" data-quality="5" data-australium="normal" data-class="demoman" data-particle_effect="56" data-paint="" data-slot="cosmetic" data-consignment="consignment">'

Next, we also need to skip past the data-name=" part:

>>> start = line.index('data-name') + len('data-name="')
>>> start
98
>>> line[start:]
'Kill-a-Watt Allbrero" data-quality="5" data-australium="normal" data-class="demoman" data-particle_effect="56" data-paint="" data-slot="cosmetic" data-consignment="consignment">'

Now we only need to find the index of the closing quote, and then we can extract the attribute value:

>>> end = line.index('"', start)
>>> end
118
>>> line[start:end]
'Kill-a-Watt Allbrero'

And that gives us our solution:

start = line.index('data-name') + len('data-name="')
end = line.index('"', start)
print(line[start:end])
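
The same three steps can be wrapped in a small helper so they work for other attributes too. This is only a sketch: get_attr is a hypothetical name, not something from the answer above, and it assumes each tag sits on a single line. Because data-price in the sample is wrapped in single quotes while data-name uses double quotes, the helper reads whichever quote character follows the equals sign instead of hard-coding one.

```python
def get_attr(line, name):
    """Return the value of attribute `name` from a one-line tag.

    Hypothetical helper for illustration. Like str.index, it raises
    ValueError when the attribute (or its closing quote) is missing.
    """
    key = name + '='
    pos = line.index(key) + len(key)  # position of the opening quote
    quote = line[pos]                 # either " or '
    start = pos + 1
    end = line.index(quote, start)    # matching closing quote
    return line[start:end]


# Shortened version of the sample line from the question
line = '''<div class='entry qual-5' data-price='3280000' data-name="Kill-a-Watt Allbrero" data-paint="">'''

print(get_attr(line, 'data-name'))   # Kill-a-Watt Allbrero
print(get_attr(line, 'data-price'))  # 3280000
```

Searching for the name together with the equals sign also avoids stopping on a longer attribute that merely begins with the same text.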

We can put it in a loop over the input file:

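# Read itemlist.txt line by line and append each data-name value to output.txt
with open('itemlist.txt', 'r') as mfile, open('output.txt', 'a') as mfile2:
    for line in mfile:
        # str.index raises ValueError if a line has no data-name attribute
        start = line.index('data-name') + len('data-name="')
        end = line.index('"', start)
        mfile2.write(line[start:end])
        mfile2.write('\n')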
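
One thing to watch for: str.index raises ValueError as soon as it meets a line with no data-name attribute, which is likely if the input file also contains other markup. A possible variant, assuming such lines should simply be skipped, uses str.find, which returns -1 instead of raising. The function name write_names and the in-memory files below are made up for the example; real code would pass the two open file objects.

```python
import io


def write_names(src, dst, marker='data-name="'):
    """Write each data-name value from src to dst, one per line.

    Lines without the marker, or without a closing quote, are skipped.
    Returns the number of values written.
    """
    count = 0
    for line in src:
        pos = line.find(marker)       # -1 instead of ValueError when absent
        if pos == -1:
            continue
        start = pos + len(marker)
        end = line.find('"', start)
        if end == -1:                 # unterminated value, skip the line
            continue
        dst.write(line[start:end] + '\n')
        count += 1
    return count


# In-memory stand-ins for itemlist.txt and output.txt
src = io.StringIO(
    '<html>\n'
    '<div data-price=\'3280000\' data-name="Kill-a-Watt Allbrero" data-quality="5">\n'
    '</html>\n'
)
dst = io.StringIO()
written = write_names(src, dst)
print(written, repr(dst.getvalue()))
```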

User

#2 · edited 2024-09-30 19:27:21

You can also use BeautifulSoup:

a.html:

<html>
    <head>
        <title> Asdf </title>
    </head>
    <body>

        <div class='entry qual-5 used-demoman slot-head bestprice custom' data-price='3280000' data-name="Kill-a-Watt Allbrero" data-quality="5" data-australium="normal" data-class="demoman" data-particle_effect="56" data-paint="" data-slot="cosmetic" data-consignment="consignment">

    </body>
</html>

a.py:

from bs4 import BeautifulSoup

# Parse the whole file with Python's built-in HTML parser backend
with open('a.html') as f:
    soup = BeautifulSoup(f.read(), 'html.parser')

# Take the first <div> and read its data-price attribute
result = soup.find_all('div')[0]['data-price']
print(result)
# prints 3280000

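My view is that if your task is as simple as your example, there is really no need for BeautifulSoup. But if it is more complex, or is likely to become more complex, consider giving BeautifulSoup a try.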
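
As a rough illustration of where "more complex" begins: attributes can appear in any order and with either quote style, which is awkward for plain slicing. If installing BeautifulSoup is not an option, the standard library's html.parser module (a different tool from the one used in this answer) can handle that much. The sketch below is only one way to do it; the class name DivAttrCollector and the second div, "Another Hat", are invented for the example.

```python
from html.parser import HTMLParser


class DivAttrCollector(HTMLParser):
    """Collect the value of one attribute from every <div> start tag."""

    def __init__(self, attr):
        super().__init__()
        self.attr = attr
        self.values = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with the quotes removed
        if tag == 'div':
            for name, value in attrs:
                if name == self.attr:
                    self.values.append(value)


# The second div is made-up data: single-quoted value, different order
html = """
<div class='entry' data-price='3280000' data-name="Kill-a-Watt Allbrero"></div>
<div data-name='Another Hat' class="entry"></div>
"""

collector = DivAttrCollector('data-name')
collector.feed(html)
print(collector.values)  # ['Kill-a-Watt Allbrero', 'Another Hat']
```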
