How to get the price using BeautifulSoup in Python

Posted 2024-06-28 11:21:57


I am trying to get the price of an item from the HTML below.

This is the source:

 <span class="crwActualPrice">
        <span style="text-decoration: inherit; white-space: nowrap;">
            <span class="currencyINR">
                &nbsp;&nbsp;
            </span>
            <span class="currencyINRFallback" style="display:none">
                Rs. 
            </span>
            13,990.00
        </span>
    </span>

This is the code I tried:

    dprice = each_result.find_all("span", class_="crwActualPrice")
    for each_price in dprice:
        money_str = each_price.string
        print(money_str)

I want to end up with the value 13990 in money_str, using BeautifulSoup.


3 answers

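Use .text (a property, not a method call) to get the text, including the part that sits outside the inner spans:

...
dprice = each_result.find_all("span", class_="crwActualPrice")
money_str = ""
for each_price in dprice:
    # .text is a property; calling it as text() raises a TypeError
    money_str += each_price.text
# &nbsp; is parsed into the character \xa0, so remove that rather than the literal entity
print(money_str.replace("\xa0", "").strip())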
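
As background for why the code in the question prints None, here is a minimal sketch. It uses a trimmed-down, hypothetical copy of the question's markup: .string only returns a value when a tag has exactly one child, whereas get_text() collects the text of every descendant.

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed-down copy of the markup from the question.
html = """<span class="crwActualPrice">
    <span>
        <span class="currencyINRFallback" style="display:none">Rs. </span>
        13,990.00
    </span>
</span>"""

soup = BeautifulSoup(html, "html.parser")
outer = soup.find("span", class_="crwActualPrice")

# .string is None here because the tag has more than one child node
# (whitespace text nodes plus the nested <span>).
string_value = outer.string

# get_text() joins every descendant text node; strip=True trims each piece
# and drops the whitespace-only ones.
text_value = outer.get_text(" ", strip=True)

print(string_value)
print(text_value)
```

The second value still carries the "Rs." prefix, so some further cleanup is needed if only the number is wanted.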

This should work, although I'm not 100% sure about edge cases given the limited data set.

In [1]: from bs4 import BeautifulSoup
In [2]: s = BeautifulSoup(''' <span class="crwActualPrice">
    ...:         <span style="text-decoration: inherit; white-space: nowrap;">
    ...:             <span class="currencyINR">
    ...:                 &nbsp;&nbsp;
    ...:             </span>
    ...:             <span class="currencyINRFallback" style="display:none">
    ...:                 Rs.
    ...:             </span>
    ...:             13,990.00
    ...:         </span>
    ...:     </span>''', 'html.parser')

In [3]: for each in s.select('span.crwActualPrice'):
   ...:     print(each.get_text().strip().replace(' ','').replace('\n', ''))
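
If only the numeric part of the text is wanted, one option is to pick it out with a regular expression instead of chaining replace() calls. The sketch below is an illustration, not part of the original answer; extract_amount is a hypothetical helper name, and the pattern assumes prices look like digits with optional commas and an optional decimal part.

```python
import re

from bs4 import BeautifulSoup

html = '''<span class="crwActualPrice">
        <span style="text-decoration: inherit; white-space: nowrap;">
            <span class="currencyINR">
                &nbsp;&nbsp;
            </span>
            <span class="currencyINRFallback" style="display:none">
                Rs.
            </span>
            13,990.00
        </span>
    </span>'''

soup = BeautifulSoup(html, "html.parser")


def extract_amount(tag):
    """Return the first number-like run in the tag's text, or None (hypothetical helper)."""
    # Assumed format: a digit, then digits or commas, then an optional ".dd" part.
    match = re.search(r"\d[\d,]*(?:\.\d+)?", tag.get_text())
    return match.group(0) if match else None


amounts = [extract_amount(tag) for tag in soup.select("span.crwActualPrice")]
print(amounts)
```

Returning None when no number is found leaves it to the caller to decide how to handle results that have no price.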

Use the soup.select function:

from bs4 import BeautifulSoup

html_data = '''<span class="crwActualPrice">
        <span style="text-decoration: inherit; white-space: nowrap;">
            <span class="currencyINR">
                &nbsp;&nbsp;
            </span>
            <span class="currencyINRFallback" style="display:none">
                Rs. 
            </span>
            13,990.00
        </span>
    </span>'''

soup = BeautifulSoup(html_data, 'html.parser')
for curr in soup.select("span.crwActualPrice span.currencyINRFallback"):
    # next_sibling is the text node right after the hidden "Rs." span
    price = curr.next_sibling.strip()
    print(price)

Output:

13,990.00
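
The question asks for the value 13990 rather than the string printed above. A minimal sketch of that last step follows, assuming the price always uses commas as thousands separators and a dot as the decimal point; price_to_int is a hypothetical helper, not something from the answers.

```python
from decimal import Decimal


def price_to_int(price_text):
    """Convert a string such as '13,990.00' to a whole number (hypothetical helper)."""
    # Assumption: commas are thousands separators, so they can simply be dropped.
    cleaned = price_text.replace(",", "").strip()
    # int() truncates any fractional part; keep the Decimal instead if paise matter.
    return int(Decimal(cleaned))


money_str = price_to_int("13,990.00")
print(money_str)
```

Decimal is used here rather than float so that the intermediate value is exact.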
