How do I convert a set of strings into a list like [[string1, string2]] in Python?

Posted 2024-09-29 22:23:20

I have several strings that are always produced, one at a time, by a for loop over Beautiful Soup results.

The HTML structure is:

<div class="col-md-6 col-md-pull-6" id="general-vessel-info">
    <section class="promo-box vessel-type-unknown border-radius-0 border-0 padding-0">
        <div class="padding-t-10 padding-l-30 padding-r-30" style="padding-bottom:8px;">
            <ul class="list-unstyled margin-0">
                <li><span><i class="fa fa-calendar-check-o"></i> Latest Event</span> <span class="font-daxmedium padding-l-10" id="general-vessel-event">New ETA: 2017-12-19 10:00 UTC<a class="margin-l-10 tooltip-bs hidden-md tab-switch" data-original-title="Show all Events" href="#event-log" title=""><i class="fa fa-plus-circle"></i> more</a></span></li>
            </ul>
        </div>
        <div class="padding-t-10 padding-l-30 padding-b-10 padding-r-30" style="background-color:#F8F6F4;">
            <div class="row">
                <div class="col-lg-6">
                    <ul class="list-unstyled margin-0">
                        <li><span>Type</span> <span class="font-daxmedium" style="white-space:nowrap;">Tug <img alt="FleetMon Tug Icon" class="vessel-type tooltip-bs" data-original-title="Tug" src="//static.fleetmon.com/static/images/svg/types/unknown.svg" title=""></span></li>
                        <li><span>Flag</span> <span class="font-daxmedium"><span style="white-space:nowrap;">France <img alt="Flag of France" class="vessel-flag tooltip-bs" data-original-title="Flag of France" src="//static.fleetmon.com/static/images/svg/flags/fr.svg" title=""></span></span></li>
                        <li><span><abbr class="tooltip-bs" data-original-title="The unique ship identification number assigned by the International Maritime Organization" title="">IMO</abbr></span> <span class="font-daxmedium">9217474</span></li>
                        <li><span>MMSI</span> <span class="font-daxmedium">228058000</span></li>
                        <li><span>Callsign</span> <span class="font-daxmedium">FOUL</span></li>
                        <li><span>Year Built</span> <span class="font-daxmedium">–––</span></li>
                    </ul>
                </div>
                <div class="col-lg-6">
                    <ul class="list-unstyled margin-0 laksbfdabfg">
                        <li><span>Length</span> <span class="font-daxmedium">30 m</span></li>
                        <li><span>Width</span> <span class="font-daxmedium">10 m</span></li>
                        <li><span>Draught <abbr class="font-size-11 tooltip-bs" data-original-title="Average Draught" style="color:#898E89;" title="">Avg</abbr></span> <span class="font-daxmedium">4.1 m / <span class="tooltip-bs" data-original-title="Minimum Draught: 0.1 m&lt;br&gt;Maximum Draught: 6.4 m" style="cursor:default;" title="">...</span></span></li>
                        <li><span>Speed <abbr class="font-size-11 tooltip-bs" data-original-title="Average &amp; Maximum Speed" style="color:#898E89;" title="">Avg/Max</abbr></span> <span class="font-daxmedium">7.5 kn / 7.3 kn</span></li>
                        <li><span>Deadweight</span> <span class="font-daxmedium">385 tons</span></li>
                        <li><span>Gross Tonnage</span> <span class="font-daxmedium">456 tons</span></li>
                    </ul>
                </div>
            </div>
        </div>
        <footer class="text-center bg-color-FFFFFF" id="general-vessel-info-footer">
            <a class="btn btn-default tab-switch" href="#datasheet"><i class="fa fa-file-text-o"></i> Full Vessel Datasheet</a> <a class="btn btn-fm green margin-0 hidden-sm hidden-xs tab-switch" href="#datasheet" id="js-update-datasheet"><i class="fa fa-edit"></i> Update Datasheet</a>
        </footer>
    </section>
</div>

The loop is:

for ids in blockinfo.find_all('ul'):
    for li in ids.find_all('li'):
        print(li.text.strip())

The strings are printed like this:

string1 
string2 
string3 
string4 
string5 
string6

I need to build a list like this:

[[string1, string2], [string3, string4], [string5, string6]]

Any help would be appreciated. Thanks.


1 answer
User
#1 · Posted 2024-09-29 22:23:20

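If you want to group the strings by <ul> element, append them to a list created inside the loop over the <ul> elements, then append that list to a results list each time the loop over the contained <li> elements finishes: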

results = []
for ids in blockinfo.find_all('ul'):
    elements = []
    for li in ids.find_all('li'):
        elements.append(li.text.strip())
    results.append(elements)

You can also replace the inner loop with a list comprehension:

results = []
for ids in blockinfo.find_all('ul'):
    results.append([
        li.text.strip() for li in ids.find_all('li')])

The outer loop can be converted to a list comprehension as well:

results = [
    [li.text.strip() for li in ids.find_all('li')]
    for ids in blockinfo.find_all('ul')]

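All three variants produce the same output; pick whichever will be easiest to maintain in the long run. A list comprehension is slightly faster than the var = []; for ... in ...: var.append(...) loop pattern.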
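To get a rough feel for the speed difference, the two patterns can be timed with the standard-library timeit module. This is only a sketch: the names data, with_loop and with_comprehension are made up for the illustration, plain strings stand in for the Beautiful Soup elements, and the actual numbers depend on your machine and Python version.

```python
import timeit

# Stand-in data: 1000 groups of two strings each (assumed for illustration).
data = [[str(i), str(i + 1)] for i in range(0, 2000, 2)]


def with_loop():
    # The var = []; for ...: var.append(...) pattern.
    results = []
    for group in data:
        elements = []
        for s in group:
            elements.append(s.strip())
        results.append(elements)
    return results


def with_comprehension():
    # The same work as a nested list comprehension.
    return [[s.strip() for s in group] for group in data]


# Both approaches build exactly the same nested list.
assert with_loop() == with_comprehension()

loop_time = timeit.timeit(with_loop, number=200)
comp_time = timeit.timeit(with_comprehension, number=200)
print(f"append loop:        {loop_time:.4f} s")
print(f"list comprehension: {comp_time:.4f} s")
```

When scraping a page, parsing the HTML usually dominates the running time, so readability is likely the better reason to choose one form over the other.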

A demo of the last variant (the nested list comprehension), using some mock input:

>>> from bs4 import BeautifulSoup
>>> blockinfo = BeautifulSoup('''
... <ul><li>string1</li><li>string2</li></ul>
... <ul><li>string3</li><li>string4</li></ul>
... <ul><li>string5</li><li>string6</li></ul>
... ''', 'lxml')
>>> [
...     [li.text.strip() for li in ids.find_all('li')]
...     for ids in blockinfo.find_all('ul')]
[['string1', 'string2'], ['string3', 'string4'], ['string5', 'string6']]

For the sample HTML you provided, the nested list comprehension produces:

[['Latest Event New ETA: 2017-12-19 10:00 UTC more'],
 ['Type Tug',
  'Flag France',
  'IMO 9217474',
  'MMSI 228058000',
  'Callsign FOUL',
  'Year Built –––'],
 ['Length 30 m',
  'Width 10 m',
  'Draught Avg 4.1 m / ...',
  'Speed Avg/Max 7.5 kn / 7.3 kn',
  'Deadweight 385 tons',
  'Gross Tonnage 456 tons']]
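Each string in that output joins a label and a value, such as 'MMSI 228058000'. If you would rather keep the two parts separate, one possible sketch is to read the two top-level <span> elements of each <li> individually. This goes beyond what the question asked for; the trimmed-down HTML is mock input modelled on your sample, and the sketch assumes every <li> of interest has exactly two direct <span> children. It uses the built-in 'html.parser' so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Mock input modelled on the sample HTML (assumed structure: two spans per <li>).
html = """
<ul>
  <li><span>MMSI</span> <span class="font-daxmedium">228058000</span></li>
  <li><span>Callsign</span> <span class="font-daxmedium">FOUL</span></li>
</ul>
<ul>
  <li><span>Length</span> <span class="font-daxmedium">30 m</span></li>
</ul>
"""
blockinfo = BeautifulSoup(html, "html.parser")

results = []
for ul in blockinfo.find_all("ul"):
    rows = []
    for li in ul.find_all("li"):
        # Only direct children, so nested spans inside the value are not counted.
        spans = li.find_all("span", recursive=False)
        if len(spans) == 2:
            label, value = (s.get_text(" ", strip=True) for s in spans)
            rows.append((label, value))
    results.append(rows)

print(results)
# [[('MMSI', '228058000'), ('Callsign', 'FOUL')], [('Length', '30 m')]]
```

Items that do not have exactly two direct <span> children are skipped here; whether that is the right behaviour depends on the page you are scraping.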
