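<p>Given that your HTML always has those <em>submenu</em> divs, a better approach may be to return one list for the categories and another for the subcategories, so that <code>cats[i]</code> corresponds to <code>subcats[i]</code>, or to return a dictionary if that suits you better.</p>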
<p>In a Python shell (Python 2 with BeautifulSoup 3):</p>
<pre><code>>>> from BeautifulSoup import BeautifulSoup
>>> html = '''<a class="menuitem submenuheader" href="#">Beverages</a>
... <div class="submenu">
... <ul>
... <li><a href="productlist.aspx?parentid=053&amp;catid=055">Juice</a></li>
... <li><a href="productlist.aspx?parentid=053&amp;catid=055">Milk</a></li>
... </ul>
... </div>
... <a class="menuitem submenuheader" href="#">DIY</a>
... <div class="submenu">
... <ul>
... <li><a href="productlist.aspx?parentid=053&amp;catid=055">Micellaneous</a></li>
... <li><a href="productlist.aspx?parentid=053&amp;catid=055">Spanners</a></li>
... <li><a href="productlist.aspx?parentid=053&amp;catid=055">Sockets</a></li>
... </ul>
... </div>'''
>>> soup = BeautifulSoup(html)
>>> categories = soup.findAll("a", {"class": 'menuitem submenuheader'})
>>> cats = [cat.text for cat in categories]
>>> sub_menus = soup.findAll("div", {"class": "submenu"})
>>> subcats = []
>>> for menu in sub_menus:
...     subcat = [item.text for item in menu.findAll('li')]
...     subcats.append(subcat)
...
>>> print cats
[u'Beverages', u'DIY']
>>> print subcats
[[u'Juice', u'Milk'], [u'Micellaneous', u'Spanners', u'Sockets']]
>>> cat_dict = dict(zip(cats,subcats))
>>> print cat_dict
{u'Beverages': [u'Juice', u'Milk'], u'DIY': [u'Micellaneous', u'Spanners', u'Sockets']}
</code></pre>
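<p>One caveat with <code>dict(zip(cats, subcats))</code>: <code>zip</code> stops at the shorter list, so a category without a matching submenu div would be dropped silently or paired with the wrong submenu. Below is a minimal sketch of a guard, written for Python 3 and using plain lists in place of the parsed results. The helper name <code>pair_categories</code> is hypothetical and not part of BeautifulSoup.</p>

```python
def pair_categories(cats, subcats):
    """Hypothetical helper: map each category name to its subcategory list."""
    # zip() silently drops unmatched items, so check the lengths first.
    if len(cats) != len(subcats):
        raise ValueError(
            "expected one submenu per category, got %d categories and %d submenus"
            % (len(cats), len(subcats))
        )
    return dict(zip(cats, subcats))


# Same data as the shell session above, as plain Python 3 strings.
cats = ["Beverages", "DIY"]
subcats = [["Juice", "Milk"], ["Micellaneous", "Spanners", "Sockets"]]

cat_dict = pair_categories(cats, subcats)
print(cat_dict["DIY"])
```

<p>If the length check fails, the page does not follow the one-submenu-per-category layout assumed here, and pairing each header with the div that follows it would likely be a safer approach than pairing by index.</p>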