<p>Adapted from Method 3 at <a href="https://www.geeksforgeeks.org/python-count-the-sublists-containing-given-element-in-a-list/" rel="nofollow noreferrer">https://www.geeksforgeeks.org/python-count-the-sublists-containing-given-element-in-a-list/</a></p>
<pre><code>from itertools import chain
from collections import Counter
mylist = [[5274919, ["report", "porcelain", "firing", "technic"]], [5274920, ["implantology", "dentistry"]], [52749, ["method", "recognition", "long", "standing", "root", "perforation", "molar"]], [5274923, ["exogenic", "endogenic", "cause", "tooth", "jaw", "anomaly", "method", "method", "standing"]]]
myconcepts = ["method", "standing"]
def countList(lst, x):
    """Counts the number of sublists in which item x appears."""
    return Counter(chain.from_iterable(set(i[1]) for i in lst))[x]
# Use dictionary comprehension to apply countList to concept list
result = {x:countList(mylist, x) for x in myconcepts}
print(result) # {'method':2, 'standing':2}
</code></pre>
<p><em>Revised current method (computes the counts only once)</em></p>
<pre><code>def count_occurences(lst):
" Number of counts of each item in all sublists "
return Counter(chain.from_iterable(set(i[1]) for i in lst))
cnts = count_occurences(mylist)
result = {x:cnts[x] for x in myconcepts}
print(result) # {'method':2, 'standing':2}
</code></pre>
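<p>A small sketch of why the <code>set(...)</code> call matters: without it, a word repeated inside one sublist is counted once per occurrence rather than once per sublist. The <code>sample</code> data and the names <code>per_sublist</code> and <code>per_occurrence</code> are made up for illustration and are not part of the original post.</p>

```python
from collections import Counter
from itertools import chain

# Hypothetical sample in the same [id, [words]] shape as mylist
sample = [
    [1, ["method", "method", "standing"]],
    [2, ["method", "root"]],
]

# With set(): each sublist contributes at most 1 per word
per_sublist = Counter(chain.from_iterable(set(item[1]) for item in sample))

# Without set(): every occurrence is counted
per_occurrence = Counter(chain.from_iterable(item[1] for item in sample))

print(per_sublist["method"])     # 2
print(per_occurrence["method"])  # 3
print(per_sublist["missing"])    # 0, a Counter returns 0 for absent keys
```

<p>Because a <code>Counter</code> returns 0 for keys it has never seen, concepts that appear in no sublist need no special handling in the dictionary comprehension.</p>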
<p><strong>Performance (posted methods compared in a Jupyter notebook)</strong></p>
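<p>The results show that the current method performs close to the method from Barmar's post (36 µs vs. 42 µs).</p>
<p>The revision to the current method cuts the time roughly in half (from 36 µs to 19 µs). This improvement should matter more as the number of concepts grows (the question has over 1000 concepts).</p>
<p>However, on this small example the original method is faster, at 2.55 µs per loop.</p>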
<p><em>Method 1 (current method)</em></p>
<pre><code>%timeit { x:countList(mylist, x) for x in myconcepts}
# 10000 loops, best of 3: 36.6 µs per loop

# Revised current method (separate cell, since %%timeit must be the first line of a cell):
%%timeit
cnts = count_occurences(mylist)
result = {x:cnts[x] for x in myconcepts}
# 10000 loops, best of 3: 19.4 µs per loop
</code></pre>
<p><em>Method 2 (from Barmar's post)</em></p>
<pre><code>%%timeit
# Assumes `import collections` and the flatten() helper from Barmar's post
r = collections.Counter(flatten(mylist))
{i:r.get(i, 0) for i in myconcepts}
# 10000 loops, best of 3: 42.7 µs per loop
</code></pre>
<p><em>Method 3 (original method)</em></p>
<pre><code>%%timeit
result = {}
for concept in myconcepts:
    mycounting = 0
    for item in mylist:
        if concept in item[1]:
            mycounting = mycounting + 1
    result[concept] = mycounting
# 100000 loops, best of 3: 2.55 µs per loop
</code></pre>
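<p>To see how the comparison changes with many more concepts, a rough sketch using the standard <code>timeit</code> module might look like the following. The synthetic data sizes, the vocabulary, and the function names <code>count_nested</code> and <code>count_with_counter</code> are assumptions for illustration; no timings are quoted here because they depend on the machine and the data.</p>

```python
import random
import timeit
from collections import Counter
from itertools import chain

random.seed(0)
# Hypothetical synthetic data: 200 sublists of 10 words each, 1000 concepts
vocab = ["w%d" % i for i in range(2000)]
data = [[i, random.choices(vocab, k=10)] for i in range(200)]
concepts = vocab[:1000]

def count_nested(lst, wanted):
    """Original approach: scan every sublist once per concept."""
    result = {}
    for concept in wanted:
        n = 0
        for item in lst:
            if concept in item[1]:
                n += 1
        result[concept] = n
    return result

def count_with_counter(lst, wanted):
    """Revised approach: build the Counter once, then look up each concept."""
    cnts = Counter(chain.from_iterable(set(item[1]) for item in lst))
    return {x: cnts[x] for x in wanted}

# Both approaches should agree before their timings are compared
assert count_nested(data, concepts) == count_with_counter(data, concepts)

for fn in (count_nested, count_with_counter):
    seconds = timeit.timeit(lambda: fn(data, concepts), number=3)
    print(fn.__name__, seconds / 3, "s per call")
```

<p>With only two concepts the nested loop has almost no setup cost, which is consistent with it winning on the small example above. The <code>Counter</code> version does a fixed amount of work up front no matter how many concepts are looked up afterwards.</p>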