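<p>Although others have written very good answers, I find that solving this with <code>pandas</code> is easier to maintain, even if it is more verbose. On top of that, pandas objects are really easy to work with.</p>
<p>First, the imports:</p>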
<pre><code>import pandas as pd
import calendar
from pprint import pprint
</code></pre>
<p>Here is the main body of the code:</p>
<pre><code>df = pd.DataFrame(data, columns=["lists", "month"])
names = list(set([y for x in df["lists"] for y in x]))
df[names] = 0
def func(row):
    # Count how many times each name occurs in this row's list
    for n in names:
        for k in row["lists"]:
            if k == n:
                row[n] += 1
    return row
df = df.apply(func, axis=1)
df.drop(["lists"], inplace=True, axis=1)
new_df = df.groupby(by="month").sum().T.reset_index()
new_df.columns.name = None # Just for my taste to remove the "month" label of groupby result
months = list(calendar.month_name)[1:] # list of months. There's an empty string at index 0.
new_df[[m for m in months if m not in new_df.columns]] = 0 #Creating columns for unseen months
new_df = new_df[["index"] + months] #sorting the months
print(new_df)
pprint(new_df.values.tolist())
</code></pre>
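<p>As a side note, the same counting can probably be expressed without the row-wise <code>apply</code>. The following is only a minimal sketch, not the approach used above: it assumes the input has the same shape as <code>data</code> (a list of keywords plus a month name per row), and the <code>sample</code> values are made up for illustration. It uses <code>DataFrame.explode</code>, <code>pd.crosstab</code> and <code>reindex</code>, and sorts the keywords so that the row order is reproducible.</p>

```python
import calendar

import pandas as pd

# Hypothetical sample in the same shape as `data`: (list of keywords, month name)
sample = [
    (["covid-19", "jantung"], "August"),
    (["covid-19"], "August"),
    (["covid-19", "covid-19", "tiktok"], "October"),
]

months = list(calendar.month_name)[1:]  # index 0 is an empty string

df = pd.DataFrame(sample, columns=["lists", "month"])

# One row per (keyword, month) occurrence; reset the index so it is unique
long_df = df.explode("lists").reset_index(drop=True)

# Count the occurrences of each keyword per month
counts = pd.crosstab(long_df["lists"], long_df["month"])

# Add the unseen months as zero columns and put all months in calendar order
counts = counts.reindex(columns=months, fill_value=0)

# Sort the keywords so the row order does not change between runs
counts = counts.sort_index()

result = counts.reset_index().values.tolist()
print(counts)
```

<p>Here <code>result</code> has the same list-of-lists layout as the <code>pprint</code> output of the main code: the keyword first, followed by twelve monthly counts.</p>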
<p>The output will be:</p>
<pre><code>              index  January  February  ...  October  November  December
0            kanker        0         0  ...        1         0         0
1          covid-19        0         0  ...       19         0         0
2           jantung        0         0  ...        2         0         0
3            tiktok        0         0  ...        1         0         0
4  tenaga kesehatan        0         0  ...        1         0         0

[5 rows x 13 columns]
[['kanker', 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
['covid-19', 0, 0, 0, 0, 0, 0, 1, 17, 15, 19, 0, 0],
['jantung', 0, 0, 0, 0, 0, 0, 0, 1, 2, 2, 0, 0],
['tiktok', 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
['tenaga kesehatan', 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]]
</code></pre>
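<p>Because <code>names</code> is built from a <code>set</code>, the row order can differ between runs, for example:</p>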
<pre><code>              index  January  February  ...  October  November  December
0  tenaga kesehatan        0         0  ...        1         0         0
1          covid-19        0         0  ...       19         0         0
2            kanker        0         0  ...        1         0         0
3           jantung        0         0  ...        2         0         0
4            tiktok        0         0  ...        1         0         0

[5 rows x 13 columns]
[['tenaga kesehatan', 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
['covid-19', 0, 0, 0, 0, 0, 0, 1, 17, 15, 19, 0, 0],
['kanker', 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
['jantung', 0, 0, 0, 0, 0, 0, 0, 1, 2, 2, 0, 0],
['tiktok', 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]]
</code></pre>