Getting Python to count occurrences of dict keys

Published 2024-09-27 19:15:15

I have the code below, but I can't work out why it fails.

tokentagocc = {}
lemmatokentagdict = {
  "[',']": ("[',']", "['SYM.Pun.Comma']"), "['verursacht']": ("['verursachten']", "['ADJA.Pos.Nom.Pl.Masc']"), "['eine']": ("['Ein']", "['ART.Indef.Nom.Sg.Masc']"), "['Dollar']": ("['Dollar']", "['N.Reg.*2.*3.Masc']"), "['auf']": ("['auf']", "['APPR.Auf']"), "['Ausland']": ("['Ausland']", "['N.Reg.Acc.Sg.Neut']"), "['Soziale']": ("['Soziales']", "['N.Reg.Acc.Sg.Neut']"), "['Verkehr']": ("['Verkehr']", "['N.Reg.Acc.Sg.Masc']"), "['unterschlagen']": ("['unterschlagenen']", "['ADJA.Pos.Gen.Pl.Neut']"), "['rund']": ("['rund']", "['ADJD.Pos']"), "['staatlich']": ("['staatlichen']", "['ADJA.Pos.Gen.Pl.Neut']"), "['die']": ("['der']", "['ART.Def.Dat.Sg.Fem']"), "['alle']": ("['aller']", "['PRO.Indef.Attr.-3.Gen.Pl.Neut']"), "['für']": ("['für']", "['APPR.Acc']"), "['sie']": ("['sich']", "['PRO.Refl.Subst.3.Acc.Pl.*6']"), "['Milliarde']": ("['Milliarden']", "['N.Reg.Acc.Pl.Fem']"), "['in']": ("['ins']", "['APPRART.Acc.Sg.Neut']"), "['dadurch']": ("['dadurch']", "['PROADV.Dem']"), "['20']": ("['20']", "['CARD']"), "['weitergeleiten|weiterleiten']": ("['weitergeleitet']", "['VPP.Full.Psp']"), "['und']": ("['und']", "['CONJ.Coord.-2']"), '[]': ("['']", '[]'), "['Gesundheit']": ("['Gesundheit']", "['N.Reg.Acc.Sg.Fem']"), "['.']": ("['.']", "['SYM.Pun.Sent']"), "['Jahr']": ("['Jahr']", "['N.Reg.Dat.Sg.Neut']"), "['Geld']": ("['Gelder']", "['N.Reg.Gen.Pl.Neut']"), "['Rund']": ("['Rund']", "['ADJD.Pos']"), "['Schaden']": ("['Schäden']", "['N.Reg.Nom.Pl.Masc']"), "['Prozent']": ("['Prozent']", "['N.Reg.*2.*3.Neut']"), "['belaufen']": ("['beliefen']", "['VFIN.Full.3.Pl.Past.Ind']"), "['<unknown>']": ("['Verwaltungsmafia']", "['N.Reg.Dat.Sg.Fem']"), "['Teil']": ("['Teil']", "['N.Reg.Nom.Sg.Masc']"), "['von']": ("['von']", "['APPR.Dat']"), "['40']": ("['40']", "['CARD']"), "['werden']": ("['werde']", "['VFIN.Aux.3.Sg.Pres.Subj']")
}

# Count how often each (token, tag) value occurs across the dict.
for tokentag in lemmatokentagdict:
  print(lemmatokentagdict[tokentag])
  if lemmatokentagdict[tokentag] in tokentagocc:
    tokentagocc[lemmatokentagdict[tokentag]] += 1
    print("doubled")
  else:
    tokentagocc[lemmatokentagdict[tokentag]] = 1

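No key appears more than once in either the first or the second dict, so I don't understand why every (token, tag) tuple is counted as 1. At the very least, the word "Ein" should occur more than once.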

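I tested this with a smaller version of the script, but had no luck, so I decided to post the complete code. I'd be glad of any advice on this! Thanks in advance.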
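
One possible explanation, offered as a sketch rather than a confirmed diagnosis: a dict holds exactly one value per key, so by the time the tokens are stored in lemmatokentagdict any repeated entry has already been collapsed, and every (token, tag) value in the dict shown above is distinct. Counting over a list of pairs instead keeps each occurrence. The tagged_tokens list below is hypothetical sample data, since the original token sequence is not shown in the question; collections.Counter is from the standard library.

```python
from collections import Counter

# Hypothetical input: one (token, tag) pair per token occurrence, in text order.
# The real token sequence is not part of the question, so this data is made up.
tagged_tokens = [
    ("Ein", "ART.Indef.Nom.Sg.Masc"),
    ("Teil", "N.Reg.Nom.Sg.Masc"),
    ("Ein", "ART.Indef.Nom.Sg.Masc"),
    ("Jahr", "N.Reg.Dat.Sg.Neut"),
]

# A dict keeps one value per key, so building a dict first collapses repeats:
# the second "Ein" entry simply overwrites the first.
as_dict = {token: tag for token, tag in tagged_tokens}

# Counting over the list keeps every occurrence.
tokentagocc = Counter(tagged_tokens)

print(len(as_dict))
print(tokentagocc[("Ein", "ART.Indef.Nom.Sg.Masc")])
```

If the data only exists as a dict, the repeats are already gone at that point, so the counting would have to happen earlier, wherever the tokens are first read.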

