I want to get the distribution of tags from this DataFrame:
import pandas as pd

df = pd.DataFrame([
[43,{"tags":["webcom","start","temp","webcomfoto","dance"],"image":["https://image.com/Kqk.jpg"]}],
[83,{"tags":["yourself","start",""],"image":["https://images.com/test.jpg"]}],
[76,{"tags":["en","webcom"],"links":["http://webcom.webcomdb.com","http://webcom.webcomstats.com"],"users":["otole"]}],
[77,{"tags":["webcomznakomstvo","webcomzhiznx","webcomistoriya","webcomosebe","webcomfotografiya"],"image":["https://images.com/nt4wzguoh/y_a3d735b4.jpg","https://images.com/sucb0u24x/b1sd_Naju.jpg"]}],
[81,{"tags":["webcomfotografiya"],"users":["myself","boattva"],"links":["https://webcom.com/nk"]}],
],columns=["_id","tags"])
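I need a table with the "_id" and the number of specific tags.
When "tags" was the only field, I used this approach. In this DataFrame, however, I also have "image", "users", and other text fields with values. How should I handle the data in this case?
Thank you.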
You can use the str accessor to get the dictionary key, and then use value_counts to get the distribution.
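The original code for this answer did not survive, so the following is only a sketch of how the idea might look. It assumes the "tags" column really holds dictionaries, uses a shortened copy of the question's data, and relies on `Series.str.get`, `Series.explode`, and `Series.value_counts`. The `per_row` table and the `n_tags` column name are illustrative additions, not part of the original answer.

```python
import pandas as pd

# Shortened copy of the question's frame: each "tags" cell is a dictionary.
df = pd.DataFrame(
    [
        [43, {"tags": ["webcom", "start", "temp", "webcomfoto", "dance"],
              "image": ["https://image.com/Kqk.jpg"]}],
        [83, {"tags": ["yourself", "start", ""],
              "image": ["https://images.com/test.jpg"]}],
        [76, {"tags": ["en", "webcom"], "users": ["otole"]}],
        [81, {"tags": ["webcomfotografiya"], "users": ["myself", "boattva"]}],
    ],
    columns=["_id", "tags"],
)

# .str.get looks up the "tags" key in every dictionary, ignoring other keys.
tag_lists = df["tags"].str.get("tags")

# One row per tag, then count how often each tag occurs.
tag_counts = tag_lists.explode().value_counts()
print(tag_counts)

# Hypothetical per-row table: the "_id" next to the number of tags in that row.
per_row = df[["_id"]].assign(n_tags=tag_lists.str.len())
print(per_row)
```

Because only the "tags" key is read, extra fields such as "image", "users", or "links" do not affect the result.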
Sticking with collections.Counter, here is one way.
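The code that accompanied this answer is missing, so what follows is a minimal sketch of a Counter-based count rather than the answerer's own version. The helper name `count_tags` is hypothetical, and the sample records are a shortened copy of the question's data.

```python
from collections import Counter
from itertools import chain


def count_tags(records):
    """Count every tag across an iterable of dicts.

    Records without a "tags" key contribute nothing; other keys are ignored.
    """
    return Counter(chain.from_iterable(rec.get("tags", []) for rec in records))


# Plain-Python stand-in for the values of the "tags" column.
records = [
    {"tags": ["webcom", "start", "temp", "webcomfoto", "dance"],
     "image": ["https://image.com/Kqk.jpg"]},
    {"tags": ["yourself", "start", ""],
     "image": ["https://images.com/test.jpg"]},
    {"tags": ["en", "webcom"], "users": ["otole"]},
    {"links": ["https://webcom.com/nk"]},  # no "tags" key at all
]

tag_counts = count_tags(records)
print(tag_counts.most_common(3))
```

Iterating over a pandas Series yields its values, so `count_tags(df["tags"])` should work on the question's frame as long as each cell is a dictionary.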
The data in the tags column are strings, not dictionaries, and that is the problem. So a conversion step is needed first.
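The conversion code itself was lost, so this is one possible sketch of that first step. It assumes each string is a valid Python literal (a JSON-style string without `true`, `false`, or `null` also qualifies) and uses `ast.literal_eval`. The helper name `to_dict` is hypothetical.

```python
import ast


def to_dict(value):
    """Return dicts unchanged; parse strings that contain a dict literal."""
    if isinstance(value, dict):
        return value
    return ast.literal_eval(value)


# Stand-in for a column whose cells arrived as strings.
raw = [
    '{"tags": ["en", "webcom"], "users": ["otole"]}',
    "{'tags': ['webcomfotografiya'], 'users': ['myself', 'boattva']}",
    {"tags": ["start"]},  # already a dict, passed through untouched
]

parsed = [to_dict(v) for v in raw]
print([type(p).__name__ for p in parsed])
```

On a DataFrame, the same helper could be applied with `df["tags"] = df["tags"].apply(to_dict)`. If the strings are strict JSON, `json.loads` would be the more natural parser.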
Then apply the original answer, which works well when there are multiple fields.
Verifying:
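The verification output was not preserved. A check along these lines could be used instead: a sketch that assumes the cells start out as strings, converts them with `ast.literal_eval`, and then counts tags through the str accessor. The two rows are taken from the question's data.

```python
import ast

import pandas as pd

# Two rows from the question, with the dictionaries stored as strings.
df = pd.DataFrame(
    {
        "_id": [83, 76],
        "tags": [
            '{"tags": ["yourself", "start", ""], "image": ["https://images.com/test.jpg"]}',
            '{"tags": ["en", "webcom"], "users": ["otole"]}',
        ],
    }
)

# First step: turn every string into a real dictionary.
df["tags"] = df["tags"].apply(ast.literal_eval)
all_dicts = df["tags"].map(lambda v: isinstance(v, dict)).all()
print(all_dicts)

# Then count the tags exactly as with genuine dictionaries.
tag_counts = df["tags"].str.get("tags").explode().value_counts()
print(tag_counts)
```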