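I have two DataFrames: `topic_`, which is the target DataFrame, and `tw`, which is the source DataFrame. `topic_` is a topic-by-word matrix in which each cell stores the probability of a word appearing in a particular topic. I have initialized the `topic_` DataFrame to zeros. A sample of the `tw` DataFrame: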
print(tw)
topic_id word_prob_pair
0 0 [(customer, 0.061703717964), (team, 0.01724444...
1 1 [(team, 0.0260560163563), (customer, 0.0247838...
2 2 [(customer, 0.0171786268847), (footfall, 0.012...
3 3 [(team, 0.0290787264225), (product, 0.01570401...
4 4 [(team, 0.0197917953222), (data, 0.01343226630...
5 5 [(customer, 0.0263740639141), (team, 0.0251677...
6 6 [(customer, 0.0289764173735), (team, 0.0249938...
7 7 [(client, 0.0265082412402), (want, 0.016477447...
8 8 [(customer, 0.0524006965405), (team, 0.0322975...
9 9 [(generic, 0.0373422774996), (product, 0.01834...
10 10 [(customer, 0.0305256248248), (team, 0.0241559...
11 11 [(customer, 0.0198707090364), (ad, 0.018516805...
12 12 [(team, 0.0159852971954), (customer, 0.0124540...
13 13 [(team, 0.033444510469), (store, 0.01961003290...
14 14 [(team, 0.0344793243818), (customer, 0.0210975...
15 15 [(team, 0.026416114692), (customer, 0.02041691...
16 16 [(campaign, 0.0486186973667), (team, 0.0236024...
17 17 [(customer, 0.0208270072145), (branch, 0.01757...
18 18 [(team, 0.0280889397541), (customer, 0.0127932...
19 19 [(team, 0.0297011415217), (customer, 0.0216007...
The size of my `topic_` DataFrame is `num_topics` (which is 20) by `number_of_unique_words` (in the `tw` DataFrame).
Below is the code I use to replace each value in the `topic_` DataFrame:
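The asker's snippet is not preserved here. Purely as an illustration of what a cell-by-cell fill could look like, here is a minimal sketch; the two-row `tw` sample, the way `words` is collected, and the way `topic_` is built are all assumptions, not the original code.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of `tw`: one row per topic, each holding
# a list of (word, probability) pairs. Values are made up.
tw = pd.DataFrame({
    "topic_id": [0, 1],
    "word_prob_pair": [
        [("customer", 0.06), ("team", 0.02)],
        [("team", 0.03), ("product", 0.01)],
    ],
})

# Unique words across all topics become the columns.
words = sorted({word for pairs in tw["word_prob_pair"] for word, _ in pairs})

# Zero-initialized topic-by-word matrix.
topic_ = pd.DataFrame(
    np.zeros((len(tw), len(words))),
    index=tw["topic_id"].tolist(),
    columns=words,
)

# Fill one cell at a time; this is the slow part on large vocabularies.
for topic_id, pairs in zip(tw["topic_id"], tw["word_prob_pair"]):
    for word, prob in pairs:
        topic_.at[topic_id, word] = prob

print(topic_)
```

Each assignment touches a single cell, so the cost grows with the total number of (word, probability) pairs.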
Is there a better way to accomplish this task?
numpy
Timings
You can use a list comprehension with the `DataFrame` constructor, and finally replace the resulting `NaN` values with 0.
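A minimal sketch of that idea, assuming each `word_prob_pair` cell is a Python list of `(word, probability)` tuples; the two-row `tw` sample is hypothetical, and `fillna(0)` is one way to do the final replacement:

```python
import pandas as pd

# Hypothetical miniature of `tw`; values are made up.
tw = pd.DataFrame({
    "topic_id": [0, 1],
    "word_prob_pair": [
        [("customer", 0.06), ("team", 0.02)],
        [("team", 0.03), ("product", 0.01)],
    ],
})

# Convert each list of pairs to a dict. The DataFrame constructor
# aligns the dict keys into columns, leaving NaN where a word is
# absent from a topic.
topic_ = pd.DataFrame(
    [dict(pairs) for pairs in tw["word_prob_pair"]],
    index=tw["topic_id"].tolist(),
)

# Words missing from a topic get probability 0.
topic_ = topic_.fillna(0)

print(topic_)
```

This builds the whole matrix in one constructor call rather than assigning cells one by one, and it does not need a pre-allocated zero matrix.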