Using the Kendall tau function (kendalltau) from scipy.stats in Python

                      Conference    FiveYrIF
0              SIGMOD Conference  112.685585
1                            KDD  103.674543
2                            CHI   99.453096
3                          SIGIR   68.967753
4                            WWW   65.715631
5                           SODA   60.151959
6                            DAC   42.076365
7                          ICCAD   39.906361
8                           CIKM   33.232224
9                           DATE   26.578906
10                       INFOCOM   22.694122
11  Winter Simulation Conference   17.448830
12                           SAC   10.646007

1 answer

User

#1 · Posted on 2024-09-28 22:25:48

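Apparently, kendalltau does not handle the object arrays that Pandas uses to store strings. You can work around this by converting the data to an array of strings before passing it to kendalltau.

For example, here is a DataFrame: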

In [107]: df
Out[107]: 
     x    y
0  aaa  0.5
1   bb  1.4
2    c  1.3
3    d  2.0
4   ee  2.1

The values in column x are strings. Pandas represents an array of strings as an array with dtype object.

kendalltau does not handle such an array, whether it is passed as a Series or as the underlying NumPy array:

In [110]: kendalltau(df['x'], df['y'])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-110-07ca97e866e2> in <module>()
----> 1 kendalltau(df['x'], df['y'])

/Users/warren/anaconda/lib/python2.7/site-packages/scipy/stats/stats.pyc in kendalltau(x, y, initial_lexsort)
   3020     if initial_lexsort:
   3021         # sort implemented as mergesort, worst case: O(n log(n))
-> 3022         perm = np.lexsort((y, x))
   3023     else:
   3024         # sort implemented as quicksort, 30% faster but with worst case: O(n^2)

TypeError: merge sort not available for item 1

In [111]: kendalltau(df['x'].values, df['y'])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-111-e903a3b3475e> in <module>()
----> 1 kendalltau(df['x'].values, df['y'])

/Users/warren/anaconda/lib/python2.7/site-packages/scipy/stats/stats.pyc in kendalltau(x, y, initial_lexsort)
   3020     if initial_lexsort:
   3021         # sort implemented as mergesort, worst case: O(n log(n))
-> 3022         perm = np.lexsort((y, x))
   3023     else:
   3024         # sort implemented as quicksort, 30% faster but with worst case: O(n^2)

TypeError: merge sort not available for item 1

If you convert the array to an array of strings with df['x'].values.astype(str), the call works:

In [112]: kendalltau(df['x'].values.astype(str), df['y'])
Out[112]: KendalltauResult(correlation=0.79999999999999982, pvalue=0.050043527347496564)
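
To see where a correlation of about 0.8 comes from, here is a minimal pure-Python sketch that counts concordant and discordant pairs for the same five rows. The helper name kendall_tau_a is hypothetical and is not SciPy's implementation; it assumes there are no ties, in which case tau-a equals the tau-b value that kendalltau reports. Strings are compared lexicographically, which is also the ordering a string array sorts by.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Illustrative helper, not SciPy's code. Assumes x and y have the same
    length, at least two observations, and no tied values.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        # Only the ordering matters, so any values that support "<" work:
        # numbers, or strings compared lexicographically.
        if (x1 < x2) == (y1 < y2):
            concordant += 1
        else:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


x = ["aaa", "bb", "c", "d", "ee"]  # already in lexicographic order
y = [0.5, 1.4, 1.3, 2.0, 2.1]      # only the pair (1.4, 1.3) is out of order
tau = kendall_tau_a(x, y)
print(tau)  # 0.8
```

Of the 10 pairs, 9 are concordant and 1 is discordant, so tau = (9 - 1) / 10 = 0.8. Because the result depends on the lexicographic order of the strings, it is only meaningful when that order reflects the ranking you intend.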
