Checking the similarity of multiple datasets with a cross-correlation matrix

import numpy as np

A = np.array([0., 0, 0, 1., 2., 3., 4., 3, 2, 1, 0, 0, 0])
B = np.array([0., 0, 0, 0, 0, 1, 2., 3., 4, 3, 2, 1, 0])
C = np.array([0., 0, 0, 1, 1.5, 2, 1.5, 1, 0, 0, 0, 0, 0])
D = np.array([0., 0, 0, 0, 0, -2, -4, -2, 0, 0, 0, 0, 0])
x = np.arange(0,len(A),1)

for c1,w1 in enumerate([a,b,c,d]):
    for c2,w2 in enumerate([a,b,c,d]):
        w1 = np.abs(w1)
        w2 = np.abs(w2)
        M[c1,c2] = integrate.trapz(min(np.abs(w2).any(),np.abs(w1).any()))
print M

M = np.zeros([4,4])
SH = np.zeros([4,4])
for c1,w1 in enumerate([A,B,C,D]):
    for c2,w2 in enumerate([A,B,C,D]):
        crossCorrelation = np.correlate(w1,w2, 'full')
        bestShift = np.argmax(crossCorrelation)
        # This reverses the effect of the padding.
        actualShift = bestShift - len(w2) + 1
        similarity = crossCorrelation[bestShift]
        M[c1,c2] = similarity
        SH[c1,c2] = actualShift
M = M/M.max()
print(M, '\n', SH)

[[ 1.          1.          0.95454545  0.63636364]
[ 1.          1.          0.95454545  0.63636364]
[ 0.95454545  0.95454545  0.95454545  0.63636364]
[ 0.63636364  0.63636364  0.63636364  0.54545455]]
[[ 0. -2.  1.  0.]
[ 2.  0.  3.  2.]
[-1. -3.  0. -1.]
[ 0. -2.  1.  0.]]

[[ 0.45833333  0.45833333  0.5         0.58333333]
[ 0.45833333  0.45833333  0.5         0.58333333]
[ 0.5         0.5         0.57142857  0.66666667]
[ 0.58333333  0.58333333  0.66666667  1.        ]]
[[ 0. -2.  1.  0.]
[ 2.  0.  3.  2.]
[-1. -3.  0. -1.]
[ 0. -2.  1.  0.]]

2 Answers

User

Answer 1 · edited 2024-09-27 18:10:15

Use cross-correlation between the vectors.

For example:

>>> np.correlate(A,B)
array([ 31.])

>>> np.correlate(A,C)
array([ 19.])

>>> np.correlate(A,D)
array([-28.])

If you don't care about the sign, you can simply take the absolute value.
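As a minimal sketch of why those calls return a single number: with its default mode (`'valid'`) and two inputs of equal length, `np.correlate` evaluates only the zero-shift overlap, which is the plain dot product. The variable names `zero_shift` and `magnitude` below are illustrative and not part of the original answer.

```python
import numpy as np

A = np.array([0., 0, 0, 1., 2., 3., 4., 3, 2, 1, 0, 0, 0])
D = np.array([0., 0, 0, 0, 0, -2, -4, -2, 0, 0, 0, 0, 0])

# Default mode is 'valid': for equal-length inputs this yields one value,
# the dot product of the two sequences at zero shift.
zero_shift = np.correlate(A, D)
assert np.isclose(zero_shift[0], np.dot(A, D))

# Dropping the sign, as suggested above.
magnitude = np.abs(zero_shift)
print(zero_shift, magnitude)
```

Because only the zero-shift value is computed, this form says nothing about how far one signal is displaced from the other.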

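User

Answer 2 · edited 2024-09-27 18:10:15

As ssm said, numpy's correlate function works well for this problem. You mentioned that you are interested in position as well. The correlate function can also tell you how far one sequence is shifted relative to another.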

import numpy as np

def compare(a, b):
    # 'full' pads the sequences with 0's so they are correlated
    # with as little as 1 actual element overlapping.
    crossCorrelation = np.correlate(a,b, 'full')
    bestShift = np.argmax(crossCorrelation)

    # This reverses the effect of the padding.
    actualShift = bestShift - len(b) + 1
    similarity = crossCorrelation[bestShift]

    print('Shift: ' + str(actualShift))
    print('Similarity: ' + str(similarity))
    return {'shift': actualShift, 'similarity': similarity}

print('\nExpected shift: 0')
compare([0,0,1,0,0], [0,0,1,0,0])
print('\nExpected shift: 2')
compare([0,0,1,0,0], [1,0,0,0,0])
print('\nExpected shift: -2')
compare([1,0,0,0,0], [0,0,1,0,0])

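Edit:

You need to normalize the sequences before correlating them; otherwise the larger sequences will show a very high correlation with all the other sequences.

One property of cross-correlation is:

$\sum \mathrm{CrossCorrelate}(f, g) = \left(\sum f\right) \cdot \left(\sum g\right)$

So if you normalize by dividing each sequence by its sum, the similarity will always be between 0 and 1 (for non-negative sequences).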
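The property can be checked numerically. The sketch below uses two made-up non-negative sequences `f` and `g` (assumptions for illustration, not data from the question) and the `'full'` mode, so that every shift is included in the sum.

```python
import numpy as np

# Hypothetical non-negative test sequences.
f = np.array([0., 1., 2., 3., 2., 1., 0.])
g = np.array([0., 0., 1., 1.5, 1., 0., 0.])

# Sum over all shifts of the full cross-correlation.
full = np.correlate(f, g, 'full')
total = full.sum()
product_of_sums = f.sum() * g.sum()
print(total, product_of_sums)

# Dividing each sequence by its own sum makes the correlation values
# sum to 1, so each individual value lies between 0 and 1.
f_n = f / f.sum()
g_n = g / g.sum()
full_n = np.correlate(f_n, g_n, 'full')
print(full_n.sum(), full_n.min(), full_n.max())
```

The bound relies on every term being non-negative; a sequence with mixed signs can produce values outside that range.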

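I suggest you do not take the absolute value of the sequences. That changes the shape, not just the scale. For example, np.abs([1, -2]) == [1, 2]. Normalization will already ensure that the sequences are mostly positive and sum to 1.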

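Second edit:

I had a realization. Think of the signals as vectors. A normalized vector always has its maximum dot product with itself. Cross-correlation is just the dot product computed at various shifts. If you normalize the signals the way you would a vector (divide s by sqrt(s dot s)), the autocorrelation will always be the maximum, and equal to 1.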
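A sketch of how this vector-style normalization might be applied to the four signals from the question follows. The helper names `normalize` and `similarity_matrix` are hypothetical. The sketch also picks the peak by the magnitude of the cross-correlation, which is an assumption made here so that the inverted pulse D is matched on shape; a signed peak would treat D as dissimilar to the others.

```python
import numpy as np

A = np.array([0., 0, 0, 1., 2., 3., 4., 3, 2, 1, 0, 0, 0])
B = np.array([0., 0, 0, 0, 0, 1, 2., 3., 4, 3, 2, 1, 0])
C = np.array([0., 0, 0, 1, 1.5, 2, 1.5, 1, 0, 0, 0, 0, 0])
D = np.array([0., 0, 0, 0, 0, -2, -4, -2, 0, 0, 0, 0, 0])

def normalize(s):
    """Scale s to unit Euclidean length: s / sqrt(s dot s)."""
    s = np.asarray(s, dtype=float)
    return s / np.sqrt(np.dot(s, s))

def similarity_matrix(signals):
    """Return (M, SH): peak similarity and the shift at which it occurs."""
    unit = [normalize(s) for s in signals]
    n = len(unit)
    M = np.zeros((n, n))
    SH = np.zeros((n, n))
    for i, w1 in enumerate(unit):
        for j, w2 in enumerate(unit):
            cc = np.correlate(w1, w2, 'full')
            # Assumption: locate the peak by magnitude, ignoring sign.
            best = np.argmax(np.abs(cc))
            M[i, j] = np.abs(cc[best])
            # Undo the offset introduced by 'full' mode padding.
            SH[i, j] = best - len(w2) + 1
    return M, SH

M, SH = similarity_matrix([A, B, C, D])
print(M)
print(SH)
```

With unit-length signals every diagonal entry of `M` is 1, and no off-diagonal entry can exceed it, so the matrix no longer needs to be divided by its maximum.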

[code block not preserved in the source]

Output:

[[ 1.          1.          0.97700842  0.86164044]
[ 1.          1.          0.97700842  0.86164044]
[ 0.97700842  0.97700842  1.          0.8819171 ]
[ 0.86164044  0.86164044  0.8819171   1.        ]]
[[ 0. -2.  1.  0.]
[ 2.  0.  3.  2.]
[-1. -3.  0. -1.]
[ 0. -2.  1.  0.]]
