Is the output of spams.lassoWeighted wrong?

Posted 2024-10-04 03:20:17


Good evening, everyone. I can't make sense of the output of the function spams.lassoWeighted. If you run the example from their page http://spams-devel.gforge.inria.fr/doc-python/html/doc_spams005.html#sec16

import time

import numpy as np
import spams

# myfloat is not defined in the snippet; double precision is assumed here
myfloat = float

np.random.seed(0)
print("test lasso weighted")
##############################################
# Decomposition of a large number of signals
##############################################
# data generation: signals X and dictionary D, columns scaled to unit l2 norm
X = np.asfortranarray(np.random.normal(size=(64,10000)))
X = np.asfortranarray(X / np.tile(np.sqrt((X*X).sum(axis=0)),(X.shape[0],1)),dtype= myfloat)
D = np.asfortranarray(np.random.normal(size=(64,256)))
D = np.asfortranarray(D / np.tile(np.sqrt((D*D).sum(axis=0)),(D.shape[0],1)),dtype= myfloat)
param = { 'L' : 20,
    'lambda1' : 0.15, 'numThreads' : 8, 'mode' : spams.PENALTY}
# one weight per (atom, signal) pair
W = np.asfortranarray(np.random.random(size = (D.shape[1],X.shape[1])),dtype= myfloat)
tic = time.time()
alpha = spams.lassoWeighted(X,D,W,**param)
tac = time.time()
t = tac - tic
print("%f signals processed per second\n" % (float(X.shape[1]) / t))

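I get a 64x1 matrix as output, and it contains only one nonzero element. The same happens in every case: each signal gets exactly one nonzero element. I don't understand why the minimizer of $\|x - D\alpha\|_2^2 + \lambda \|\mathrm{diag}(w)\alpha\|_1$ would have only a single nonzero element.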


1 answer
Answer 1 · Posted 2024-10-04 03:20:17

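The output matrix alpha must have 10000 columns, because X is 64x10000 and the dictionary D is 64x256 (we are looking for D·alpha ≈ X). So alpha should be 256x10000. According to the Inria SPAMS documentation, the output of lassoWeighted is: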

Output:
   A: double sparse p x n matrix (output coefficients)
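
As a quick sanity check of the shape argument, here is a minimal NumPy sketch. It does not call SPAMS: the all-zero `alpha` is only a placeholder with the expected shape, and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
m, p, n = 64, 256, 10000        # signal dimension, number of atoms, number of signals

D = rng.normal(size=(m, p))     # dictionary: one atom per column
alpha = np.zeros((p, n))        # placeholder coefficients: one column per signal

X_hat = D @ alpha               # reconstruction has the same shape as X
print(alpha.shape)              # (256, 10000)
print(X_hat.shape)              # (64, 10000)
```

A 64x1 result therefore cannot be the coefficient matrix for this problem; each signal needs a 256-entry column.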

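The parameter lambda1 controls the number of nonzeros, because it multiplies the l1 regularizer: the larger lambda1 is, the sparser the solution. Their implementation also has a parameter L, which is the maximum number of nonzeros allowed in each sparse vector.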
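
To illustrate how the regularization weight drives sparsity, here is a small self-contained sketch. It is not the SPAMS solver: it swaps in plain ISTA (iterative soft-thresholding) for a single signal, and it assumes the objective 0.5·||x − Dα||² + λ·Σ wᵢ|αᵢ|, where the 0.5 factor is an assumption of this sketch. The function name `weighted_lasso_ista` and all variable names are hypothetical.

```python
import numpy as np

def weighted_lasso_ista(x, D, w, lam, n_iter=500):
    """Approximately minimise 0.5*||x - D a||_2^2 + lam * sum_i w_i*|a_i| with ISTA."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2    # 1 / Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = a - step * (D.T @ (D @ a - x))    # gradient step on the quadratic term
        # weighted soft-thresholding: entry i is shrunk by step * lam * w_i
        a = np.sign(z) * np.maximum(np.abs(z) - step * lam * w, 0.0)
    return a

rng = np.random.default_rng(0)
D = rng.normal(size=(64, 256))
D /= np.linalg.norm(D, axis=0)                # unit-norm atoms
x = rng.normal(size=64)
x /= np.linalg.norm(x)                        # unit-norm signal
w = rng.uniform(0.1, 1.0, size=256)           # strictly positive weights

# At or above this value the all-zero vector is optimal for the objective above.
lam_max = np.max(np.abs(D.T @ x) / w)

nnz_small = np.count_nonzero(weighted_lasso_ista(x, D, w, 0.1 * lam_max))
nnz_large = np.count_nonzero(weighted_lasso_ista(x, D, w, 2.0 * lam_max))
print("nonzeros with small lambda:", nnz_small)
print("nonzeros with large lambda:", nnz_large)
```

With the large value every coefficient is thresholded to zero, while the small value leaves some atoms active. A cap such as L is a separate mechanism that limits the count regardless of lambda.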

So if I run the following:

import time

import numpy as np
import spams

np.random.seed(0)
print("test lasso weighted")
# signals X (64 x 10000) and dictionary D (64 x 256), columns scaled to unit l2 norm
X = np.asfortranarray(np.random.normal(size=(64,10000)))
X = np.asfortranarray(X / np.tile(np.sqrt((X*X).sum(axis=0)),(X.shape[0],1)),dtype=float)
D = np.asfortranarray(np.random.normal(size=(64,256)))
D = np.asfortranarray(D / np.tile(np.sqrt((D*D).sum(axis=0)),(D.shape[0],1)),dtype=float)
param = { 'L' : 20,
    'lambda1' : 0.15, 'numThreads' : 8, 'mode' : spams.PENALTY}
# one weight per (atom, signal) pair
W = np.asfortranarray(np.random.random(size = (D.shape[1],X.shape[1])),dtype=float)
tic = time.time()
alpha = spams.lassoWeighted(X,D,W,**param)
tac = time.time()
t = tac - tic
# count the nonzeros in each column of alpha (one column per signal)
non_zero = []
for col in alpha.T:
    non_zero.append(col.nnz)
print('Shape Output Matrix:', alpha.shape)
print('Min non-zeros of %d columns: %d' % (alpha.shape[1], np.min(non_zero)))
print('Max non-zeros of %d columns: %d' % (alpha.shape[1], np.max(non_zero)))
print("%f signals processed per second\n" % (float(X.shape[1]) / t))

I get:

test lasso weighted
Shape Output Matrix: (256, 10000)
Min non-zeros of 10000 columns: 20
Max non-zeros of 10000 columns: 20
7691.130169 signals processed per second

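So there are 10000 sparse approximations, each of which is really a 256x1 vector with 20 nonzeros, that is, every column reaches the cap L = 20.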
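
The per-column count done by the loop in the script can also be written as a single vectorized call. The sketch below uses a small dense toy matrix (`alpha_toy` is made up for illustration, not SPAMS output) and NumPy only. For a scipy.sparse result, `getnnz(axis=0)` should give the same per-column counts, though that is not exercised here.

```python
import numpy as np

# Toy coefficient matrix: 4 atoms (rows) x 3 signals (columns).
alpha_toy = np.array([[0.0,  1.5, 0.0],
                      [2.0,  0.0, 0.0],
                      [0.0, -0.3, 0.0],
                      [0.7,  0.4, 0.0]])

# Number of nonzero coefficients used by each signal (one entry per column).
nnz_per_column = np.count_nonzero(alpha_toy, axis=0)

print(alpha_toy.shape)                             # (4, 3)
print(nnz_per_column.min(), nnz_per_column.max())  # 0 3
```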

If we change the parameters to allow at most 5 nonzeros:

param = { 'L' : 5,
    'lambda1' : 0.15, 'numThreads' : 8, 'mode' : spams.PENALTY}

Output:

test lasso weighted
Shape Output Matrix: (256, 10000)
Min non-zeros of 10000 columns: 5
Max non-zeros of 10000 columns: 5
26600.540090 signals processed per second

If you want denser approximations (more nonzeros in each column of alpha), you can increase L or drop it entirely:

param = { 'lambda1' : 0.15, 'numThreads' : 8, 'mode' : spams.PENALTY}

Output:

test lasso weighted
Shape Output Matrix: (256, 10000)
Min non-zeros of 10000 columns: 40
Max non-zeros of 10000 columns: 61
1697.975321 signals processed per second
