<p>You can build a <code>csr_matrix</code> directly, using the form <code>csr_matrix((data, (row_ind, col_ind)))</code>. Here is a snippet showing how to do that.</p>
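<pre><code>import scipy.sparse as sp

# Each key is a row index; each list holds the column indices set in that row.
d = {0: [0,1], 1: [1,2,3],
     2: [3,4,5], 3: [4,5,6],
     4: [5,6,7], 5: [7],
     6: [7,8,9]}
# Repeat each key once per item in its list so rows line up with columns.
row_ind = [k for k, v in d.items() for _ in range(len(v))]
# Flatten all lists into one sequence of column indices.
col_ind = [i for ids in d.values() for i in ids]
X = sp.csr_matrix(([1]*len(row_ind), (row_ind, col_ind))) # sparse csr matrix
</code></pre>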
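<p>You can later use the matrix <code>X</code> to find the co-occurrence matrix, i.e. <code>X.T * X</code> (credit: github@daniel acuna). I suspect there is a faster way to convert the dictionary of lists into <code>row_ind</code> and <code>col_ind</code>.</p>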
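<p>As one possible take on the "faster way" mentioned above, here is a minimal sketch that builds the index arrays with NumPy instead of list comprehensions. It assumes the dictionary is non-empty and that keys and list items are non-negative integers. The use of <code>np.repeat</code> and <code>np.concatenate</code> is my own suggestion, not part of the original answer, and I have not benchmarked it; any speedup will depend on the size of the data.</p>
<pre><code>import numpy as np
import scipy.sparse as sp

d = {0: [0, 1], 1: [1, 2, 3],
     2: [3, 4, 5], 3: [4, 5, 6],
     4: [5, 6, 7], 5: [7],
     6: [7, 8, 9]}

# Row keys and the length of each list, in matching dictionary order.
keys = np.fromiter(d.keys(), dtype=np.int64, count=len(d))
lengths = np.fromiter((len(v) for v in d.values()), dtype=np.int64, count=len(d))

# Repeat each key once per element of its list.
row_ind = np.repeat(keys, lengths)
# Flatten all lists into a single array of column indices.
col_ind = np.concatenate([np.asarray(v, dtype=np.int64) for v in d.values()])

data = np.ones(len(row_ind), dtype=np.int64)
X = sp.csr_matrix((data, (row_ind, col_ind)))

# Co-occurrence matrix: entry (i, j) counts the rows containing both i and j.
C = X.T @ X
</code></pre>
<p>Here <code>C[i, j]</code> is the number of dictionary keys whose list contains both <code>i</code> and <code>j</code>, and the diagonal holds how often each item appears overall.</p>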