How does this vector work?

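import numpy as np

# NOTE: `run` is not defined in this snippet. It is assumed to be a helper
# that executes a shell command and returns its output as a string.

# Load the vocabulary: one column per vocabulary word.
f = open('/Users/nk/Vocab.txt', 'r')
vocab_temp = f.read().split()
f.close()
col = len(vocab_temp)
print("Training column size:")
print(col)

# Count the lines in the training file: one row per line.
row = run('cat ' + '/Users/nk/X_train.txt' + " | wc -l").split()[0]
print("Training row size:")
print(row)

matrix_tmp = np.zeros((int(row), col), dtype=np.int64)
print("Train Matrix size:")
print(matrix_tmp.size)
label_tmp = np.zeros((int(row)), dtype=np.int64)

f = open('/Users/nk/X_train.txt', 'r')
count = 0
for line in f:
    line_tmp = line.split()
    #print(line_tmp)
    for word in line_tmp[0:]:
        if word not in vocab_temp:
            continue
        # Mark the column of this vocabulary word in the current row.
        matrix_tmp[count][vocab_temp.index(word)] = 1
    count = count + 1
f.close()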

1 Answer

User

#1 · Posted on 2024-10-05 12:22:47

matrix_tmp[count][vocab_temp.index(word)] = 1: if you look at the code, count is incremented by 1 for each line of the training file. So matrix_tmp[count] is the word vector for that line.

Now consider vocab_temp.index(word). As you can see on the second line of the code, vocab_temp holds the list of vocabulary words produced by f.read().split().

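So vocab_temp.index(word) returns the position of word in the vocabulary, which is also the column of the matrix that corresponds to word. Setting that entry to 1 records that word occurs in the current line. Each row therefore ends up as a binary vector with a 1 in the column of every vocabulary word that appears in that line, and 0 everywhere else.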
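
To make the indexing concrete, here is a minimal sketch of the same idea on a tiny in-memory example. The vocabulary and the two "lines" are made up for illustration and are not from the question's files. It also swaps the repeated vocab_temp.index(word) lookup for a dictionary built once, which is an optional change and not what the original code does; setdefault keeps the first position of a word, matching what list.index would return.

```python
import numpy as np

# Hypothetical stand-ins for Vocab.txt and X_train.txt.
vocab = ["cat", "dog", "fish"]
lines = ["dog cat dog", "fish bird"]

# Map each word to its column once (first occurrence wins, like list.index).
col_of = {}
for i, word in enumerate(vocab):
    col_of.setdefault(word, i)

# One row per line, one column per vocabulary word.
matrix = np.zeros((len(lines), len(vocab)), dtype=np.int64)
for row, line in enumerate(lines):
    for word in line.split():
        if word in col_of:  # words outside the vocabulary are skipped
            matrix[row, col_of[word]] = 1

print(matrix)
```

The first row gets a 1 in the "cat" and "dog" columns (the repeated "dog" still yields 1, not 2), and the second row gets a 1 only in the "fish" column because "bird" is not in the vocabulary.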
