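I have two variables, label and entity. The entity variable stores a list of words, and each element of that list is itself a list, so it is a list of lists. The inner lists are actually bigram features, so I need to keep them.
I am trying to train a classifier with these two variables. My code so far is: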
from sklearn import svm
from sklearn.feature_extraction.text import TfidfVectorizer
entity = [[['Prabowo Subianto']], [['Muhtar Ependi']], [['Nina Zatulini']], [['Partai Gerindra']], [['Persiba']], [['Partai Kebangkitan Bangsa (PKB)'], ['Partai Kebangkitan'], ['Kebangkitan Bangsa'], ['Bangsa ('], ['( PKB'], ['PKB )']], [['Sman 3 Kabupaten Tangerang'], ['Sman 3'], ['3 Kabupaten'], ['Kabupaten Tangerang']], [['Bandara Changi Singapura'], ['Bandara Changi'], ['Changi Singapura']], [['Warung Kopi Kita'], ['Warung Kopi'], ['Kopi Kita']]]
label = ['PERSON', 'PERSON', 'PERSON', 'ORGANIZATION', 'ORGANIZATION', 'ORGANIZATION', 'LOCATION', 'LOCATION', 'LOCATION']
vectorizer = TfidfVectorizer(min_df=1)
train_vector_entity = vectorizer.fit_transform(entity)
train_vector_label = label
classifier = svm.SVC()
classifier_word = classifier.fit(train_vector_entity,train_vector_label)
Running this raises an error.
What is the best way to train the classifier? Thanks.
Just change this line:
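One way to make the vectorizer accept this nested input, not necessarily the change the answer had in mind, is to pass a callable as `analyzer` so that `TfidfVectorizer` skips its own string preprocessing and treats each n-gram string as a ready-made feature. The sketch below uses a shortened copy of the question's data, and `flatten_features` is a hypothetical helper name introduced only for illustration.

```python
from sklearn import svm
from sklearn.feature_extraction.text import TfidfVectorizer

# Shortened copy of the question's data: one entry per entity,
# each entry a list of one-element lists holding an n-gram string.
entity = [
    [['Prabowo Subianto']],
    [['Nina Zatulini']],
    [['Partai Gerindra']],
    [['Persiba']],
    [['Bandara Changi Singapura'], ['Bandara Changi'], ['Changi Singapura']],
    [['Warung Kopi Kita'], ['Warung Kopi'], ['Kopi Kita']],
]
label = ['PERSON', 'PERSON', 'ORGANIZATION', 'ORGANIZATION', 'LOCATION', 'LOCATION']


def flatten_features(doc):
    """Hypothetical helper: turn [['a b'], ['a']] into ['a b', 'a']."""
    return [feature for group in doc for feature in group]


# With a callable analyzer the vectorizer does no lowercasing or
# tokenizing of its own, so every n-gram string stays a single feature.
vectorizer = TfidfVectorizer(analyzer=flatten_features, min_df=1)
train_vector_entity = vectorizer.fit_transform(entity)

classifier = svm.SVC()
classifier.fit(train_vector_entity, label)

# New entities must be wrapped in the same nested shape before transform.
prediction = classifier.predict(vectorizer.transform([[['Persiba']]]))
```

Because the analyzer bypasses the default preprocessing, features are case-sensitive and matched as whole strings. If lowercasing or word-level tokens are wanted instead, an alternative is to join each entity's strings into one plain string and use the default analyzer with `ngram_range=(1, 2)`.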