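I am a beginner with spaCy. Recently I have been using spaCy to build an entity recognition model on a small dataset. I made a CSV file containing information about Canadian cities, such as country, city, province, postal address, and so on, and used the free NER labeling service at https://dataturks.com to annotate my rows. They provide a convertDataturkSpacy() method that produces spaCy-compatible JSON. Everything went well up to this point, but now I am getting this error: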
TypeError: 'NoneType' object is not iterable
Here is my snippet:
import json
import logging
import spacy
import random
from spacy.util import minibatch, compounding

trainingfilename = "C:/Users/codemen/Desktop/Timeseries Analytics/Canadianinfo.json"
logging.basicConfig(level=logging.INFO)

def ConvertDataturkToSpacy(trainingfilename):
    try:
        trainingData = []
        lines = []
        # read the file: one JSON document per line
        with open(trainingfilename, 'r') as f:
            lines = f.readlines()
        for line in lines:
            data = json.loads(line)
            text = data['content']
            entities = []
            print('entities', entities)
            for annotation in data['annotation']:
                #print("Here is the thing")
                point = annotation['points'][0]  # single point per annotation
                #print(point)
                labels = annotation['label']
                print("isinstance", labels)
                if not isinstance(labels, list):  # handle both a list of labels and a single label
                    labels = [labels]
                    print(labels)
                for label in labels:
                    # Dataturks end indices are inclusive but spaCy's are exclusive, hence the +1
                    #print("Test here")
                    entities.append((point['start'], point['end'] + 1, label))
            trainingData.append((text, {"entities": entities}))
        return trainingData
    except Exception as e:
        logging.exception("Unable to process item " + trainingfilename + "\n" + "error = " + str(e))
        return None

TrainingData = ConvertDataturkToSpacy(trainingfilename)
What I have found so far is that the empty list entities = [] that I initialize is not being appended to or updated during the iteration.
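One possible explanation, offered as an assumption rather than a confirmed diagnosis: if a row in the exported file has "annotation" set to null (for example an unlabeled row), then iterating over data['annotation'] raises exactly this TypeError, the except branch returns None, and entities never gets filled. The sketch below shows a variant of the conversion loop that treats a missing or null annotation as an empty list. The function name convert_dataturks_lines and the sample rows are hypothetical and only illustrate the idea; the real file may have a different cause.

```python
import json
import logging


def convert_dataturks_lines(lines):
    """Convert Dataturks-style JSON lines to (text, {"entities": [...]}) tuples."""
    training_data = []
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines instead of failing in json.loads
        data = json.loads(line)
        text = data["content"]
        entities = []
        # Assumption: "annotation" may be missing or null for unlabeled rows,
        # so fall back to an empty list instead of iterating over None.
        for annotation in data.get("annotation") or []:
            point = annotation["points"][0]  # single point per annotation
            labels = annotation["label"]
            if not isinstance(labels, list):
                labels = [labels]
            for label in labels:
                # inclusive end index -> exclusive end index
                entities.append((point["start"], point["end"] + 1, label))
        if not entities:
            logging.info("line %d has no annotations", line_number)
        training_data.append((text, {"entities": entities}))
    return training_data


# Hypothetical sample rows: one labeled, one with a null annotation.
sample_lines = [
    '{"content": "Toronto, Ontario", "annotation": '
    '[{"label": ["City"], "points": [{"start": 0, "end": 6, "text": "Toronto"}]}]}',
    '{"content": "no labels here", "annotation": null}',
]
result = convert_dataturks_lines(sample_lines)
```

Letting a real parsing error propagate, instead of catching every Exception and returning None, also makes the failing line visible in the traceback rather than surfacing later as a NoneType error.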