I have three lists containing the following data:
Entities: ['Ashraf', 'Afghanistan', 'Afghanistan', 'Kabul']
Relations: ['Born', 'President', 'employee', 'Capital', 'Located', 'Lecturer', 'University']
sentence_list: ['Ashraf', 'Born', 'in', 'Kabul', '.', 'Ashraf', 'is', 'the', 'president', 'of', 'Afghanistan', '.', ...]
Since sentence_list is a list of sentences, I want to check, for each sentence, whether it contains any word from Entities or Relations. The combination of matching words should then be added to another list. For example, the first sentence should give (Ashraf, Born, Kabul).
What I have tried:
First incomplete solution:
import json

# read file
with open('../data/parse.txt', 'r') as myfile:
    json_data = json.load(myfile)

for i in range(len(json_data)):  # the dataset was in json format
    if json_data[i]['word'] in relation(json_data)[0]:  # I extract the relations
        print(json_data[i]['word'])
    if json_data[i]['word'] in entities(json_data)[0]:
        print(json_data[i]['word'])
Output: (Ashraf, Born, Ashraf), but I want (Ashraf, Born, Kabul).
Next incomplete solution: I stored the words from json_data in a list and then did the following:
json_data2 = []
for i in range(len(json_data)):
    json_data2.append(json_data[i]['word'])
print(json_data2)
'''
Now I tried if I can find any element of `Entities` list and `Relations` list
in each sentence of `sentence_list`. And then it should store matched
entities and relations based on sentence to a list. '''
import re

for line in json_data2:
    for rel in relation(obj):
        for ent in entities(obj):
            match = re.findall(rel, line['word'])
            if match:
                print('word matched relations: %s ==> word: %s' % (rel, line['address']))
            match2 = re.findall(ent, line['word'])
            if match2:
                print('word matched entities: %s ==> word: %s' % (ent, line['address']))
Unfortunately, this did not work.
You can use the following list comprehension:
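A minimal sketch of such a comprehension, not necessarily the code originally posted here. It assumes that each sentence in sentence_list ends at a '.' token and that matching should ignore case (so 'president' matches 'President'). The helper split_sentences and the lower-case variable names are hypothetical, introduced only for illustration.

```python
def split_sentences(tokens):
    """Group a flat token list into sentences, closing each one at a '.' token."""
    sentences, current = [], []
    for tok in tokens:
        if tok == '.':
            if current:
                sentences.append(current)
            current = []
        else:
            current.append(tok)
    if current:  # keep a trailing sentence that has no final '.'
        sentences.append(current)
    return sentences


entities = ['Ashraf', 'Afghanistan', 'Afghanistan', 'Kabul']
relations = ['Born', 'President', 'employee', 'Capital',
             'Located', 'Lecturer', 'University']
sentence_list = ['Ashraf', 'Born', 'in', 'Kabul', '.',
                 'Ashraf', 'is', 'the', 'president', 'of', 'Afghanistan', '.']

# Lower-cased lookup set (assumption: matching is case-insensitive).
keywords = {word.lower() for word in entities + relations}

# One set per sentence, holding every token that is an entity or a relation.
matches = [
    {tok for tok in sentence if tok.lower() in keywords}
    for sentence in split_sentences(sentence_list)
]

print(matches)
# One set per sentence (element order inside a set is arbitrary):
# {'Ashraf', 'Born', 'Kabul'} and {'Ashraf', 'president', 'Afghanistan'}
```

Building the lookup set once keeps each membership test cheap, and lower-casing both sides is done with the str.lower string method.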
Output:
Note that I am returning a list of sets to avoid duplicate values, such as Afghanistan, which appears twice in Entities.
Useful reading:
List comprehensions
sets — Unordered collections of unique elements
string methods