Looping over columns to compute WMD (Word Mover's Distance) similarity

df1:

,source_label,source_uri
'neuronal ceroid lipofuscinosis 8',"http://purl.obolibrary.org/obo/DOID_0110723"
'autosomal dominant distal hereditary motor neuronopathy',"http://purl.obolibrary.org/obo/DOID_0111198"

Expected result:

,source_label, target_label, source_uri, target_uri, wmd score
'neuronal ceroid lipofuscinosis 8', 'neuronal ceroid ', "http://purl.obolibrary.org/obo/DOID_0110723", "http://purl.obolibrary.org/obo/DOID_0110748", 0.98
'autosomal dominant distal hereditary motor neuronopathy', 'autosomal dominanthereditary', "http://purl.obolibrary.org/obo/DOID_0111198", "http://purl.obolibrary.org/obo/DOID_0111110", 0.65

list_distances = []
temp = []


def preprocess(sentence):
    return [w for w in sentence.lower().split()]


entity = df1['source_label']
target = df2['target_label']

for i in tqdm(entity):
    for j in target:
        wmd_distance = model.wmdistance(preprocess(i), preprocess(j))
        temp.append(wmd_distance)
    list_distances.append(min(temp))
# print("list_distances", list_distances)

WMD_Dataframe = pd.DataFrame({'source_label': pd.Series(entity),
                              'target_label': pd.Series(target),
                              'source_uri': df1['source_uri'],
                              'target_uri': df2['target_uri'],
                              'wmd_Score': pd.Series(list_distances)}).sort_values(by=['wmd_Score'])
WMD_Dataframe = WMD_Dataframe.reset_index()

1 answer

User

#1 · Posted 2024-05-18 15:20:11

Quick fix:

import numpy as np
from tqdm import tqdm

closest_neighbour_index_df2 = []


def preprocess(sentence):
    return sentence.lower().split()


entity = df1['source_label']
target = df2['target_label']

for i in tqdm(entity):
    # reset temp for every source label; otherwise min/argmin also sees
    # the distances computed for earlier rows
    temp = []
    for j in target:
        wmd_distance = model.wmdistance(preprocess(i), preprocess(j))
        temp.append(wmd_distance)
    # maybe add an assert here to make sure the result is always valid
    # np.argmin returns the position of the smallest distance, not its value
    closest_neighbour_index_df2.append(int(np.argmin(temp)))

# Add the positions of the closest df2 rows to df1
df1['closest_neighbour'] = closest_neighbour_index_df2
# then add the information from the matching df2 row using the closest_neighbour column
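The last step above (pulling the matching df2 row into df1) is left as a comment, so here is a minimal sketch of one way it might be done. It assumes the stored indices are positions into df2, so `iloc` is used for the lookup. To keep the sketch runnable without a trained embedding model, `toy_distance` is a hypothetical stand-in for `model.wmdistance` (a simple token-overlap distance), and the two-row df2 is made up from the labels and URIs shown in the question. The resulting scores are therefore not real WMD values.

```python
import numpy as np
import pandas as pd


def preprocess(sentence):
    return sentence.lower().split()


def toy_distance(tokens_a, tokens_b):
    # Hypothetical stand-in for model.wmdistance so the sketch runs without
    # an embedding model: 1 - Jaccard overlap of the two token sets.
    a, b = set(tokens_a), set(tokens_b)
    return 1.0 - len(a & b) / len(a | b)


df1 = pd.DataFrame({
    "source_label": ["neuronal ceroid lipofuscinosis 8",
                     "autosomal dominant distal hereditary motor neuronopathy"],
    "source_uri": ["http://purl.obolibrary.org/obo/DOID_0110723",
                   "http://purl.obolibrary.org/obo/DOID_0111198"],
})
# Assumed contents of df2 (the question does not show it).
df2 = pd.DataFrame({
    "target_label": ["autosomal dominanthereditary", "neuronal ceroid "],
    "target_uri": ["http://purl.obolibrary.org/obo/DOID_0111110",
                   "http://purl.obolibrary.org/obo/DOID_0110748"],
})

closest_idx, closest_dist = [], []
for src in df1["source_label"]:
    dists = [toy_distance(preprocess(src), preprocess(tgt))
             for tgt in df2["target_label"]]
    best = int(np.argmin(dists))      # position of the closest df2 row
    closest_idx.append(best)
    closest_dist.append(dists[best])  # keep the distance itself as the score

# Positional lookup into df2, realigned to df1's row order before joining.
matched = df2.iloc[closest_idx].reset_index(drop=True)
result = pd.concat([df1.reset_index(drop=True), matched], axis=1)
result["wmd_score"] = closest_dist
result = result.sort_values("wmd_score").reset_index(drop=True)
print(result)
```

Storing the minimum distance alongside its position avoids a second pass over df2, and resetting the index on `matched` keeps `pd.concat` from aligning rows on df2's original index instead of df1's row order.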

