Correlation and co-occurrence plots for string columns

Posted 2024-10-02 20:41:58


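I have a CSV file with many columns, all of them of object dtype. I want to plot the co-occurrence and correlation between two of these object columns. The two columns are shown in the DataFrame below.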

I tried following several approaches but could not solve the problem. One of my attempts is this:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

# Build the sample DataFrame; the later code refers to it as `dataframe`.
dataframe = pd.DataFrame({
    'intents': ['BookRestaurant',
                'SearchScreeningEvent',
                'movieSearch',
                'play_music',
                'BookRestaurant'],
    'entities': [['restaurant_name', 'spatial_relation', 'poi'],
                 ['artist', 'screening_location'],
                 ['movie_name', 'location_name', 'actor_name'],
                 ['artist', 'music_item', 'playlist', 'playlist_owner'],
                 ['Indian_restaurant', 'spicy_food', 'Hamburg']],
})


   intents               entities
0  BookRestaurant        [restaurant_name, spatial_relation, poi]
1  SearchScreeningEvent  [artist, screening_location]
2  movieSearch           [movie_name, location_name, actor_name]
3  play_music            [artist, music_item, playlist, playlist_owner]
4  BookRestaurant        [Indian_restaurant, spicy_food, Hamburg]

dataframe['entities']= dataframe['entities'].apply(', '.join)
dummy = pd.get_dummies(dataframe['intents'])

Dict = {}
Index = list(set(dataframe["intents"]))
print(Index)
for i, e in enumerate(dataframe["entities"]):
    one_hot = list(dummy[dataframe["intents"][i]])
    print(one_hot)
    if e not in Dict.keys():
        Dict[e] = one_hot
    else:
        Dict[e] = Dict[e] + one_hot
df = pd.DataFrame(Dict).T
fig, ax = plt.subplots()
sns.heatmap(df)
plt.show()

This is obviously not the right approach, because intents and entities end up independent of each other in the plot. Thanks in advance.
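For reference, here is a minimal sketch of one possible way to get a co-occurrence table for these two columns. It assumes that "co-occurrence" means counting how often each intent appears in the same row as each entity, and it uses `DataFrame.explode` plus `pd.crosstab`. The names `pairs` and `cooc` are illustrative only.

```python
import pandas as pd

# Same sample data as in the question.
dataframe = pd.DataFrame({
    'intents': ['BookRestaurant',
                'SearchScreeningEvent',
                'movieSearch',
                'play_music',
                'BookRestaurant'],
    'entities': [['restaurant_name', 'spatial_relation', 'poi'],
                 ['artist', 'screening_location'],
                 ['movie_name', 'location_name', 'actor_name'],
                 ['artist', 'music_item', 'playlist', 'playlist_owner'],
                 ['Indian_restaurant', 'spicy_food', 'Hamburg']],
})

# Expand each list so that every (intent, entity) pair gets its own row.
pairs = dataframe.explode('entities')

# Count how often each intent occurs together with each entity
# (rows: intents, columns: entities).
cooc = pd.crosstab(pairs['intents'], pairs['entities'])
print(cooc)
```

Because `cooc` is an ordinary numeric DataFrame, it could be passed to `sns.heatmap(cooc, annot=True)` to draw the plot. Note that this only counts co-occurrences; whether a correlation measure is meaningful for two categorical columns depends on the data, and this sketch does not address it.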

