Creating subplots for bigrams and trigrams with Seaborn in Python

from sklearn.feature_extraction.text import TfidfVectorizer

# TF-IDF on unigrams, bigrams and trigrams
# min_df=1 (the default) keeps every term; recent scikit-learn releases reject an integer min_df of 0
tfidf_words = TfidfVectorizer(sublinear_tf=True, min_df=1, norm='l2', encoding='latin-1', ngram_range=(1, 1), stop_words='english')
# vectorizer for bigrams
tfidf_bigrams = TfidfVectorizer(sublinear_tf=True, min_df=1, norm='l2', encoding='latin-1', ngram_range=(2, 2), stop_words='english')
# vectorizer for trigrams
tfidf_trigrams = TfidfVectorizer(sublinear_tf=True, min_df=1, norm='l2', encoding='latin-1', ngram_range=(3, 3), stop_words='english')

# Fit each vectorizer and transform the training text (unigrams, bigrams and trigrams)
x_train_words = tfidf_words.fit_transform(x_train_sm.preprocessed).toarray()
# bigrams
x_train_bigrams = tfidf_bigrams.fit_transform(x_train_sm.preprocessed).toarray()
# trigrams
x_train_trigrams = tfidf_trigrams.fit_transform(x_train_sm.preprocessed).toarray()

import pandas as pd
from sklearn.model_selection import cross_val_score

# collect one (model name, fold index, accuracy) row per model and CV fold
entries = []

# models and names are parallel lists: names[i] is the label for models[i]
for model_name, model in zip(names, models):
    # model            => the model that will be used to fit the data
    # x_train_trigrams => data to be fitted by the selected model (trigrams)
    # y_train_sm       => y training data after oversampling (event_id)
    # scoring          => the type of score cross_val_score should return
    # cv               => number of cross-validation folds
    accuracies = cross_val_score(model, x_train_trigrams, y_train_sm, scoring='accuracy', cv=CV)
    for fold_idx, accuracy in enumerate(accuracies):
        entries.append((model_name, fold_idx, accuracy))

# build the dataframe once, after every model has been scored
cv_trigrams = pd.DataFrame(entries, columns=['model_name_trigrams', 'fold_idx', 'accuracy'])

import seaborn as sns

# plot the results of each model as a box plot (bigrams)
box_bigrams = sns.boxplot(x='model_name_bigrams', y='accuracy', data=cv_bigrams)
fig_bigrams = box_bigrams.get_figure()
fig_bigrams.savefig('boxplot_bigrams.png')

# plot the results of each model as a box plot (trigrams)
box_trigrams = sns.boxplot(x='model_name_trigrams', y='accuracy', data=cv_trigrams)
fig_trigrams = box_trigrams.get_figure()
fig_trigrams.savefig('boxplot_trigrams.png')

1 Answer

User

#1 · Posted 2024-10-02 12:25:07

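Echoing @ImportanceOfBeingErnest's comment: your code is too complicated and your question is not clear enough. Do you want to create 3 separate figures, one for each case (unigrams, bigrams and trigrams)? Are you trying to use a single figure with 3 axes (called subplots in matplotlib)? Or do you want the three box plots side by side on one plot?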

To me, the simplest approach is to create one figure containing 3 subplots, like this:

fig, (ax1, ax2, ax3) = plt.subplots(3,1, figsize=(xx,yy))  # choose appropriate size to fit your needs
sns.boxplot(x='model_name_unigrams', y='accuracy', data=cv_unigrams, ax=ax1)
sns.boxplot(x='model_name_bigrams', y='accuracy', data=cv_bigrams, ax=ax2)
sns.boxplot(x='model_name_trigrams', y='accuracy', data=cv_trigrams, ax=ax3)
fig.savefig('your_figure_name_here.png')
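
If you want to try this without the cross-validation step, here is a minimal runnable sketch of the same idea. The data is random placeholder data, and the model names, the helper fake_cv_results, the figure size and the output file name are all assumptions made for illustration; only the column names follow the question.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(0)

def fake_cv_results(name_column, n_folds=5):
    """Build a placeholder frame shaped like the cv_* frames in the question."""
    rows = [
        (model, fold, rng.uniform(0.6, 0.9))  # random stand-in for a fold accuracy
        for model in ("model_a", "model_b", "model_c")  # hypothetical model names
        for fold in range(n_folds)
    ]
    return pd.DataFrame(rows, columns=[name_column, "fold_idx", "accuracy"])

cv_unigrams = fake_cv_results("model_name_unigrams")
cv_bigrams = fake_cv_results("model_name_bigrams")
cv_trigrams = fake_cv_results("model_name_trigrams")

# One figure, three stacked axes; sharey makes the accuracy scales comparable.
fig, axes = plt.subplots(3, 1, figsize=(8, 12), sharey=True)
for ax, frame in zip(axes, [cv_unigrams, cv_bigrams, cv_trigrams]):
    name_column = frame.columns[0]
    sns.boxplot(x=name_column, y="accuracy", data=frame, ax=ax)
    ax.set_title(name_column)

fig.tight_layout()
fig.savefig("boxplots_ngrams.png")
```

Swapping the placeholder frames for your real cv_unigrams, cv_bigrams and cv_trigrams should be the only change needed, assuming they have the columns shown in the question.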

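See the matplotlib subplots demo and the documentation for plt.subplots(). In the documentation for sns.boxplot(), you will see that it is an "axes-level" function, which means you can ask it to draw on any Axes object you choose.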
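
A small sketch of what "axes-level" means in practice, using a tiny hand-made frame (the values and the column name model_name are placeholders). When you pass ax, the function draws there and returns that same Axes; when you leave it out, it draws on whatever matplotlib considers the current Axes. That is probably why two consecutive sns.boxplot calls with no ax and no new figure in between end up on the same plot.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Tiny placeholder frame, not real results.
df = pd.DataFrame({
    "model_name": ["a", "a", "a", "b", "b", "b"],
    "accuracy": [0.70, 0.72, 0.75, 0.80, 0.78, 0.83],
})

# Explicit target: boxplot draws on ax1 and hands ax1 back.
fig1, ax1 = plt.subplots()
returned = sns.boxplot(x="model_name", y="accuracy", data=df, ax=ax1)
same_axes = returned is ax1

# No target: boxplot falls back to the current Axes, which is ax2 here
# because fig2 was created last.
fig2, ax2 = plt.subplots()
implicit = sns.boxplot(x="model_name", y="accuracy", data=df)
drew_on_current = implicit is ax2
owned_by_fig2 = implicit.figure is fig2
```

So to keep plots apart, either pass ax explicitly as in the answer above, or create a new figure before each call.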
