Plotting order in pandas.plotting.parallel_coordinates

# ... data retrieval and preparation from a couple of Excel files ---> output = 'largeDataFrame'

theColormap: ListedColormap = cm.get_cmap('some cmap name')

# This is an attempt to stack the lines in the right order (doesn't work)
largeDataFrame.sort_values(column_for_line_color_derivation, inplace=True, ascending=True)

# here comes the actual plotting of data
sns.set_style('ticks')
sns.set_context('paper')
plt.figure(figsize=(10, 6))
thePlot: plt.Axes = parallel_coordinates(largeDataFrame, class_column=column_for_line_color_derivation, cols=[columns to plot], color=theColormap.colors)
plt.title('My Title')
thePlot.get_legend().remove()
plt.xticks(rotation=90)
plt.tight_layout()
plt.show()

1 Answer

User

Answer #1 · posted 2024-10-03 00:21:53

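I ran some tests with pandas versions 1.1.2 and 1.0.3. In both cases the lines are drawn from low to high values of the coloring column, independent of the dataframe order.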

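You can temporarily add parallel_coordinates(...., lw=5), which makes this very clear. With thin lines the order is less obvious, because the yellow lines have less contrast.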

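The parameter sort_labels= seems to have the opposite effect of what its name suggests: when False (the default), the lines are drawn in sorted order; when True, they keep the dataframe order.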
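To check the sort_labels= behaviour on your own pandas version, a side-by-side comparison along these lines may help. This is only a sketch: the names demo, rng and compare_fig, the seed, and the sample size of 50 are made up for illustration, and thick lines (lw=5) are used so the stacking is easier to see.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import parallel_coordinates

# Hypothetical demo data: five value columns plus one column used for coloring.
rng = np.random.default_rng(0)
demo = pd.DataFrame({ch: rng.standard_normal(50) for ch in 'abcde'})
demo['coloring'] = rng.standard_normal(len(demo))

# Same unsorted data in both panels; only sort_labels differs.
compare_fig, compare_axes = plt.subplots(ncols=2, figsize=(14, 6))
for ax, flag in zip(compare_axes, [False, True]):
    parallel_coordinates(demo, class_column='coloring', cols=list('abcde'),
                         colormap='viridis', sort_labels=flag, ax=ax, lw=5)
    ax.set_title(f'sort_labels={flag}')
    ax.get_legend().remove()
```

Call plt.show() afterwards to display the two panels and compare which colors end up on top.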

Here is a small reproducible example:

import numpy as np
import pandas as pd
from pandas.plotting import parallel_coordinates
import matplotlib.pyplot as plt

df = pd.DataFrame({ch: np.random.randn(100) for ch in 'abcde'})
df['coloring'] = np.random.randn(len(df))

fig, axes = plt.subplots(ncols=2, figsize=(14, 6))
for ax, lw in zip(axes, [1, 5]):
    parallel_coordinates(df, class_column='coloring', cols=df.columns[:-1], colormap='viridis', ax=ax, lw=lw)
    ax.set_title(f'linewidth={lw}')
    ax.get_legend().remove()
plt.show()

One idea is to vary the line width depending on the class:

fig, ax = plt.subplots(figsize=(8, 6))

parallel_coordinates(df, class_column='coloring', cols=df.columns[:-1], colormap='viridis', ax=ax)
num_lines = len(ax.lines)
for ind, line in enumerate(ax.lines):
    xs = line.get_xdata()
    if xs[0] != xs[-1]:  # skip the vertical lines representing axes
        line.set_linewidth(1 + 3 * ind / num_lines)
ax.set_title('linewidth depending on class_column')
ax.get_legend().remove()
plt.show()
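If the goal is to decide which lines end up on top, one option that follows from the sort_labels= observation is to sort the dataframe yourself and pass sort_labels=True. The sketch below assumes that the row order is then also the drawing order; it checks that assumption by comparing the first plotted line with the first sorted row. The names sample, ordered and ax_sorted, the seed, and the sample size are illustrative only.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import parallel_coordinates

# Hypothetical data, same layout as the example above.
rng = np.random.default_rng(1)
sample = pd.DataFrame({ch: rng.standard_normal(30) for ch in 'abcde'})
sample['coloring'] = rng.standard_normal(len(sample))

# Sort descending: if row order is kept, the lowest class value is drawn last (on top).
ordered = sample.sort_values('coloring', ascending=False)

fig_sorted, ax_sorted = plt.subplots(figsize=(8, 6))
parallel_coordinates(ordered, class_column='coloring', cols=list('abcde'),
                     colormap='viridis', sort_labels=True, ax=ax_sorted, lw=3)
ax_sorted.get_legend().remove()

# Data lines span several x positions; the vertical axis markers do not.
data_lines = [ln for ln in ax_sorted.lines
              if ln.get_xdata()[0] != ln.get_xdata()[-1]]
first_row_drawn_first = bool(np.allclose(
    data_lines[0].get_ydata(), ordered.iloc[0][list('abcde')].to_numpy(dtype=float)))
print('first sorted row drawn first:', first_row_drawn_first)
```

Switch to ascending=True to put the highest values on top instead, and call plt.show() to display the result.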
