Reshaping column data into a new format with Python in Pandas

CarModel  ProductionData  ProductionYear
BMWX1     55000           2005
Accord    100000          2005
BMWX1     34000           2006
Accord    110000          2006
BMWX1     43000           2007
Accord    105000          2007

2 answers

User

Answer 1 · edited 2024-10-02 08:29:56

Setup

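from io import StringIO

import pandas as pd

text = """CarModel ProductionData ProductionYear
BMWX1    55000          2005
Accord   100000         2005
BMWX1    34000          2006
Accord   110000         2006
BMWX1    43000          2007
Accord   105000         2007"""

# sep=r'\s+' splits on runs of whitespace (delim_whitespace is deprecated in recent pandas)
df = pd.read_csv(StringIO(text), sep=r'\s+')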

Solution

gb = df.set_index('CarModel').groupby(level=0)

def proc_df(group):
    # Add this column because the OP has it in the final output.
    # assign returns a new frame, so the group passed in is not mutated.
    group = group.assign(Year2=group.ProductionYear + 1)

    columns = ['ProductionYear', 'Year2', 'ProductionData']

    # Return a 2-D ndarray; apply collects one array per group into an object Series
    return group[columns].values

gb.apply(proc_df)

The result looks like:

CarModel
Accord    [[2005, 2006, 100000], [2006, 2007, 110000], [...
BMWX1     [[2005, 2006, 55000], [2006, 2007, 34000], [20...
dtype: object
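
The printed Series cuts each array off at `[...`, so the last row of every group is not visible. As a minimal sketch, assuming the same sample data, one way to see a single group's complete array is to pull that group out with `get_group` and build the three columns directly. The names `accord` and `full` are illustrative and do not come from the answer above.

```python
from io import StringIO

import pandas as pd

text = """CarModel ProductionData ProductionYear
BMWX1    55000          2005
Accord   100000         2005
BMWX1    34000          2006
Accord   110000         2006
BMWX1    43000          2007
Accord   105000         2007"""

df = pd.read_csv(StringIO(text), sep=r"\s+")

# Same grouping as in the answer: CarModel moved into the index
gb = df.set_index("CarModel").groupby(level=0)

# Pull out a single group and add the Year2 column to it
accord = gb.get_group("Accord")
columns = ["ProductionYear", "Year2", "ProductionData"]
full = accord.assign(Year2=accord["ProductionYear"] + 1)[columns].to_numpy()

# Printing the array on its own shows every row without truncation
print(full)
```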

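User

Answer 2 · edited 2024-10-02 08:29:56

The following produces the output you describe. Group on CarModel (either as a column or moved into the index), then return the relevant columns as .values.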

df['Year2'] = df.ProductionYear.add(1)
df.groupby('CarModel').apply(lambda x: x.loc[:, ['ProductionYear', 'Year2', 'ProductionData']].values)

CarModel
Accord    [[2005, 2006, 100000], [2006, 2007, 110000], [...
BMWX1     [[2005, 2006, 55000], [2006, 2007, 34000], [20...
dtype: object
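
If plain Python lists are more convenient than a Series of arrays, the same grouping can be written without `apply`. This is a sketch under the assumption that a dict keyed by car model is an acceptable target format; the name `as_lists` is hypothetical and not part of either answer.

```python
from io import StringIO

import pandas as pd

text = """CarModel ProductionData ProductionYear
BMWX1    55000          2005
Accord   100000         2005
BMWX1    34000          2006
Accord   110000         2006
BMWX1    43000          2007
Accord   105000         2007"""

df = pd.read_csv(StringIO(text), sep=r"\s+")

# Year2 is simply the production year plus one, as in the answers
df["Year2"] = df["ProductionYear"] + 1
columns = ["ProductionYear", "Year2", "ProductionData"]

# Iterate over the groups explicitly and convert each block of rows to nested lists
as_lists = {
    model: group[columns].values.tolist()
    for model, group in df.groupby("CarModel")
}

print(as_lists["BMWX1"])
# [[2005, 2006, 55000], [2006, 2007, 34000], [2007, 2008, 43000]]
```

Iterating over the groups keeps the original row order within each model and avoids relying on how `apply` wraps array return values.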
