Pandas Series name not showing up as part of the DataFrame

import pandas as pd
from io import StringIO

tibble3_csv = """country,year,cases,population
Afghanistan,1999,745,19987071
Afghanistan,2000,2666,20595360"""
with StringIO(tibble3_csv) as fp:
    tibble3 = pd.read_csv(fp)

def str_join_elements(x, sep=""):
    assert type(sep) is str
    return sep.join((str(xi) for xi in x))

def unite(df, cols, new_var, combine=str_join_elements):
    def apply_join(x, combine):
        joinstr = combine(x)
        ser = pd.Series(joinstr, name=new_var)
        print(ser.name)
        return ser
    fixed_vars = df.columns.difference(cols)
    tibble = df[fixed_vars].copy()
    tibble_extra = df[cols].apply(apply_join, combine=combine, axis=1)
    return pd.concat([tibble, tibble_extra], axis=1)

tab = unite(tibble3, ['cases', 'population'], 'rate',
            combine=lambda x: str_join_elements(x, "/"))
print(tab)

2 Answers

User

#1 · Edited 2024-09-25 18:26:17

You can also try renaming the resulting column with rename:

>>> tab = tab.rename(columns = {0:'cases/population'})
>>> tab
       country  year cases/population
0  Afghanistan  1999     745/19987071
1  Afghanistan  2000    2666/20595360
>>>
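The rename above targets the label 0 because a row-wise apply that returns a Series uses that Series' index, not its name, as the column labels of the result. A minimal sketch of this behaviour, assuming the same data as in the question (the names `df`, `joined`, and `renamed` are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "country": ["Afghanistan", "Afghanistan"],
    "year": [1999, 2000],
    "cases": [745, 2666],
    "population": [19987071, 20595360],
})

# A Series built from a single string gets the default index [0].
# When apply(axis=1) returns a Series per row, that index becomes the
# column labels of the result; the Series name is not used.
joined = df[["cases", "population"]].apply(
    lambda row: pd.Series("/".join(map(str, row)), name="rate"), axis=1
)
print(list(joined.columns))

# Renaming the integer label 0 gives the column the intended name.
renamed = joined.rename(columns={0: "rate"})
print(list(renamed.columns))
```

Passing `'rate'` instead of `'cases/population'` to rename yields the column name the question asked for.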

User

#2 · Edited 2024-09-25 18:26:17

If you need to join an unknown number of columns, you can use apply together with str.join:

def foo(df, columns, col_name, sep=''):
    s = df[columns].apply(lambda x: sep.join(map(str, x)), 1)
    s.name = col_name
    return pd.concat([df[df.columns.difference(columns)], s], axis=1)

df
       country  year  cases  population
0  Afghanistan  1999    745    19987071
1  Afghanistan  2000   2666    20595360

df2 = foo(df, ['cases', 'population'], 'rate', '/')
df2
       country  year           rate
0  Afghanistan  1999   745/19987071
1  Afghanistan  2000  2666/20595360

If there are always exactly two columns, you can use str.cat instead, which is faster.
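A minimal sketch of the two-column case, assuming the same `df` as above. It is an illustration of str.cat, not necessarily the answerer's exact code; the names `rate` and `df2` are chosen here for the example:

```python
import pandas as pd

df = pd.DataFrame({
    "country": ["Afghanistan", "Afghanistan"],
    "year": [1999, 2000],
    "cases": [745, 2666],
    "population": [19987071, 20595360],
})

# str.cat operates on string data, so both integer columns are cast first.
rate = (
    df["cases"].astype(str)
    .str.cat(df["population"].astype(str), sep="/")
    .rename("rate")  # the result would otherwise keep the name "cases"
)

# Keep the untouched columns and append the combined one.
fixed = df.columns.difference(["cases", "population"])
df2 = pd.concat([df[fixed], rate], axis=1)
print(df2)
```

Because the name is set on the Series before concat, the new column is labelled `rate` directly and no rename of the DataFrame is needed.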
