Selecting a subset of observations from a pandas DataFrame using a list of labels

Published 2024-06-23 19:57:32


Given the DataFrame di:

import pandas as pd
import numpy as np

data = {
    "Event": ['Biathlon', 'Ski Jump', 'Slalom', 'Downhill'],
    "Award": ['Gold', 'Bronze', 'Gold', 'Silver'],
    "Points":  ['100', '10', '100', '40']
}
d = pd.DataFrame(data)
di = d.set_index(["Award","Event"])

print(di)
                Points
Award  Event          
Gold   Biathlon    100
Bronze Ski Jump     10
Gold   Slalom      100
Silver Downhill     40

Suppose I want to select the award for 'Biathlon'. Why does this fail?

^{pr2}$

Based on an example in the pandas documentation, this seems like it should work. I copied the example from the documentation below:

#example from http://pandas.pydata.org/pandas-docs/stable/advanced.html#using-slicers

def mklbl(prefix,n):
    return ["%s%s" % (prefix,i)  for i in range(n)]

miindex = pd.MultiIndex.from_product([mklbl('A',4),
                                     mklbl('B',2),
                                     mklbl('C',4),
                                     mklbl('D',2)])

micolumns = pd.MultiIndex.from_tuples([('a','foo'),('a','bar'),
                                       ('b','foo'),('b','bah')],
                                      names=['lvl0', 'lvl1'])

dfmi = pd.DataFrame(np.arange(len(miindex)*len(micolumns)).reshape((len(miindex),len(micolumns))),
                    index=miindex,
                    columns=micolumns).sort_index().sort_index(axis=1)

dfmi.loc[(slice('A1','A3'),slice(None), ['C1','C3']),:]

#this also works
dfmi.loc[(['A1','A3'],['B0','B1'], ['C1','C3']),:]
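
One difference worth noting: the documentation example calls sort_index() before slicing, while di above is built with set_index only, so its index is not sorted. The failing snippet is not shown here, so what follows is only a sketch under the assumption that the goal was to select rows by a list of "Event" labels across all "Award" values. The names wanted, di_sorted, by_slicer and by_mask are illustrative and do not come from the original question.

```python
import pandas as pd

data = {
    "Event": ["Biathlon", "Ski Jump", "Slalom", "Downhill"],
    "Award": ["Gold", "Bronze", "Gold", "Silver"],
    "Points": ["100", "10", "100", "40"],
}
di = pd.DataFrame(data).set_index(["Award", "Event"])

# Hypothetical list of labels to select on the "Event" level
wanted = ["Biathlon"]

# Sort the MultiIndex first, mirroring the sort_index() call in the docs example
di_sorted = di.sort_index()

# slice(None) means "any Award"; the list restricts the "Event" level
by_slicer = di_sorted.loc[(slice(None), wanted), :]

# Alternative that does not depend on index sort order: a boolean mask
mask = di.index.get_level_values("Event").isin(wanted)
by_mask = di[mask]

print(by_slicer)
print(by_mask)
```

The boolean mask may be the more forgiving option when the index cannot be sorted, though it gives up the slicer syntax.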
