The logic behind selecting values

2 answers

User

Answer 1 · edited 2024-06-28 19:54:52

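The code below shows the different ways to access (slice) data with .loc in pandas:

import numpy as np
import pandas as pd

# 6 rows x 4 columns of random values, with string labels on both axes
df=pd.DataFrame(np.random.rand(6,4),index=['row1','row2','row3','row4','row5','row6'],columns=list('ABCD'))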
         A         B         C         D
row1  0.972614  0.193116  0.448413  0.731300
row2  0.135391  0.783295  0.959058  0.107872
row3  0.966703  0.742793  0.852716  0.710681
row4  0.976819  0.920898  0.665329  0.078999
row5  0.418717  0.122677  0.716004  0.977522
row6  0.101422  0.641862  0.157751  0.888720

Row range, column range:

df.loc['row1':'row3', 'A':'C']

            A         B         C
row1  0.972614  0.193116  0.448413
row2  0.135391  0.783295  0.959058
row3  0.966703  0.742793  0.852716

List of row labels, column range:

df.loc[['row1','row3'], 'A':'C']
             A         B         C
row1  0.972614  0.193116  0.448413
row3  0.966703  0.742793  0.852716

Row range, list of column labels:

df.loc['row1':'row3', ['A','C']]
            A         C
row1  0.972614  0.448413
row2  0.135391  0.959058
row3  0.966703  0.852716

Single value:

df.loc['row1','A']

0.972614309371533

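Conclusion: when you select by a range (a slice such as 'row1':'row3'), do not wrap it in []; use [] only to hold a list of individual labels. Note that a label slice in .loc includes both endpoints, as the outputs above show.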
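
The conclusion above can be checked with a short, self-contained sketch. The seeded generator and the variable names (by_range, by_list, single) are assumptions made for illustration, so the numbers will not match the output printed earlier.

```python
import numpy as np
import pandas as pd

# Same shape and labels as the frame above; the seed is an arbitrary choice
# so that repeated runs give the same values.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((6, 4)),
                  index=['row1', 'row2', 'row3', 'row4', 'row5', 'row6'],
                  columns=list('ABCD'))

# A range (slice) is written bare and includes both end labels.
by_range = df.loc['row1':'row3', 'A':'C']

# A list of labels goes inside [] and returns exactly those labels.
by_list = df.loc[['row1', 'row3'], ['A', 'C']]

# Two scalar labels return a single value rather than a DataFrame.
single = df.loc['row1', 'A']

print(by_range.shape)  # (3, 3)
print(by_list.shape)   # (2, 2)
```

Slices and lists can be mixed freely between the row and column positions, which is what the four cases in this answer show.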

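User

Answer 2 · edited 2024-06-28 19:54:52

Because the index is a DatetimeIndex, you need to convert the values in the list to datetime. The values in the list must have the same type as the DataFrame's index/column values, otherwise you get a KeyError: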

print(df.loc[pd.to_datetime(['20130102','20130104']),['A','B']])
                   A         B
2013-01-02  0.719469  0.423106
2013-01-04  0.438572  0.059678
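
The answer does not show how df was built, so the sketch below assumes a frame with six consecutive daily dates starting on 2013-01-01 and seeded random values (both are assumptions, and the numbers will differ from the output above). It selects two specific dates with pd.to_datetime, then tries the same list as plain strings. Whether the plain-string list raises a KeyError may depend on the installed pandas version, so the sketch reports what happens instead of assuming an outcome.

```python
import numpy as np
import pandas as pd

# Assumed setup: a DatetimeIndex of six consecutive days.
dates = pd.date_range('20130101', periods=6)
df = pd.DataFrame(np.random.default_rng(0).random((6, 4)),
                  index=dates, columns=list('ABCD'))

# Convert the labels so they have the same type as the index values.
wanted = pd.to_datetime(['20130102', '20130104'])
picked = df.loc[wanted, ['A', 'B']]
print(picked)

# The same labels as plain strings; the result is reported, not assumed.
try:
    df.loc[['20130102', '20130104'], ['A', 'B']]
    print('plain strings were accepted by this pandas version')
except KeyError as exc:
    print('KeyError:', exc)
```

Converting explicitly with pd.to_datetime is the form used in the answer and does not rely on any implicit casting.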

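Selecting by the first and last values of the index/columns

Converting to datetimes is not necessary here, because of partial string indexing.

To select columns by range as well, just drop the list brackets []: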

print(df.loc['20130102':'20130104','A':'C'])
                   A         B         C
2013-01-02  0.719469  0.423106  0.980764
2013-01-03  0.480932  0.392118  0.343178
2013-01-04  0.438572  0.059678  0.398044

A similar solution, with the datetimes written in another string format:

print(df.loc['2013-01-02':'2013-01-04','A':'C'])
                   A         B         C
2013-01-02  0.719469  0.423106  0.980764
2013-01-03  0.480932  0.392118  0.343178
2013-01-04  0.438572  0.059678  0.398044
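
As a minimal sketch of partial string indexing, the block below again assumes six daily dates starting on 2013-01-01 (the setup is an assumption, since the answer does not show it). It checks that the two string formats select the same rows, and that a less specific string such as '2013-01' selects every row in that month.

```python
import numpy as np
import pandas as pd

# Assumed setup: a DatetimeIndex of six consecutive days.
dates = pd.date_range('20130101', periods=6)
df = pd.DataFrame(np.random.default_rng(0).random((6, 4)),
                  index=dates, columns=list('ABCD'))

# Both string formats are parsed to the same dates; the slice keeps both ends.
compact = df.loc['20130102':'20130104', 'A':'C']
dashed = df.loc['2013-01-02':'2013-01-04', 'A':'C']
print(compact.equals(dashed))  # True

# A less specific string matches every row that falls inside that period.
january = df.loc['2013-01']
print(len(january))  # 6
```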

Combinations:

#select between start/end datetime and only columns A,C
print(df.loc['20130102':'20130104',['A','C']])
                   A         C
2013-01-02  0.719469  0.980764
2013-01-03  0.480932  0.343178
2013-01-04  0.438572  0.398044

#select only 20130102, 20130104 index and columns between A and C
print(df.loc[pd.to_datetime(['20130102','20130104']),'A':'C'])
                   A         B         C
2013-01-02  0.719469  0.423106  0.980764
2013-01-04  0.438572  0.059678  0.398044
