Pandas: slicing a MultiIndex DataFrame... need a simple Series

Posted 2024-10-02 20:36:16

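I created a MultiIndex of stock data by fetching a Panel from datareader and converting it to a MultiIndex DataFrame. Sometimes when I use .loc I get a Series with one index level, and sometimes I get a Series with two index levels. How can I slice by date and still get a Series with a single index? Some code will help...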

import pandas_datareader.data as web

# Define the securities to download
symbols = ['AAPL', 'MSFT']

# Define which online source one should use
data_source = 'yahoo'

# Define the period of interest
start_date = '2010-01-01'
end_date = '2010-12-31'

# Use pandas_datareader.data.DataReader to load the desired data.
panel = web.DataReader(symbols, data_source, start_date, end_date)

# Convert panel to multiindex dataframe
midf = panel.to_frame()

# for slicing multiindex dataframes it must be sorted
midf = midf.sort_index(level=0)

Here I select the column I want:

adj_close = midf['Adj Close']
adj_close.head()

I get a Series with two index levels (Date and minor):

Date        minor
2010-01-04  AAPL     27.505054
            SPY      96.833946
2010-01-05  AAPL     27.552608
            SPY      97.090271
2010-01-06  AAPL     27.114347
Name: Adj Close, dtype: float64

Now I select Apple, using : to select all dates:

aapl_adj_close = adj_close.loc[:, 'AAPL']
aapl_adj_close.head()

This gives a Series indexed by Date only. This is what I'm looking for!

Date
2010-01-04    27.505054
2010-01-05    27.552608
2010-01-06    27.114347
2010-01-07    27.064222
2010-01-08    27.244156
Name: Adj Close, dtype: float64

But when I actually slice by date, I don't get that kind of Series:

sliced_aapl_adj_close  = adj_close.loc['2010-01-04':'2010-01-06', 'AAPL']
sliced_aapl_adj_close.head()

I get a Series with two index levels:

Date        minor
2010-01-04  AAPL     27.505054
2010-01-05  AAPL     27.552608
2010-01-06  AAPL     27.114347
Name: Adj Close, dtype: float64

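The slice is correct and the values are correct, but I don't want the minor index level there (because I want to pass this Series to a plot). What is the right way to slice this?

Thanks!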

1 answer

User

#1 · Posted 2024-10-02 20:36:16

You can drop the second index level from the sliced Series with:

df = df.reset_index(level=1, drop=True)

Or:

df.index = df.index.droplevel(1)
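
To try this without downloading anything, here is a minimal sketch. It assumes a hand-built Series shaped like the question's adj_close; the prices are made-up placeholders, and the names sliced, flat and flat_alt are only for illustration. It also shows an alternative order of operations: select the symbol first, then slice by date.

```python
import pandas as pd

# Hypothetical stand-in for the question's adj_close Series.
# The prices are made-up placeholders, not real quotes.
dates = pd.to_datetime(["2010-01-04", "2010-01-05", "2010-01-06", "2010-01-07"])
index = pd.MultiIndex.from_product([dates, ["AAPL", "SPY"]], names=["Date", "minor"])
adj_close = pd.Series(
    [10.0, 20.0, 11.0, 21.0, 12.0, 22.0, 13.0, 23.0],
    index=index,
    name="Adj Close",
)

# Slicing the Date level keeps both index levels, as in the question.
sliced = adj_close.loc["2010-01-04":"2010-01-06", "AAPL"]

# Drop the now-constant 'minor' level to get a Series indexed by Date only.
flat = sliced.reset_index(level="minor", drop=True)

# Alternative order: pick the symbol first (which drops 'minor'),
# then slice the remaining DatetimeIndex by date.
flat_alt = adj_close.xs("AAPL", level="minor").loc["2010-01-04":"2010-01-06"]

print(flat)
print(flat_alt)
```

Referring to the level by name ("minor") rather than by position (1) does the same thing here and keeps working if the level order ever changes.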

Another solution is to reshape the Series into a DataFrame with unstack, then select the column with []:

df = adj_close.unstack()

print (df)
minor            AAPL        SPY
Date                            
2010-01-04  27.505054  96.833946
2010-01-05  27.552608  97.090271
2010-01-06  27.114347        NaN

print (df['AAPL'])

Date
2010-01-04    27.505054
2010-01-05    27.552608
2010-01-06    27.114347
Name: AAPL, dtype: float64
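
Once the data is unstacked, the date slice from the question can be applied to the wide frame directly. A minimal sketch, again assuming a hand-built stand-in for adj_close with placeholder prices (the names wide and aapl are only for illustration):

```python
import pandas as pd

# Hypothetical stand-in for the question's adj_close Series.
# The prices are made-up placeholders, not real quotes.
dates = pd.to_datetime(["2010-01-04", "2010-01-05", "2010-01-06", "2010-01-07"])
index = pd.MultiIndex.from_product([dates, ["AAPL", "SPY"]], names=["Date", "minor"])
adj_close = pd.Series(
    [10.0, 20.0, 11.0, 21.0, 12.0, 22.0, 13.0, 23.0],
    index=index,
    name="Adj Close",
)

# Move the 'minor' level into the columns: one column per symbol,
# with a plain DatetimeIndex on the rows.
wide = adj_close.unstack("minor")

# Rows are sliced by date and a single column is picked, so the result
# is a Series indexed by Date only.
aapl = wide.loc["2010-01-04":"2010-01-06", "AAPL"]

print(aapl)
```

The resulting Series has a single Date index, so it should be suitable for passing to a plot; note that its name becomes the column label (AAPL) rather than Adj Close.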
