Is there a more efficient way in Python to convert the periodicity of an intraday OHLC DataFrame?

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 540949 entries, 2007-01-02 09:46:00+08:00 to 2013-10-17 16:15:00+08:00
Data columns (total 5 columns):
Open      540949  non-null values
High      540949  non-null values
Low       540949  non-null values
Close     540949  non-null values
Volume    540949  non-null values
dtypes: int64(5)

def ohlcsum(df):
    df = df.sort()
    return {
        'Open': df['Open'][0],
        'High': df['High'].max(),
        'Low': df['Low'].min(),
        'Close': df['Close'][-1],
        'Volume': df['Volume'].sum()
    }

xx.groupby('date').agg(ohlcsum)

2 Answers

Answer 1 · edited 2024-09-29 19:33:58

In master/0.13 (to be released very soon) you can do this (in 0.12 it is a manual process, since you have to do it on each Series separately):

In [7]: df = DataFrame(np.random.randn(10000,2),index=date_range('20130101 09:00:00',periods=10000,freq='1Min'),columns=['last','volume'])

In [8]: df.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 10000 entries, 2013-01-01 09:00:00 to 2013-01-08 07:39:00
Freq: T
Data columns (total 2 columns):
last      10000  non-null values
volume    10000  non-null values
dtypes: float64(2)
In [9]: df.resample('1D',how='ohlc')
Out[9]: 
                last                                  volume                              
                open      high       low     close      open      high       low     close
2013-01-01  0.801982  3.343166 -3.203291 -0.361502  0.255356  2.723863 -3.319414  1.073376
2013-01-02  0.101687  3.378843 -3.219792 -1.121900  1.226099  4.103099 -3.463014 -0.452594
2013-01-03 -0.051806  4.290010 -4.099700 -0.637321  0.713189  3.622728 -3.236652 -0.104458
2013-01-04  0.821215  3.058024 -3.907862 -1.595449  0.836234  2.821551 -3.191774 -0.399603
2013-01-05  0.084973  3.458210 -3.191455  1.426380 -0.402435  2.777447 -2.966165  1.227398
2013-01-06 -0.669922  3.232865 -3.902237  1.846017 -0.440055  3.088109 -3.710640  3.066725
2013-01-07 -0.122727  3.300163 -3.315501  1.718163  1.085066  3.373251 -4.029679  0.187828
2013-01-08  0.311785  3.073488 -3.013702 -0.627721 -0.502258  2.795292 -2.772738 -0.654676

[8 rows x 8 columns]
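The `how=` keyword shown above belongs to the pandas versions discussed in this answer. As a minimal sketch, assuming a recent pandas release where aggregation is a method on the resampler object, the same daily OHLC table could be produced as follows. The seeded generator and the `daily` name are illustrative choices, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same shape of data as in the session above: 10000 one-minute rows.
rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
idx = pd.date_range("2013-01-01 09:00:00", periods=10000, freq="min")
df = pd.DataFrame(rng.standard_normal((10000, 2)),
                  index=idx, columns=["last", "volume"])

# Assumed modern spelling of df.resample('1D', how='ohlc').
daily = df.resample("1D").ohlc()

# Columns are hierarchical: ('last', 'open'), ('last', 'high'), ...
print(daily.shape)
```

Each original column gets its own open/high/low/close group, which matches the layout of the output printed above.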

This will work in 0.12 (the code block that followed here is missing from the source):
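The answer's own 0.12 snippet is not available here, so the following is not a reconstruction of it. It is only a sketch of the manual idea described earlier (resample each Series separately, then join the results), written with current pandas syntax as an assumption. The names `per_column` and `daily_manual` are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
idx = pd.date_range("2013-01-01 09:00:00", periods=10000, freq="min")
df = pd.DataFrame(rng.standard_normal((10000, 2)),
                  index=idx, columns=["last", "volume"])

# Resample every column on its own; each result has
# the four columns open, high, low, close.
per_column = {name: df[name].resample("1D").ohlc() for name in df.columns}

# Concatenating a dict side by side uses its keys as the outer column level.
daily_manual = pd.concat(per_column, axis=1)
print(daily_manual.columns.tolist())
```

The result has the same two-level column layout as resampling the whole frame in one call.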

Answer 2 · edited 2024-09-29 19:33:58

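I'm quite new to pandas and Python, but I came up with a method that allows conversion to any time period.

In my example, minData is minute data, stored in a flat format without any commas. My data came from quantquote.com.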

import datetime
import pandas as pd
from pandas import DataFrame

columnHeadings = ['Date', 'Time', 'Open', 'High', 'Low', 'Close', 'Volume', 'Split Factor', 'Earnings', 'Dividends']

minData = pd.read_csv(
    filename,
    header = None,
    names = columnHeadings, 
    parse_dates = [["Date", "Time"]],
    date_parser = lambda x: datetime.datetime.strptime(x, '%Y%m%d %H%M'), 
    index_col = "Date_Time",
    sep=' ')

xx = minData.to_period(freq="min")

openCol = DataFrame(xx.Open)
openCol2 = openCol.resample("M", how = 'first')

highCol = DataFrame(xx.High)
highCol2 = highCol.resample("M", how = 'max')

lowCol = DataFrame(xx.Low)
lowCol2 = lowCol.resample("M", how = 'min')

closeCol = DataFrame(xx.Close)
closeCol2 = closeCol.resample("M", how = 'last')

volumeCol = DataFrame(xx.Volume)
volumeCol2 = volumeCol.resample("M", how = 'sum')

#splitFactorCol = DataFrame(xx.SplitFactor)
#splitFactorCol.resample("M", how = 'first')


monthlyData = DataFrame(openCol2.Open)

monthlyData["High"] = highCol2.High
monthlyData["Low"] = lowCol2.Low
monthlyData["Close"] = closeCol2.Close
monthlyData["Volume"] = volumeCol2.Volume

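I'm sure there must be a cleaner way, but this works with my data, and it lets me use the same code to generate 15-minute, hourly, daily, weekly and monthly bars. It is also fast.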
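One possibly more concise variant of the column-by-column approach above, offered as a sketch and not as the author's method: pass a single column-to-aggregation mapping to the resampler. It assumes a recent pandas release. The helper `resample_ohlcv`, the `OHLCV_RULES` mapping and the synthetic `bars` frame are hypothetical names introduced for illustration.

```python
import numpy as np
import pandas as pd

# One aggregation rule per OHLCV column (hypothetical helper mapping).
OHLCV_RULES = {"Open": "first", "High": "max", "Low": "min",
               "Close": "last", "Volume": "sum"}

def resample_ohlcv(bars, freq):
    """Aggregate minute bars to a coarser frequency such as '15min' or '1D'."""
    out = bars.resample(freq).agg(OHLCV_RULES)
    # Bins with no trades have no opening price; drop them.
    return out.dropna(subset=["Open"])

# Synthetic minute bars standing in for minData: 120 minutes from 09:30.
idx = pd.date_range("2013-01-02 09:30", periods=120, freq="min")
close = 100.0 + np.arange(120, dtype=float)
bars = pd.DataFrame({"Open": close, "High": close + 1, "Low": close - 1,
                     "Close": close, "Volume": 10}, index=idx)

hourly = resample_ohlcv(bars, "60min")
daily = resample_ohlcv(bars, "1D")
print(daily)
```

Only the frequency string changes between the hourly and daily calls, which keeps the "same code for every period" property described above.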

Any improvements or comments would be much appreciated.

Regards,

-Jason
