Why do I get mismatched array dimensions only when I don't use today's date?

Posted 2024-10-03 06:18:13

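I'm trying to fetch a company's stock data and predict its future share price. I know this isn't accurate, but I'm using it as a learning tool. My code seems to work when I use today's date as the end date and a future date as the prediction end date. However, when I use a past date as the end date and predict forward from it, I get an error: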

ValueError: x and y must have same first dimension, but have shapes (220,) and (221,)

I want to do it this way so that I can compare the predictions with the actual prices.

import numpy as np
import datetime
import pandas_datareader as web
import statistics
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib import style
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()

stock_name = 'BP.L'

prices = web.DataReader(stock_name, 'yahoo', start = '2019-01-01', end = '2019-11-05').reset_index(drop = False)[['Date', 'Adj Close']]


#plt.plot(prices['Date'], prices['Adj Close'])
#plt.xlabel('Days')
#plt.ylabel('Stock Prices')
#plt.show()


# Parameter Definitions

# So    :   initial stock price
# dt    :   time increment -> a day in our case
# T     :   length of the prediction time horizon(how many time points to predict, same unit with dt(days))
# N     :   number of time points in the prediction time horizon -> T/dt
# t     :   array for time points in the prediction time horizon [1, 2, 3, .. , N]
# mu    :   mean of historical daily returns
# sigma :   standard deviation of historical daily returns
# b     :   array for brownian increments
# W     :   array for brownian path

start_date = '2018-01-01'
end_date = '2019-01-01'
pred_end_date = '2019-11-05'

# We get daily closing stock prices
S_eon = web.DataReader(stock_name, 'yahoo', start_date, end_date).reset_index(drop = False)[['Date', 'Adj Close']]

So = S_eon.loc[S_eon.shape[0] -1, "Adj Close"]

dt = 1

n_of_wkdays = pd.date_range(start = pd.to_datetime(end_date, 
              format = "%Y-%m-%d") + pd.Timedelta('1 days'), 
              end = pd.to_datetime(pred_end_date, 
              format = "%Y-%m-%d")).to_series(
              ).map(lambda x: 
              1 if x.isoweekday() in range(1,6) else 0).sum()
T = n_of_wkdays

N = T / dt

t = np.arange(1, int(N) + 1)

returns = (S_eon.loc[1:, 'Adj Close'] - \
          S_eon.shift(1).loc[1:, 'Adj Close']) / \
          S_eon.shift(1).loc[1:, 'Adj Close']

mu = np.mean(returns)

sigma = np.std(returns)


scen_size = 10000
b = {str(scen): np.random.normal(0, 1, int(N)) for scen in range(1, scen_size + 1)}

W = {str(scen): b[str(scen)].cumsum() for scen in range(1, scen_size + 1)}

drift = (mu - 0.5 * sigma**2) * t
diffusion = {str(scen): sigma * W[str(scen)] for scen in range(1, scen_size + 1)}

S = np.array([So * np.exp(drift + diffusion[str(scen)]) for scen in range(1, scen_size + 1)]) 
S = np.hstack((np.array([[So] for scen in range(scen_size)]), S))
S_avg = np.mean(S)
print(S_avg)

#Plotting 
plt.figure(figsize = (20,10))
for i in range(scen_size):
    plt.title("Daily Volatility: " + str(sigma))
    plt.plot(pd.date_range(start = S_eon["Date"].max(), 
                end = pred_end_date, freq = 'D').map(lambda x:
                x if x.isoweekday() in range(1, 6) else np.nan).dropna(), S[i, :])
    plt.ylabel('Stock Prices, €')
    plt.xlabel('Prediction Days')
plt.show()

The error shows:

File "C:\Users\User\Anaconda3\lib\site-packages\matplotlib\axes\_base.py", line 270, in _xy_from_xy
    "have shapes {} and {}".format(x.shape, y.shape))
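
The two lengths in the message can be reproduced without downloading anything. The sketch below counts weekdays the same way the question does, using `pd.bdate_range` as a stand-in for the `isoweekday()` filter. The three candidate "last downloaded date" values are assumptions, since the actual contents of `S_eon` are not shown. With these dates there are 220 weekdays after `end_date`, so each simulated path has 221 points, and the x-axis has 221 dates only if the last downloaded row falls exactly on `end_date`. The `(220,)` in the error is what you would get if the last row were dated 2019-01-02.

```python
import pandas as pd

end_date = "2019-01-01"
pred_end_date = "2019-11-05"

# Weekdays strictly after end_date: this is T (and N) in the question.
first_pred_day = pd.Timestamp(end_date) + pd.Timedelta(days=1)
n_weekdays = len(pd.bdate_range(first_pred_day, pred_end_date))

# Each path has N simulated prices plus the initial price So in front.
y_len = n_weekdays + 1

# The x-axis starts at S_eon["Date"].max(); these candidates are hypothetical.
x_len = {
    last_date: len(pd.bdate_range(last_date, pred_end_date))
    for last_date in ["2018-12-31", "2019-01-01", "2019-01-02"]
}

print("y length:", y_len)
for last_date, length in x_len.items():
    print(last_date, "-> x length:", length)
```

Whenever the last row of the download is not dated exactly `end_date`, the two lengths drift apart, which would explain why the problem depends on the dates chosen.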


3 Answers

I changed the following line and it now works:

"x if x.isoweekday() in range(1, 6) else np.nan).dropna(), S[i, :])"

to:

"x if x.isoweekday() in range(1, 6) else np.nan).dropna(), S[i, :-1])"

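A minimal sketch of what that slice does, using placeholder data rather than the real simulation (the array contents and the integer x-axis are assumptions for illustration). `S[i, :-1]` drops the last simulated price so the path has as many points as the date axis; it does not realign the dates, so the final prediction day is simply not plotted.

```python
import numpy as np

n_steps = 220

# Placeholder for one simulated path: initial price plus 220 predicted prices.
S = np.linspace(100.0, 110.0, n_steps + 1).reshape(1, -1)

# Stand-in for the 220 dates on the x-axis.
x = np.arange(n_steps)

y_full = S[0, :]       # 221 points, one more than x
y_trimmed = S[0, :-1]  # 220 points, the last price is discarded

print(x.shape, y_full.shape, y_trimmed.shape)
```
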
Could you try adding one more day to the prediction end date?

pred_end_date = '2019-11-06'

Your error is just a shape mismatch: the date series is short by exactly one value.
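
One way to avoid tuning the end date by hand is to build the date axis first and size the simulation from it, so the two lengths agree by construction. The following is a sketch under stated assumptions, not the original poster's code: `simulate_gbm_paths` is a hypothetical helper, and the initial price, `mu`, `sigma`, and last observed date are made-up values.

```python
import numpy as np
import pandas as pd


def simulate_gbm_paths(s0, mu, sigma, dates, n_paths, seed=0):
    """Return an (n_paths, len(dates)) array; column 0 holds s0 for dates[0]."""
    rng = np.random.default_rng(seed)
    n_steps = len(dates) - 1                 # one step per date after the first
    t = np.arange(1, n_steps + 1)
    # Cumulative sum of standard normal increments gives the Brownian path.
    w = rng.standard_normal((n_paths, n_steps)).cumsum(axis=1)
    paths = s0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * w)
    return np.hstack([np.full((n_paths, 1), s0), paths])


# Assumed values for illustration only.
last_observed = pd.Timestamp("2018-12-31")   # would be S_eon["Date"].max()
dates = pd.bdate_range(start=last_observed, end="2019-11-05")

S = simulate_gbm_paths(s0=100.0, mu=0.0005, sigma=0.01, dates=dates, n_paths=5)
print(S.shape, len(dates))
```

Each row of `S` can then be plotted against `dates` directly, whatever the last observed date turns out to be.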

According to the documentation:

date.isoweekday()

Return the day of the week as an integer, where Monday is 1 and Sunday is 7. For example, date(2002, 12, 4).isoweekday() == 3, a Wednesday. See also weekday(), isocalendar().

This returns a number between 1 and 7. You test it against range(1, 6), convert the other values to NaN, and then drop them with dropna, so you lose a value.

Change it to x if x.isoweekday() in range(1, 7) and it should work.
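
For reference when weighing this suggestion, here is a small sketch of what each filter keeps over one calendar week (the week is chosen arbitrarily). Python's `range` excludes its upper bound, so `range(1, 6)` covers 1 through 5, Monday to Friday, while `range(1, 7)` also admits Saturday.

```python
import pandas as pd

# One full week, Monday 2019-11-04 through Sunday 2019-11-10.
days = pd.date_range("2019-11-04", "2019-11-10", freq="D")
iso = [d.isoweekday() for d in days]

# Filter used in the question: Monday (1) to Friday (5).
kept_weekdays = [d for d in days if d.isoweekday() in range(1, 6)]

# Filter suggested in this answer: Monday (1) to Saturday (6).
kept_with_saturday = [d for d in days if d.isoweekday() in range(1, 7)]

print(iso)
print(len(kept_weekdays), len(kept_with_saturday))
```

Widening the range therefore changes the number of dates by one per week in the plotted span, not by a fixed single value.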
