Splitting panel data

Published 2024-09-30 05:16:12


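I am trying to split my data into training and test sets in order to run an LSTM. The ID column contains several countries, and I split the data by date range as shown below. My intent is to split each country's data over specific time intervals. However, I get an error that says only: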

KeyError: False

During handling of the above exception, another exception occurred:

Here is my code:

def train_test_split(data):
    mask1 = (data['Date'] >= '2020-04') & (data['Date'] <= '2020-05')
    test=data.loc[mask1]
    mask2 = (data['Date'] >= '2014-01') & (data['Date'] <= '2020-03')
    train=data.loc[mask2]
    y_train=train.IndustrialP
    x_train=train.drop('IndustrialP', axis=1)
    y_test=test.IndustrialP
    x_test=test.drop('IndustrialP', axis=1)
    return x_train, x_test,y_train,y_test

Everything works up to this point.

# loop each station and collect train and test data 
X_train=[]
X_test=[]
Y_train=[]
Y_test=[]
for i in range(0,len(ID)):
    df=data[['ID']==ID[i]]
    x_train, x_test,y_train,y_test=train_test_split(df)
    X_train.append(x_train)
    X_test.append(x_test)
    Y_train.append(y_train)
    Y_test.append(y_test)

The error occurs in the loop above. I also intend to run the following code afterwards:

# concat each train data from each station 
X_train=pd.concat(X_train)
Y_train=pd.DataFrame(pd.concat(Y_train))
# concat each test data from each station 
X_test=pd.concat(X_test)
Y_test=pd.DataFrame(pd.concat(Y_test))

Any help would be greatly appreciated. Thanks.
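A likely cause, sketched under some assumptions: in `data[['ID']==ID[i]]` the expression `['ID']==ID[i]` compares a plain Python list with a string, which is simply `False`, so pandas ends up looking for a column named `False` and raises `KeyError: False`. Indexing with a boolean mask built from the column, `data[data['ID'] == ...]`, is probably what was intended. The toy DataFrame below is hypothetical, `Date` is assumed to hold `'YYYY-MM'` strings, and `ID` is assumed to be the list of unique country codes (the post does not show how it is defined).

```python
import pandas as pd

# Hypothetical toy panel (not the poster's data): two countries, three months each.
data = pd.DataFrame({
    "ID": ["DE", "DE", "DE", "FR", "FR", "FR"],
    "Date": ["2020-03", "2020-04", "2020-05"] * 2,  # assumed 'YYYY-MM' strings
    "IndustrialP": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

# A list compared with a string is a plain bool, so the original
# data[['ID'] == ID[i]] is effectively data[False].
list_vs_string = (["ID"] == "DE")


def train_test_split(df):
    """Same date-based split as in the question."""
    mask1 = (df["Date"] >= "2020-04") & (df["Date"] <= "2020-05")
    test = df.loc[mask1]
    mask2 = (df["Date"] >= "2014-01") & (df["Date"] <= "2020-03")
    train = df.loc[mask2]
    return (train.drop("IndustrialP", axis=1), test.drop("IndustrialP", axis=1),
            train["IndustrialP"], test["IndustrialP"])


# Assumption: ID is the collection of unique country codes.
ID = data["ID"].unique()

X_train, X_test, Y_train, Y_test = [], [], [], []
for country in ID:
    # Boolean Series mask: keeps only the rows of the current country.
    df = data[data["ID"] == country]
    x_train, x_test, y_train, y_test = train_test_split(df)
    X_train.append(x_train)
    X_test.append(x_test)
    Y_train.append(y_train)
    Y_test.append(y_test)

# Concatenate the per-country pieces, as in the question.
X_train = pd.concat(X_train)
Y_train = pd.DataFrame(pd.concat(Y_train))
X_test = pd.concat(X_test)
Y_test = pd.DataFrame(pd.concat(Y_test))
```

Since the date cutoffs are the same for every country, applying the two masks once to the whole DataFrame should select the same rows without a loop; the per-country loop only matters if the pieces need to stay grouped by country or the cutoffs differ.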


Tags: data, test, id, date, interval, error, train
