A simpler way to use TimeSeriesSplit with a constant validation set size?

1 answer

User

Answer #1 · posted 2024-09-28 21:18:38

You can choose n_splits so that each test set contains the number of samples you want.

Another answer of mine uses a similar idea:

https://stackoverflow.com/a/43360172/3374996

Suppose your data has 7 samples:

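import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.array([1, 2, 3, 4, 5, 6, 7])

# Number of samples you want in each test set.
# I used 1 because your example has only 1 test sample in each split.
num_in_test = 1

# By default TimeSeriesSplit puts n_samples // (n_splits + 1) samples in
# each test set, so choose n_splits to match. Integer arithmetic avoids
# the float rounding of 1 // (num_in_test / len(X)).
n_splits = len(X) // num_in_test - 1

tscv = TimeSeriesSplit(n_splits=n_splits)

for train_index, test_index in tscv.split(X):
    print(X[train_index], X[test_index])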

# Output (Python 3)
[1] [2]
[1 2] [3]
[1 2 3] [4]
[1 2 3 4] [5]
[1 2 3 4 5] [6]
[1 2 3 4 5 6] [7]
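
Choosing n_splits this way gives exactly the requested test size only when the division works out. For example, with 11 samples and 4 wanted per test set, n_splits comes out as 1 and the default test set has 11 // 2 = 5 samples. As a possible alternative, and assuming scikit-learn 0.24 or newer (where TimeSeriesSplit accepts a test_size argument), the size can be set directly. This is a minimal sketch, not part of the original answer; the values n_splits=3 and test_size=2 are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.array([1, 2, 3, 4, 5, 6, 7])

# Assumption: scikit-learn >= 0.24, which added the test_size parameter.
# n_splits * test_size must be smaller than len(X).
tscv = TimeSeriesSplit(n_splits=3, test_size=2)

# Materialize the (train_index, test_index) pairs so they can be inspected.
splits = list(tscv.split(X))

for train_index, test_index in splits:
    print(X[train_index], X[test_index])
```

Every test set here has two samples, and the training set grows by two samples per split.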

Once n_splits is determined, you can easily pass the TimeSeriesSplit object to GridSearchCV or any other utility.
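
As a rough illustration of that last step, the sketch below passes a TimeSeriesSplit object to GridSearchCV through its cv argument. The toy data, the Ridge estimator, and the alpha grid are all assumptions made up for this example; substitute your own estimator and parameters.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

# Toy data, invented for illustration: 20 time-ordered samples, 1 feature.
rng = np.random.default_rng(0)
X = np.arange(20, dtype=float).reshape(-1, 1)
y = 2.0 * X.ravel() + rng.normal(scale=0.1, size=20)

# Two samples per test set: 20 // 2 - 1 = 9 splits.
num_in_test = 2
n_splits = len(X) // num_in_test - 1

tscv = TimeSeriesSplit(n_splits=n_splits)

# The splitter goes in through cv; each candidate alpha is scored
# on the same expanding-window splits.
search = GridSearchCV(
    Ridge(),
    param_grid={"alpha": [0.1, 1.0, 10.0]},
    cv=tscv,
    scoring="neg_mean_squared_error",
)
search.fit(X, y)

print(search.best_params_)
```

Mean squared error is used for scoring here because the default R² score can be unstable on test sets this small.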
