getting ValueError:“只能用多索引对索引进行元组”

2024-10-01 15:49:22 发布

您现在位置:Python中文网/ 问答频道 /正文

我试图实现一个简单的RNN来预测整数序列中的下一个整数。所以,我有一个数据集如下:

Id  Sequence
1   1,0,0,2,24,552,21280,103760,70299264,5792853248,587159944704
2   1,1,5,11,35,93,269,747,2115,5933,16717,47003,132291,372157,1047181,2946251,8289731,23323853,65624397,184640891,519507267,1461688413,4112616845,11571284395,32557042499,91602704493,257733967693
4   0,1,101,2,15,102,73,3,40,16,47,103,51,74,116,4,57,41,125,17,12,48,9,104,30,52,141,75,107,117,69,5,148,58,88,42,33,126,152,18,160,13,38,49,55,10,28,105,146,31,158
5   1,4,14,23,42,33,35,34,63,66,87,116,84,101,126,164,128,102,135,143,149,155,203,224,186,204,210,237,261,218,219,286,257,266,361,355,336,302,374,339,371,398,340,409,348,388,494,436,407,406
6   1,1,2,5,4,2,6,13,11,4,10,10,12,6,8,29,16,11,18,20,12,10,22,26,29,12,38,30,28,8,30,61,20,16,24,55,36,18,24,52,40,12,42,50,44,22,46,58,55,29,32,60,52,38,40,78,36,28,58,40,60,30,66,125,48,20,66,80,44,24
9   0,31,59,90,120,151,181,212,243,273,304,334,365,396,424,455,485,516,546,577,608,638,669,699,730,761,789,820,850,881,911,942,973,1003,1034,1064,1095,1126,1155,1186,1216,1247,1277,1308,1339,1369,1400,1430
10  1,1,2,5,13,36,111,347,1134,3832,13126,46281,165283,598401,2202404,8168642,30653724,116082962,442503542,1701654889,6580937039,25603715395,100223117080,394001755683,1556876401398,6178202068457,24608353860698,98421159688268,394901524823138,1589722790850089
12  0,0,0,0,112,40286,5485032,534844548,45066853496,3538771308282,267882021563464,19861835713621616,1453175611052688600,105278656040052332838,7564280930105061931496

到目前为止,我的代码是:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import math
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import SimpleRNN
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
from keras.preprocessing.sequence import pad_sequences

def stoarray(data = [], sep = ','):
    return data.map(lambda x: np.array(x.split(sep), dtype=float))

def create_dataset(dataset, window_size=1):
    dataX, dataY = [], []
    for i in range(len(dataset)-window_size-1):
        a = dataset[i:(i+window_size), 0]
        dataX.append(a)
        dataY.append(dataset[i + window_size, 0]) #gives the ValueError : Can only tuple index with multi index
    return np.array(dataX), np.array(dataY)

# fix random seed for reproducibility
np.random.seed(7)

# loading data
colna = ['id', 'seq']
train_data = pd.read_csv('G:/Python/integer_sequencing/testfile.csv', header=1)
train_data.columns = colna
dataset = train_data['seq']
#print(dataset)
window_size = 1
X_train, Y_train = create_dataset(dataset, window_size)

print(X_train.head(5))
print(Y_train.head(5))

我试着用X_列作为输入来分割每一个序列,除了最后一个项外,X_列是一个完整的集合,Y_列是一个输出,只包含最后一个数字。 但是当我运行代码时,我得到了ValueError:只能用MultiIndex对索引进行元组化。 有谁能解释一下它对我的代码意味着什么,我必须做什么来解决它。

回溯调用: enter image description here

我是新来的堆栈溢出和深入学习,所以我会非常感谢,如果你能建议和帮助我格式化我的问题。


Tags: 代码fromimportdatasizeasnptrain
1条回答
网友
1楼 · 发布于 2024-10-01 15:49:22

dataset[i:(i+window_size), 0]中,i:(i+window_size), 0有问题。

在您的代码中,dataset表示train_data['seq'],这是单列-单维Series-但是您使用i:(i+window_size), 0类似于二维DataFrame

只能使用单个整数,如dataset[0]或切片dataset[i:(i+window_size)]

相关问题 更多 >

    热门问题