Subsetting a DataFrame for use with fmin gives unexpected errors

Posted 2024-09-27 09:35:02

I'm currently using fmin() to try to fit a formula to my data. The data in the file is just a two-column list of floats. Here is my code:

import numpy as np
import pandas as pd
from scipy.optimize import fmin

filename = 'HAB_30_Master_Overall_no2100'
data = pd.read_csv(filename+'.csv', header=0, usecols=['Wavelength', '2.5'])

def fitFunc(x): 
    global B, A, data, sumresids
    wave = data['Wavelength']
    modelforfit = x[0]*wave**-x[1]
    data['model'] = modelforfit
    data['Residuals'] = abs(data['2.5'] - data['model'])
    sumresids = data['Residuals'].sum()
    return sumresids

def fitData():
    global xopt
    B = 2
    A = 1
    x0 = np.array([B, A])
    xopt, fopt, iter, funcalls, warnflag = fmin(fitFunc,x0,maxiter = 10000, full_output=True, disp=False) 
    print(xopt[0], xopt[1])

fitFunc(data['Wavelength'])
fitData()

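This code works when I use all the values in the file. What I want to do, though, is subset the DataFrame so I can see how the fit changes when only some of the data points are included. If the only thing I change is adding nrows=10 to the read_csv call, even though the file has more than 90 rows, I get this error: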

ValueError: Integers to negative integer powers are not allowed.
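One plausible explanation, offered as a guess since the file itself is not shown: if the first 10 wavelengths are all whole numbers, read_csv infers an integer column for them, while the full file (which presumably contains at least one fractional value) is read as floats. The direct call fitFunc(data['Wavelength']) then raises an integer column to a negative integer power, which NumPy rejects. The sketch below reproduces this with made-up CSV text (csv_text is a hypothetical stand-in, not the asker's data) and shows that casting to float avoids the error.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV: whole-number wavelengths first,
# a fractional one further down.
csv_text = "Wavelength,2.5\n400,0.9\n410,0.8\n420.5,0.7\n"

full = pd.read_csv(io.StringIO(csv_text))
head = pd.read_csv(io.StringIO(csv_text), nrows=2)

full_dtype = str(full["Wavelength"].dtype)  # float column, because of 420.5
head_dtype = str(head["Wavelength"].dtype)  # integer column, only whole numbers read

# An integer array raised to a negative integer power is rejected by NumPy.
try:
    head["Wavelength"].to_numpy() ** -1
    raised = False
except ValueError:
    raised = True

# Casting to float first avoids the error.
safe = head["Wavelength"].astype(float) ** -1
print(full_dtype, head_dtype, raised)
```

If this is what is happening, passing dtype={'Wavelength': float} to read_csv, or calling .astype(float) on the column before fitting, should be enough to make the nrows=10 version behave like the full file.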

If I instead try to subset the rows by creating a new DataFrame with .iloc, like this:

filename = 'HAB_30_Master_Overall_no2100'
data = pd.read_csv(filename+'.csv', header=0, usecols=['Wavelength', '2.5'])
newdata = data.iloc[:10]

def fitFunc(x): 
    global B, A, data, sumresids
    wave = newdata['Wavelength']
    modelforfit = x[0]*wave**-x[1]
    newdata['model'] = modelforfit
    newdata['Residuals'] = abs(newdata['2.5'] - newdata['model'])
    sumresids = newdata['Residuals'].sum()
    return sumresids

def fitData():
    global xopt
    B = 2
    A = 1
    x0 = np.array([B, A])
    xopt, fopt, iter, funcalls, warnflag = fmin(fitFunc,x0,maxiter = 10000, full_output=True, disp=False) 
    print(xopt[0], xopt[1])

fitFunc(newdata['Wavelength'])
fitData()

I get this warning:

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
newdata['model'] = modelforfit
/var/folders/j8/1fzjf9cj3slcmyy1t89sth5w0000gp/T/tmpZRuvLX.py:20: 
SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  newdata['Residuals'] = abs(newdata['2.5'] - newdata['model']) 

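and then it crashes. Ideally I would also like to be able to use non-contiguous rows, such as the first 5 and the last 5, but I'd settle for using just one block at a time. If anyone can tell me why neither of the approaches above works, and offer a solution, that would be really helpful.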
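A minimal sketch of one way to subset rows without the warning, assuming synthetic data in place of the real file (the wave values and the power-law coefficients below are invented for illustration). Taking .copy() of an .iloc slice gives an independent frame that can safely receive new columns, pd.concat handles the non-contiguous case, and passing the arrays to fmin through args (the helper name sum_abs_residuals is hypothetical) avoids writing into the DataFrame on every function evaluation.

```python
import numpy as np
import pandas as pd
from scipy.optimize import fmin

# Synthetic stand-in for the real file (assumed values, not the asker's data).
wave = np.linspace(1.0, 5.0, 20)
data = pd.DataFrame({"Wavelength": wave, "2.5": 3.0 * wave ** -1.5})

# A contiguous block: .copy() makes it an independent frame, so assigning
# new columns to it does not trigger SettingWithCopyWarning.
first10 = data.iloc[:10].copy()
first10["model"] = 2.0 * first10["Wavelength"] ** -1.0

# Non-contiguous rows: first 5 and last 5; pd.concat builds a new frame.
ends = pd.concat([data.iloc[:5], data.iloc[-5:]])


def sum_abs_residuals(params, wavelength, observed):
    """Sum of |observed - B * wavelength**-A| for params = (B, A)."""
    b, a = params
    return np.abs(observed - b * wavelength ** -a).sum()


# Plain float arrays keep the objective free of DataFrame side effects.
w = ends["Wavelength"].to_numpy(dtype=float)
y = ends["2.5"].to_numpy(dtype=float)
x0 = np.array([2.0, 1.0])

xopt, fopt, n_iter, funcalls, warnflag = fmin(
    sum_abs_residuals, x0, args=(w, y),
    maxiter=10000, full_output=True, disp=False,
)
f0 = sum_abs_residuals(x0, w, y)
print(xopt[0], xopt[1])
```

The float cast in to_numpy(dtype=float) also sidesteps the integer-power error if the selected wavelengths happen to be whole numbers.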

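Edit: This is a snippet of what my data looks like. To figure this out I've imported just one column in the read_csv call, but the ultimate goal is to loop through segments of this larger file, either by subsetting rows (the problem) or column by column (already figured out).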
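For the row-segment loop described in the edit, a rough sketch could look like the following. It again uses synthetic data, and chunk_size, fits, and sum_abs_residuals are hypothetical names chosen for illustration; the real segment boundaries would depend on the file.

```python
import numpy as np
import pandas as pd
from scipy.optimize import fmin

# Synthetic stand-in for the real file (assumed values, not the asker's data).
wave = np.linspace(1.0, 5.0, 30)
data = pd.DataFrame({"Wavelength": wave, "2.5": 3.0 * wave ** -1.5})


def sum_abs_residuals(params, wavelength, observed):
    """Sum of |observed - B * wavelength**-A| for params = (B, A)."""
    b, a = params
    return np.abs(observed - b * wavelength ** -a).sum()


chunk_size = 10  # assumed segment length
fits = {}        # maps the starting row of each segment to its fitted (B, A)

for start in range(0, len(data), chunk_size):
    chunk = data.iloc[start:start + chunk_size]
    w = chunk["Wavelength"].to_numpy(dtype=float)
    y = chunk["2.5"].to_numpy(dtype=float)
    # With full_output left at its default, fmin returns only the parameters.
    fits[start] = fmin(sum_abs_residuals, np.array([2.0, 1.0]),
                       args=(w, y), maxiter=10000, disp=False)

for start, params in fits.items():
    print(start, params[0], params[1])
```

Because each chunk is only read from, never assigned to, no .copy() is needed inside the loop.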

