具有共享X轴的子批次中对数正态分布的直方图拟合

2024-06-28 23:58:13 发布

您现在位置:Python中文网/ 问答频道 /正文

我有三个不同长度的数组,比如standxstandy和{},它们只包含正值。 我想用与this plot相似的方式绘制它们的直方图分布,也就是说,共享x轴(另请参阅下面编辑后的绘图)。 但我希望x轴的比例是log,三个图中的箱子大小相同(后一个条件暂时可以放宽)。在

然后我想用log空间中的高斯函数拟合这些分布(即对数正态分布)。不知怎么的,我总是把拟合搞得一团糟,而高斯函数真的不能再现分布(它通常比实际分布平坦得多,或者其他奇怪的行为)。在

上次更新 以下是我设法得到的结果:拟合曲线并不像预期的那样进行

import pyfits
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
from scipy.optimize import curve_fit
import pylab as py

def gaussian(x, a, mean, sigma):
    return a * np.exp(-((x - mean)**2 / (2 * sigma**2)))

f, (ax1, ax2, ax3) = plt.subplots(3, sharex=True)
bins = np.histogram(standx, bins = 100)[1]

num_1, bins_1  = np.histogram(standx, np.histogram(standx, bins = 100)[1])
bins_01 = np.logspace( np.log10( standx.min()  ), np.log10(standx.max() ), 100 )
x_fit = py.linspace(bins_01[0], bins_01[-1], 100)
popt, pcov = curve_fit(gaussian, x_fit, num_1, p0=[1, np.mean(standx), np.std(standx)])
y_fit = gaussian(bins_01, *popt)
counts, edges, patches = ax1.hist(standx, bins_01, facecolor='blue', alpha=0.5) # bins=100
area = sum(np.diff(edges)*counts)

# calculate length of each bin (required for scaling PDF to histogram)
bins_log_len = np.zeros( x_fit.size )
for ii in range( counts.size):
    bins_log_len[ii] = edges[ii+1]-edges[ii]

# Create an array of length num_bins containing the center of each bin.
centers = 0.5*(edges[:-1] + edges[1:])
# Make a fit to the samples.
shape, loc, scale = stats.lognorm.fit(standx, floc=0)
# get pdf-values for same intervals as histogram
samples_fit_log = stats.lognorm.pdf( bins_01, shape, loc=loc, scale=scale )
# oplot fitted and scaled PDF into histogram

new_x = np.linspace(np.min(standx), np.max(standx), 100)
pdf = stats.norm.pdf(new_x, loc=np.log(scale), scale=shape)
ax1.plot(new_x, pdf*sum(counts), 'k-')
ax1.plot(bins_01, np.multiply(samples_fit_log,    bins_log_len)*sum(counts), 'g--', label='PDF using histogram bins', linewidth=2 )
ax1.set_xscale('log')
ax1.plot(x_fit, stats.norm.pdf(x_fit, popt[1], popt[2])*area,'r--',linewidth=2,label='Fit: $\mu$=%.3f , $\sigma$=%.3f'%(popt[1],popt[2]) )
ax1.legend(loc='best', frameon=False, prop={'size':15})

# And similar for the ax2, ax3 plots

下面是结果图: enter image description here

即使在直方图分布为零的情况下,顶部图中拟合的高斯左翼也会提升到零以上。我做错什么了?在

编辑2:下面是一个复制图中顶部图的数据示例。在

^{pr2}$

Tags: importlogpdfstatsnplocfithistogram
1条回答
网友
1楼 · 发布于 2024-06-28 23:58:13

我想你想要适应原木转换的箱子,并通过在每个箱子上乘以你上面所做的来修正缩放。在

enter image description here

def gaussian(x, a, mean, sigma):
    return a * np.exp(-((x - mean)**2 / (2 * sigma**2)))

f, (ax1, ax2, ax3) = plt.subplots(3, sharex=True)
bins = np.histogram(standx, bins = 100)[1]

from scipy.optimize import curve_fit
from scipy import stats


num_1, bins_1  = np.histogram(standx, np.histogram(standx, bins = 100)[1])

#log transform the bins!
bins_log=np.log10(bins_1[:-1])
bins_01 = np.logspace( np.log10( standx.min()  ), np.log10(standx.max() ), 100 )
x_fit = np.linspace(bins_01[0], bins_01[-1], 100)

#popt, pcov = curve_fit(gaussian, x_fit, num_1, p0=[1, np.mean(standx), np.std(standx)])
popt, pcov = curve_fit(gaussian, bins_log, num_1, p0=[1, np.mean(standx), np.std(standx)])

#y_fit = gaussian(bins_01, *popt)
y_fit = gaussian(bins_log, *popt)
counts, edges, patches = ax1.hist(standx, bins_01, facecolor='blue', alpha=0.5) # bins=100
area = sum(np.diff(edges)*counts)

# calculate length of each bin (required for scaling PDF to histogram)
bins_log_len = np.zeros( x_fit.size )
for ii in range( counts.size):
    bins_log_len[ii] = edges[ii+1]-edges[ii]

# Create an array of length num_bins containing the center of each bin.
centers = 0.5*(edges[:-1] + edges[1:])
# Make a fit to the samples.
shape, loc, scale = stats.lognorm.fit(standx, floc=0)
# get pdf-values for same intervals as histogram
samples_fit_log = stats.lognorm.pdf( bins_01, shape, loc=loc, scale=scale )
# oplot fitted and scaled PDF into histogram


new_x = np.linspace(np.min(standx), np.max(standx), 100)
pdf = stats.norm.pdf(new_x, loc=np.log(scale), scale=shape)
ax1.plot(new_x, pdf*sum(counts), 'k-')
ax1.plot(bins_01, np.multiply(samples_fit_log,    bins_log_len)*sum(counts), 'g ', label='PDF using histogram bins', linewidth=2 )

#ax1.plot(x_fit, stats.norm.pdf(x_fit, popt[1], popt[2])*area,'r ',linewidth=2,label='Fit: $\mu$=%.3f , $\sigma$=%.3f'%(popt[1],popt[2]) )

log_adjusted_pdf=np.multiply(bins_log_len,stats.norm.pdf(bins_log, popt[1], popt[2]))
scale_factor=len(standx)/sum(log_adjusted_pdf)
ax1.plot(bins_1[:-1], scale_factor*log_adjusted_pdf,'r ',linewidth=2,label='Fit: $\mu$=%.3f , $\sigma$=%.3f'%(popt[1],popt[2]) )
ax1.set_xscale('log')
ax1.legend(loc='best', frameon=False, prop={'size':15})

# And similar for the ax2, ax3 plots

相关问题 更多 >