How does matplotlib calculate the density of a histogram?

Posted 2024-09-24 22:28:54


The documentation for matplotlib's plt.hist describes a density parameter that can be set to True:

density : bool, optional
            If ``True``, the first element of the return tuple will
            be the counts normalized to form a probability density, i.e.,
            the area (or integral) under the histogram will sum to 1.
            This is achieved by dividing the count by the number of
            observations times the bin width and not dividing by the total
            number of observations. If *stacked* is also ``True``, the sum of
            the histograms is normalized to 1.

This is achieved by dividing the count by the number of observations times the bin width and not dividing by the total number of observations

I tried to reproduce this with sample data.

**Using matplotlib's built-in calculation**:

ser = pd.Series(np.random.normal(size=1000))
ser.hist(density = 1,  bins=100)

**Manual calculation of the density**:

arr_hist , edges = np.histogram( ser, bins =100)
samp = arr_hist / ser.shape[0] * np.diff(edges)
plt.bar(edges[0:-1] , samp )
plt.grid()

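The y-axis scales of the two plots are completely different. Can someone point out what exactly is going wrong, and how to reproduce the density calculation manually?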


1 answer
User
Reply #1 · posted 2024-09-24 22:28:54

This is an ambiguity in the language. The sentence

This is achieved by dividing the count by the number of observations times the bin width

needs to be read as

This is achieved by dividing (the count) by (the number of observations times the bin width)

count / (number of observations * bin width)
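The difference between the two readings can be checked on a tiny example. The sketch below uses made-up toy data (four values, two bins of width 0.5) that is not part of the original question; only the reading with the parentheses gives bars whose total area is 1.

```python
import numpy as np

# Hypothetical toy data: 4 observations, 2 bins of width 0.5 on [0, 1].
data = np.array([0.1, 0.2, 0.3, 0.9])
counts, edges = np.histogram(data, bins=2, range=(0.0, 1.0))
widths = np.diff(edges)
n = data.size

# Misreading: (count / number of observations) * bin width
wrong = counts / n * widths
# Intended reading: count / (number of observations * bin width)
right = counts / (n * widths)

# Area under each set of bars = sum of height * width
wrong_area = (wrong * widths).sum()
right_area = (right * widths).sum()

print("misreading:", wrong, "area =", wrong_area)
print("intended:  ", right, "area =", right_area)
```

With these numbers the misreading yields an area of 0.25, while the intended reading yields 1.0, which is what a probability density requires.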

Complete code:

import numpy as np
import matplotlib.pyplot as plt

arr = np.random.normal(size=1000)

fig, (ax1, ax2) = plt.subplots(2)
ax1.hist(arr, density = True,  bins=100)
ax1.grid()


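# Density = count / (number of observations * bin width)
arr_hist, edges = np.histogram(arr, bins=100)
samp = arr_hist / (arr.shape[0] * np.diff(edges))
# align="edge" puts each bar's left side on its bin's left edge, as in the histogram
ax2.bar(edges[:-1], samp, width=np.diff(edges), align="edge")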
ax2.grid()

plt.show()
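Comparing two plots by eye is easy to get wrong, so a numeric check may help. The following is a minimal sketch that compares the manual formula with `np.histogram(..., density=True)`, which takes the same `density` flag; the fixed seed and the variable names `manual` and `reference` are choices made here for illustration, not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
arr = rng.normal(size=1000)

# Manual density: count / (number of observations * bin width)
counts, edges = np.histogram(arr, bins=100)
widths = np.diff(edges)
manual = counts / (arr.size * widths)

# NumPy's own density normalization on the same data and bins
reference, _ = np.histogram(arr, bins=100, density=True)

matches = np.allclose(manual, reference)
total_area = (manual * widths).sum()

print("manual matches density=True:", matches)
print("total area under the bars:", total_area)
```

If both checks pass, the manual bars have the same heights as the normalized histogram and integrate to 1.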

