<ul>
<li>Define the bin edges with <code>df.a.mean() ± df.a.std() * value</code>
<ul>
<li>The list comprehension in the code below creates the list of bin edges</li>
</ul>
</li>
<li>Use <a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.mean.html" rel="nofollow noreferrer"><code>pandas.Series.mean</code></a> to get the mean of the column</li>
<li>Use <a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.std.html" rel="nofollow noreferrer"><code>pandas.Series.std</code></a> to get the standard deviation of the column</li>
</ul>
<pre class="lang-py prettyprint-override"><code>import pandas as pd
import numpy as np # for sample data
import matplotlib.pyplot as plt
# create sample dataframe
np.random.seed(365)
data = {'a': [np.random.randint(700) for _ in range(3000)]}
df = pd.DataFrame(data)
# create the bin edges
bins = [df.a.mean() + (df.a.std() * v) for v in range(-5, 6, 1)]
print(bins)
# output (rounded to two decimals):
# [-652.44, -451.49, -250.55, -49.6, 151.35, 352.3, 553.25, 754.19, 955.14, 1156.09, 1357.04]
</code></pre>
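<p>As a minimal sketch, the same edge calculation can be wrapped in a small helper. The name <code>std_bin_edges</code> and its <code>n_std</code> parameter are hypothetical and only for illustration; they are not part of pandas. Note that <code>Series.std</code> uses the sample standard deviation (<code>ddof=1</code>) by default.</p>

```python
import numpy as np
import pandas as pd


def std_bin_edges(values: pd.Series, n_std: int = 5) -> np.ndarray:
    """Return edges at mean + k * std for k = -n_std, ..., n_std.

    Hypothetical helper for illustration; not a pandas function.
    """
    k = np.arange(-n_std, n_std + 1)  # integer multiples of the std
    return values.mean() + values.std() * k


# small deterministic sample so the result is easy to check by hand
sample = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
edges = std_bin_edges(sample, n_std=2)
print(edges)  # five edges, centred on the mean (3.0)
```

<p>With <code>n_std=5</code> this should give the same eleven edges as the list comprehension, assuming the same input column.</p>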
<ul>
<li>To bin the dataframe manually, <a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.groupby.html" rel="nofollow noreferrer">groupby</a> the bins and draw a bar plot
<ul>
<li><a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.plot.bar.html" rel="nofollow noreferrer"><code>pandas.DataFrame.plot.bar</code></a></li>
</ul>
</li>
<li>Use <a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.cut.html" rel="nofollow noreferrer"><code>pandas.cut</code></a> to create a new column containing the bins</li>
</ul>
<pre class="lang-py prettyprint-override"><code># create a column of bins
df['bins'] = pd.cut(df.a, bins=bins)
# groupby the bins and plot
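# observed=False keeps empty bins in the result and avoids the FutureWarning in recent pandas
df.groupby('bins', observed=False)['a'].count().plot.bar()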
</code></pre>
<p><a href="https://i.stack.imgur.com/zCG9Y.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/zCG9Y.png" alt="Bar plot of the count of values in each bin"/></a></p>
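<p>To inspect the per-bin counts without plotting, one option is to call <code>value_counts</code> on the result of <code>pd.cut</code>. The sketch below uses a small hand-made series and edge list (both are illustrative assumptions, not the data from the answer) so the counts can be verified by hand. By default <code>pd.cut</code> builds right-closed intervals, and empty bins are kept because the result is categorical.</p>

```python
import pandas as pd

# illustrative data, not the sample dataframe from the answer
values = pd.Series([1, 2, 2, 3, 7, 8, 9])
edges = [0, 3, 6, 9]

# default intervals are right-closed: (0, 3], (3, 6], (6, 9]
binned = pd.cut(values, bins=edges)

# count per bin, ordered by bin rather than by frequency
counts = binned.value_counts().sort_index()
print(counts)  # counts are 4, 0, 3
```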
<ul>
<li>Plot a histogram with the defined bin edges using <a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.plot.hist.html" rel="nofollow noreferrer"><code>pandas.DataFrame.plot.hist</code></a> or <a href="https://matplotlib.org/3.3.1/api/_as_gen/matplotlib.pyplot.hist.html" rel="nofollow noreferrer">matplotlib.pyplot.hist</a></li>
<li><a href="https://matplotlib.org/3.1.1/gallery/statistics/hist.html" rel="nofollow noreferrer">matplotlib: Histograms</a></li>
</ul>
<pre class="lang-py prettyprint-override"><code># matplotlib plot
plt.hist(x=df.a, bins=bins)
plt.ylabel('Frequency')
plt.show()
# or dataframe plot
df.a.plot.hist(bins=bins)
plt.show()
</code></pre>
<p><a href="https://i.stack.imgur.com/uBZfg.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/uBZfg.png" alt="Histogram of column a drawn with the defined bin edges"/></a></p>
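<p><code>matplotlib.pyplot.hist</code> bins its data with <code>numpy.histogram</code>, so the bar heights can be checked numerically. The sketch below uses illustrative values and edges (assumptions, not the data from the answer). Each NumPy bin is half-open, <code>[left, right)</code>, except the last one, which also includes its right edge. Because <code>pd.cut</code> defaults to right-closed intervals, a value that sits exactly on an edge can be counted in a different bin by the two approaches.</p>

```python
import numpy as np

# illustrative data, not the sample dataframe from the answer
values = np.array([1, 2, 2, 3, 7, 8, 9])
edges = [0, 3, 6, 9]

# bins are [0, 3), [3, 6), [6, 9]; the value 3 falls in the second bin
counts, returned_edges = np.histogram(values, bins=edges)
print(counts)  # [3 1 3]
```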