Simple funnel plots
Detailed description of the funnelplot Python project
Funnel plots
Simple funnel plots for visualising sub-group variance.
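This package provides simple funnel plots in Python, using Matplotlib. This lets you quickly see whether sub-groups of a population are outliers compared to the full population.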
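Two methods are provided:
- parametric funnel plot, which uses a standard distribution (usually the normal distribution) to estimate the funnel intervals
- bootstrap funnel plot, which uses bootstrap percentiles to estimate the funnel intervals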
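To make the parametric variant concrete, here is a minimal sketch of how a normal-theory funnel interval could be computed for a given group size. It is not the package's implementation: the function name `parametric_interval` and the toy population are assumptions for illustration, and only the Python standard library is used. The key point is that the band shrinks with the square root of the group size, which gives the plot its funnel shape.

```python
from statistics import NormalDist, fmean, pstdev


def parametric_interval(population, group_size, percentage=95.0):
    """Normal-theory interval for the mean of `group_size` values
    drawn from `population` (hypothetical helper, not part of funnelplot)."""
    mu = fmean(population)
    sigma = pstdev(population)
    tail = (100.0 - percentage) / 200.0          # 95 -> 0.025 in each tail
    z = NormalDist().inv_cdf(1.0 - tail)         # two-sided critical value
    half_width = z * sigma / group_size ** 0.5   # band narrows with sqrt(n)
    return mu - half_width, mu + half_width


# Toy population (invented for this sketch): the integers 1..100.
population = [float(v) for v in range(1, 101)]
for n in (4, 16, 64):
    lo, hi = parametric_interval(population, n)
    print(f"n={n:3d}  interval=({lo:.2f}, {hi:.2f})")
```

Quadrupling the group size halves the interval width, so small groups need a much larger deviation from the population mean before they fall outside the funnel.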
A utility function funnel() is provided, which makes it easy to plot data by grouping a Pandas DataFrame, with a Seaborn-like API.
Example
funnel(df=data("Caschool"), x="testscr", group="county")
Installation
pip install funnelplot
Examples
Full Caschool example
C:\Users\John\Dropbox\devel\funnelplot\funnelplot\core.py:14: RuntimeWarning: invalid value encountered in true_divide
  return band / np.sqrt(group_size)
C:\Users\John\Dropbox\devel\funnelplot\funnelplot\core.py:14: RuntimeWarning: divide by zero encountered in true_divide
  return band / np.sqrt(group_size)
# use bootstrap instead of normal fit
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(5, 6))
ax.set_frame_on(False)
funnel(df=data("Caschool"), x='testscr', group="county", bootstrap_mode=True, error_mode="bootstrap")
Synthetic data example
## Synthetic data
import numpy as np
import random

random.seed(2020)
np.random.seed(2020)
groups = []
p_mean, p_std = 0, 1
# random groups, with different sizes, means and std. devs.
for i in range(25):
    n_group = np.random.randint(1, 80)
    g_std = np.random.uniform(0.1, 4.5)
    g_mean = np.random.uniform(-1.9, 0.5)
    groups.append(np.random.normal(p_mean + g_mean, p_std + g_std, n_group))
# plt.subplots returns (figure, axes), in that order
fig, ax = plt.subplots(figsize=(9, 4))
funnel_plot(
    groups,
    labels=[random.choice("abcdefg") * 4 for i in range(len(groups))],
    percentage=95,
)

fig, ax = plt.subplots(figsize=(9, 4))
# bootstrap version, using medians instead of means
funnel_plot_bootstrap(
    groups,
    labels=[random.choice("abcdefg") * 4 for i in range(len(groups))],
    percentage=95,
    stat=np.median,
)
API
- funnel(df, x, group, bootstrap_mode=False)
Show the DataFrame df as a funnel plot, rendering the column x and grouping the data by group.
Parameters:
    df: DataFrame
        The data to be shown.
    x: string, column name
        The column of the frame to render as datapoints.
    group: string, column name
        The column to group the frame by.
    bootstrap_mode: boolean, optional (default False)
        If True, uses the funnel_plot_bootstrap() function; otherwise uses the parametric funnel_plot() function.
    **kwargs:
        Passed to funnel_plot() / funnel_plot_bootstrap().
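The grouping step described above can be sketched as follows. This is only an illustration of the idea of turning one column, split by a grouping column, into the list of 1D arrays that funnel_plot() expects; the helper `split_by_group`, the column names and the values are all invented here, and plain dictionaries stand in for a DataFrame so the sketch runs with the standard library alone.

```python
from collections import defaultdict


def split_by_group(rows, x, group):
    """Collect the `x` value of every row under its `group` key.

    Hypothetical helper, not part of funnelplot. With pandas, roughly the
    same result comes from iterating over df.groupby(group)[x].
    """
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[group]].append(row[x])
    labels = sorted(buckets)
    return labels, [buckets[label] for label in labels]


# Toy records (invented values), standing in for DataFrame rows.
rows = [
    {"region": "A", "score": 1.0},
    {"region": "B", "score": 2.0},
    {"region": "A", "score": 3.0},
]
labels, data_groups = split_by_group(rows, x="score", group="region")
print(labels)       # one label per group
print(data_groups)  # one list of values per group, in the same order
```

Keeping the labels and the value lists in the same order matters, because the plotting functions take one label string per group.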
- funnel_plot(data_groups, ...)
Plot a list of arrays as a funnel plot.
Parameters:
    data_groups: list of 1D arrays
        A list of 1D arrays, the individual groups to be analysed.
    ax: axis, optional
        A Matplotlib axis to draw onto.
    dist: distribution function, like scipy.stats.norm(0,1)
        Distribution whose ppf and cdf are used for plotting.
    percentage: float, 0.0 -> 100.0 (default 95)
        Percentage of interval enclosed (e.g. percentage=95 will enclose 2.5% to 97.5%).
    labels: list of strings, optional
        One label string per group; shown only for those groups that lie outside the funnel.
    left_color: matplotlib color, optional (default C1)
        Color to render points to the left of the funnel bounds (negative outliers).
    right_color: matplotlib color, optional (default C2)
        Color to render points to the right of the funnel bounds (positive outliers).
    error_mode: string, optional (default "data")
        For each outlier group, can show:
        "data": original data values for that group, as a dot plot
        "none": no error bars
        "bootstrap": 95% bootstrap intervals, as lines
        "ci": 95% CI intervals, as lines
    show_rug: boolean, optional (default False)
        If True, show a rug plot at the bottom of the graph, for the whole group population.
    show_contours: boolean, optional (default True)
        If True, show additional contours.
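As a rough illustration of how percentage, labels and the left/right colors relate, the sketch below classifies each group as a negative outlier, a positive outlier, or inside the funnel. It assumes a normal distribution fitted to the pooled data and is not the package's code; `classify_groups` and the toy groups are hypothetical. Empty groups are skipped, since dividing by the square root of a zero group size is undefined.

```python
from statistics import NormalDist, fmean, pstdev


def classify_groups(data_groups, labels, percentage=95.0):
    """Return {label: "left" | "inside" | "right"} for each non-empty group.

    Hypothetical helper, not part of funnelplot.
    """
    pooled = [v for g in data_groups for v in g]
    mu, sigma = fmean(pooled), pstdev(pooled)
    tail = (100.0 - percentage) / 200.0   # percentage=95 -> 2.5% per tail
    z = NormalDist().inv_cdf(1.0 - tail)
    result = {}
    for label, g in zip(labels, data_groups):
        if not g:
            continue                      # no interval for an empty group
        half_width = z * sigma / len(g) ** 0.5
        m = fmean(g)
        if m < mu - half_width:
            result[label] = "left"        # negative outlier
        elif m > mu + half_width:
            result[label] = "right"       # positive outlier
        else:
            result[label] = "inside"
    return result


# Toy groups (invented values).
labels = ["mid", "high", "low"]
data_groups = [
    [-1, 0, 1, -1, 0, 1, -1, 0, 1],
    [4, 5, 6, 4, 5, 6],
    [-6, -5, -4, -6, -5, -4],
]
result = classify_groups(data_groups, labels)
# Only groups outside the funnel would get a visible label.
outlier_labels = [name for name, side in result.items() if side != "inside"]
print(result, outlier_labels)
```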
- funnel_plot_bootstrap(data_groups, ...)
Plot a list of arrays as a funnel plot, using bootstrap intervals instead of a parametric distribution.
Parameters:
    data_groups: list of 1D arrays
        A list of 1D arrays, the individual groups to be analysed.
    ax: axis, optional
        A Matplotlib axis to draw onto.
    percentage: float, 0.0 -> 100.0 (default 95)
        Percentage of interval enclosed (e.g. percentage=95 will enclose 2.5% to 97.5%).
    labels: list of strings, optional
        One label string per group; shown only for those groups that lie outside the funnel.
    left_color: matplotlib color, optional (default C1)
        Color to render points to the left of the funnel bounds (negative outliers).
    right_color: matplotlib color, optional (default C2)
        Color to render points to the right of the funnel bounds (positive outliers).
    bootstrap_n: int, optional (default 1000)
        Number of runs in the bootstrap.
    error_mode: string, optional (default "data")
        For each outlier group, can show:
        "data": original data values for that group, as a dot plot
        "none": no error bars
        "bootstrap": 95% bootstrap intervals, as lines
        "ci": 95% CI intervals, as lines
    show_rug: boolean, optional (default False)
        If True, show a rug plot at the bottom of the graph, for the whole group population.
    show_contours: boolean, optional (default True)
        If True, show additional contours.
    stat: function like np.mean, optional
        Statistic to use when plotting the funnel plot.
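The role of stat and bootstrap_n can be illustrated with a small percentile bootstrap. The sketch assumes one plausible scheme (resample group_size values from the pooled population with replacement, apply the statistic, and read off order statistics); the package may do this differently. The helper `bootstrap_interval`, its `seed` argument and the skewed toy population are assumptions made for this example.

```python
import random
from statistics import fmean, median


def bootstrap_interval(population, group_size, stat=fmean,
                       percentage=95.0, bootstrap_n=1000, seed=2020):
    """Percentile-bootstrap interval for `stat` of a group of `group_size`.

    Hypothetical helper, not part of funnelplot.
    """
    rng = random.Random(seed)
    # Resample with replacement, bootstrap_n times, and sort the statistics.
    stats = sorted(
        stat(rng.choices(population, k=group_size))
        for _ in range(bootstrap_n)
    )
    tail = (100.0 - percentage) / 200.0
    # Simple order-statistic approximation of the two percentiles.
    lo_idx = int(tail * (bootstrap_n - 1))
    hi_idx = int((1.0 - tail) * (bootstrap_n - 1))
    return stats[lo_idx], stats[hi_idx]


# Skewed toy population (invented): mostly 1s with a few large values.
population = [1.0] * 90 + [100.0] * 10
mean_lo, mean_hi = bootstrap_interval(population, 20, stat=fmean)
med_lo, med_hi = bootstrap_interval(population, 20, stat=median)
print("mean interval:  ", (mean_lo, mean_hi))
print("median interval:", (med_lo, med_hi))
```

Swapping the statistic changes what counts as an outlier: on skewed data like this, a median-based funnel is far less sensitive to a handful of extreme values than a mean-based one.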