Graph-theoretic scatterplot diagnostics

Detailed description of the pyscagnostics Python project


pyscagnostics

A Python wrapper for computing graph-theoretic scatterplot diagnostics.

Scagnostics describe various measures of interest for pairs of variables, based on their appearance on a scatterplot. They are a useful tool for discovering interesting or unusual scatterplots in a scatterplot matrix without having to look at every individual plot.

Wilkinson, L., Anand, A., and Grossman, R. (2006). High-dimensional visual analytics: Interactive exploration guided by pairwise views of point distributions. IEEE Transactions on Visualization and Computer Graphics, November/December 2006 (Vol. 12, No. 6), pp. 1363-1372.

Installation

pip install pyscagnostics

Usage

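Call `scagnostics` with an x, y pair of iterables (for example lists or NumPy arrays) or with a single Pandas DataFrame. The docstring in the next section gives an example of each.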

Documentation

def scagnostics(*args, bins: int = 50, remove_outliers: bool = True) -> Tuple[dict, np.ndarray]:
    """Scatterplot diagnostic (scagnostic) measures

    Scagnostics describe various measures of interest for pairs of variables,
    based on their appearance on a scatterplot.  They are a useful tool for
    discovering interesting or unusual scatterplots from a scatterplot matrix,
    without having to look at every individual plot.

    Example:
        `scagnostics` can take an x, y pair of iterables (e.g. lists or NumPy arrays):
        ```
            from pyscagnostics import scagnostics
            import numpy as np

            # Simulate data for example
            x = np.random.uniform(0, 1, 100)
            y = np.random.uniform(0, 1, 100)

            measures, bins = scagnostics(x, y)
        ```

        A Pandas DataFrame can also be passed as the singular required argument. The
        output will be a generator of results:
        ```
            from pyscagnostics import scagnostics
            import numpy as np
            import pandas as pd

            # Simulate data for example
            x = np.random.uniform(0, 1, 100)
            y = np.random.uniform(0, 1, 100)
            z = np.random.uniform(0, 1, 100)
            df = pd.DataFrame({
                'x': x,
                'y': y,
                'z': z
            })

            results = scagnostics(df)
            for x, y, result in results:
                measures, bins = result
                print(measures)
        ```

    Args:
        *args:
            x, y: Lists or numpy arrays
            df: A Pandas DataFrame
        bins: Max number of bins for the hexagonal grid axis
            The data are internally binned starting with a (bins x bins) hexagonal grid
            and re-binned with smaller bin sizes until fewer than 250 empty bins remain.
        remove_outliers: If True, will remove outliers before calculations

    Returns:
        (measures, bins)
            measures is a dict with scores for each of 9 scagnostic measures.
                See pyscagnostics.measure_names for a list of measures
            bins is a 3 x n numpy array of x-coordinates, y-coordinates, and
                counts for the hex-bin grid. The x and y coordinates are re-scaled
                between 0 and 1000. This is returned for debugging and inspection purposes.

        If the input is a DataFrame, the output will be a generator yielding tuples of
        scagnostic results for each column pair:
            (x, y, (measures, bins))
    """
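
Because the DataFrame form yields one `(x, y, (measures, bins))` tuple per column pair, a natural next step is to rank the pairs by a single measure and look only at the top few plots. The following is a minimal sketch of that idea, not part of pyscagnostics: `rank_pairs` and `fake_results` are hypothetical names, the scores are made-up illustrative values, and the key `"Monotonic"` is an assumption about how the measures dict is keyed (check `pyscagnostics.measure_names` for the actual names).

```python
from typing import Any, Dict, Iterable, List, Tuple

# One item of the DataFrame output described above: (x, y, (measures, bins)).
PairResult = Tuple[str, str, Tuple[Dict[str, float], Any]]


def rank_pairs(
    results: Iterable[PairResult], measure: str, top: int = 3
) -> List[Tuple[str, str, float]]:
    """Return the `top` column pairs with the highest score for `measure`.

    Hypothetical helper; it only relies on the documented output shape.
    """
    scored = [(x, y, measures[measure]) for x, y, (measures, _bins) in results]
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:top]


# Hand-written stand-in for the output of `scagnostics(df)`, so the sketch runs
# without the package installed. The key "Monotonic" and the scores are
# assumptions made for illustration only.
fake_results = [
    ("x", "y", ({"Monotonic": 0.05}, None)),
    ("x", "z", ({"Monotonic": 0.91}, None)),
    ("y", "z", ({"Monotonic": 0.40}, None)),
]

ranked = rank_pairs(fake_results, "Monotonic", top=2)
print(ranked)
```

With the package installed, the same helper should accept the generator directly, for example `rank_pairs(scagnostics(df), "Monotonic")`, assuming that key exists in the returned dict.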
