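Python datastories package - PyPI

Data story pattern analysis for LOSD (Linked Open Statistical Data)

Detailed description of the datastories Python project

Data Story Patterns Library

The Data Story Patterns Library is a repository of pattern analyses designed for Linked Open Statistical Data. The story patterns were retrieved from literature research in the field of data journalism.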

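Installation

pip install datastories

Requirements are installed automatically together with the package.

### Import and usage

import datastories.analytical as patterns
patterns.DataStoryPattern(sparqlendpointurl, jsonmetadata)

The created object allows the SPARQL endpoint to be queried based on the provided JSON metadata.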

JSON template

{
  "cube_key": {
    "title": "title of cube",
    "dataset_structure": "URI for cube structure",
    "dimensions": {
      "dimension_key": {
        "dimension_title": "Title of dimension",
        "dimension_url": "URI for dimension",
        "dimension_prefix": "URI for dimension's values"
      },
      "dimension_key": {
        "dimension_title": "Title of dimension",
        "dimension_url": "URI for dimension",
        "dimension_prefix": "URI for dimension's values"
      }
    },
    "hierarchical_dimensions": {
      "dimension_key": {
        "dimension_title": "Title of dimension",
        "dimension_url": "URI for dimension",
        "dimension_prefix": "URI for dimension's values",
        "dimension_levels": {
          "level_key": "integer(granularity level)",
          "level_key": "integer(granularity level)"
        }
      }
    },
    "measures": {
      "measure_key": {
        "measure_title": "Title of measure",
        "measure_url": "URI for measure"
      }
    }
  }
}
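A minimal sketch of how the template might be filled in and checked before it is handed to the library. The cube key, titles, and every example.org URI below are hypothetical placeholders, and the `missing_keys` helper is an illustration, not part of datastories.

```python
import json

# Hypothetical metadata for one cube. Key names follow the template;
# all titles and URIs are made-up placeholders.
metadata = {
    "population": {
        "title": "Population by region",
        "dataset_structure": "http://example.org/structure/population",
        "dimensions": {
            "sex": {
                "dimension_title": "Sex",
                "dimension_url": "http://example.org/dimension/sex",
                "dimension_prefix": "http://example.org/code/sex/",
            }
        },
        "hierarchical_dimensions": {
            "region": {
                "dimension_title": "Region",
                "dimension_url": "http://example.org/dimension/region",
                "dimension_prefix": "http://example.org/code/region/",
                "dimension_levels": {"country": 1, "county": 2},
            }
        },
        "measures": {
            "count": {
                "measure_title": "Number of people",
                "measure_url": "http://example.org/measure/count",
            }
        },
    }
}

REQUIRED_CUBE_KEYS = {
    "title", "dataset_structure", "dimensions",
    "hierarchical_dimensions", "measures",
}


def missing_keys(cube_metadata):
    """Return the template keys that a cube entry does not define."""
    return sorted(REQUIRED_CUBE_KEYS - set(cube_metadata))


# Map every cube to the list of template keys it lacks (empty when complete).
problems = {cube: missing_keys(entry) for cube, entry in metadata.items()}

# Serialised form, in case the metadata is kept in a .json file.
json_metadata = json.dumps(metadata, indent=2)
```

Checking the structure up front gives a clearer error than a failed query would later on.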

Pattern descriptions

Measurement and Counting
League Table
Internal Comparison
Profile Outliers
Dissect Factors
Highlight Contrast
Start Big Drill Down
Start Small Zoom Out
Analysis By Category
Explore Intersection
Narrating Change Over Time

MCounting

Measurement and Counting: arithmetic operators applied to the whole dataset, giving basic information about the data

Attributes

def MCounting(self, cube="", dims=[], meas=[], hierdims=[], count_type="raw", df=pd.DataFrame())

Parameter	Description
cube	Cube whose dimensions and measures will be investigated
dims	List of dimensions (from the cube) to include in the investigation
meas	List of measures (from the cube) to include in the investigation
hierdims	Hierarchical dimension with the selected hierarchy level to include in the investigation
count_type	Type of count to perform
df	DataFrame object, if the data has already been retrieved from the endpoint

Output

Depends on the count_type value:

Count_type	Description
raw	data without any analysis performed
sum	sum across all numeric columns
mean	mean across all numeric columns
min	minimum values from all numeric columns
max	maximum values from all numeric columns
count	number of records
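The count types can be illustrated with a small standalone sketch. It works on plain lists of dictionaries instead of a DataFrame and does not call datastories; the function name `m_counting` and the sample rows are hypothetical.

```python
from statistics import mean


def m_counting(rows, meas, count_type="raw"):
    """Apply one count type to the numeric columns named in `meas`."""
    if count_type == "raw":
        return rows                      # data without any analysis
    if count_type == "count":
        return len(rows)                 # number of records
    aggregate = {"sum": sum, "mean": mean, "min": min, "max": max}[count_type]
    return {m: aggregate([row[m] for row in rows]) for m in meas}


# Made-up sample data.
rows = [
    {"region": "North", "births": 120, "deaths": 80},
    {"region": "South", "births": 200, "deaths": 150},
    {"region": "West", "births": 100, "deaths": 70},
]

totals = m_counting(rows, ["births", "deaths"], "sum")
```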

LTable

League Table: sorting and extraction of a specific number of records

Attributes

def LTable(self, cube=[], dims=[], meas=[], hierdims=[], columns_to_order="", order_type="asc", number_of_records=20, df=pd.DataFrame())

Parameter	Description
cube	Cube whose dimensions and measures will be investigated
dims	List of dimensions (from the cube) to include in the investigation
meas	List of measures (from the cube) to include in the investigation
hierdims	Hierarchical dimension with the selected hierarchy level to include in the investigation
columns_to_order	Set of columns to order by
order_type	Type of order (asc/desc)
number_of_records	Number of records to retrieve
df	DataFrame object, if the data has already been retrieved from the endpoint

Output

Depends on the order_type value:

order_type	Description
asc	ascending order based on the columns provided in columns_to_order
desc	descending order based on the columns provided in columns_to_order
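As a rough illustration of the league table idea (sort, then keep the first N records), here is a standalone sketch on plain dictionaries. The function `league_table` and the sample data are hypothetical and only mirror the parameter names above.

```python
def league_table(rows, columns_to_order, order_type="asc", number_of_records=20):
    """Sort row dicts by one or more columns and keep the first N."""
    if isinstance(columns_to_order, str):
        columns_to_order = [columns_to_order]
    ordered = sorted(
        rows,
        key=lambda row: tuple(row[c] for c in columns_to_order),
        reverse=(order_type == "desc"),
    )
    return ordered[:number_of_records]


# Made-up sample data.
rows = [
    {"county": "A", "population": 300},
    {"county": "B", "population": 900},
    {"county": "C", "population": 500},
]

top_two = league_table(rows, "population", order_type="desc", number_of_records=2)
```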

InternalComparison

InternalComparison: comparison of numeric values related to textual values within one column

Attributes

def InternalComparison(self, cube="", dims=[], meas=[], hierdims=[], df=pd.DataFrame(), dim_to_compare="", meas_to_compare="", comp_type="")

Parameter	Description
cube	Cube whose dimensions and measures will be investigated
dims	List of dimensions (from the cube) to include in the investigation
meas	List of measures (from the cube) to include in the investigation
hierdims	Hierarchical dimension with the selected hierarchy level to include in the investigation
df	DataFrame object, if the data has already been retrieved from the endpoint
dim_to_compare	Dimension whose values will be investigated
meas_to_compare	Measure whose numeric values related to dim_to_compare will be processed
comp_type	Type of comparison to perform

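Output

Regardless of the selected comp_type, the output data will have an additional column in which the numeric column meas_to_compare is processed in a specific way.

Available comparison types (comp_type)

Comp_type	Description
diffmax	difference from the maximum value related to a specific textual value
diffmean	difference from the arithmetic mean related to a specific textual value
diffmin	difference from the minimum value related to a specific textual value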
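One possible reading of these comparison types, sketched without the library: the reference value (max, mean, or min) is computed per textual value of `dim_to_compare`, and the extra column holds `value - reference`. Both the grouping and the sign convention are assumptions, as are the function name and sample data.

```python
from collections import defaultdict
from statistics import mean

REFERENCES = {"diffmax": max, "diffmean": mean, "diffmin": min}


def internal_comparison(rows, dim_to_compare, meas_to_compare, comp_type):
    """Add a column with the difference from a per-group reference value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[dim_to_compare]].append(row[meas_to_compare])
    reference = {key: REFERENCES[comp_type](vals) for key, vals in groups.items()}
    return [
        {**row, comp_type: row[meas_to_compare] - reference[row[dim_to_compare]]}
        for row in rows
    ]


# Made-up sample data.
rows = [
    {"region": "North", "year": 2019, "value": 10},
    {"region": "North", "year": 2020, "value": 14},
    {"region": "South", "year": 2019, "value": 5},
    {"region": "South", "year": 2020, "value": 9},
]

compared = internal_comparison(rows, "region", "value", "diffmax")
```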

ProfileOutliers

ProfileOutliers: detection of unusual values (anomalies) within the data

Attributes

def ProfileOutliers(self, cube=[], dims=[], meas=[], hierdims=[], df=pd.DataFrame(), displayType="outliers_only")

Parameter	Description
cube	Cube whose dimensions and measures will be investigated
dims	List of dimensions (from the cube) to include in the investigation
meas	List of measures (from the cube) to include in the investigation
hierdims	Hierarchical dimension with the selected hierarchy level to include in the investigation
df	DataFrame object, if the data has already been retrieved from the endpoint
display_type	Which data to display (with or without anomalies)

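Output

The pattern analysis uses the Python scipy library to perform a quick exploration in search of unusual values within the data.

Depending on the display_type parameter, the data is displayed with or without the detected unusual values.

Available display types (display_type)

display_type	Description
outliers_only	returns rows from dataset where unusual values were detected
without_outliers	returns dataset with excluded rows where unusual values were detected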
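The two display types can be sketched with a simple z-score rule. This is a swapped-in technique for illustration: it uses only the standard library rather than scipy, and the text above does not say which test the library applies. The threshold of 2.0, the function name, and the sample data are all assumptions.

```python
from statistics import mean, pstdev


def profile_outliers(rows, meas, display_type="outliers_only", threshold=2.0):
    """Split rows by whether `meas` lies more than `threshold` std devs from the mean."""
    values = [row[meas] for row in rows]
    centre, spread = mean(values), pstdev(values)

    def is_outlier(value):
        # With zero spread every value equals the mean, so nothing is unusual.
        return spread > 0 and abs(value - centre) / spread > threshold

    keep_outliers = display_type == "outliers_only"
    return [row for row in rows if is_outlier(row[meas]) == keep_outliers]


# Made-up sample data with one obviously unusual value.
rows = [{"week": i, "cases": c} for i, c in enumerate([10, 11, 9, 10, 12, 10, 50])]

unusual = profile_outliers(rows, "cases", "outliers_only")
cleaned = profile_outliers(rows, "cases", "without_outliers")
```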

DissectFactors

DissectFactors: decomposition of data based on the values in dim_to_dissect

Attributes

def DissectFactors(self, cube="", dims=[], meas=[], hierdims=[], df=pd.DataFrame(), dim_to_dissect="")

Parameter	Description
cube	Cube whose dimensions and measures will be investigated
dims	List of dimensions (from the cube) to include in the investigation
meas	List of measures (from the cube) to include in the investigation
hierdims	Hierarchical dimension with the selected hierarchy level to include in the investigation
df	DataFrame object, if the data has already been retrieved from the endpoint
dim_to_dissect	Dimension on which the data should be decomposed

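Output

As output, the data is decomposed into a dictionary of sub-datasets, where each subset contains only the rows related to one specific value. The dictionary is constructed as a series of pairs: each key is a value from dim_to_dissect, and the corresponding value is the data in which that key value occurred.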
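The dictionary shape described above can be sketched as follows, using plain row dictionaries in place of DataFrames. The function name and sample data are hypothetical.

```python
from collections import defaultdict


def dissect_factors(rows, dim_to_dissect):
    """Return {dimension value: rows in which that value occurs}."""
    subsets = defaultdict(list)
    for row in rows:
        subsets[row[dim_to_dissect]].append(row)
    return dict(subsets)


# Made-up sample data.
rows = [
    {"sex": "female", "age": "0-14", "count": 40},
    {"sex": "male", "age": "0-14", "count": 42},
    {"sex": "female", "age": "15-64", "count": 130},
]

by_sex = dissect_factors(rows, "sex")
```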

HighlightContrast

HighlightContrast: partial difference between values related to one textual column

Attributes

def HighlightContrast(self, cube="", dims=[], meas=[], hierdims=[], df=pd.DataFrame(), dim_to_contrast="", contrast_type="", meas_to_contrast="")

Parameter	Description
cube	Cube whose dimensions and measures will be investigated
dims	List of dimensions (from the cube) to include in the investigation
meas	List of measures (from the cube) to include in the investigation
hierdims	Hierarchical dimension with the selected hierarchy level to include in the investigation
df	DataFrame object, if the data has already been retrieved from the endpoint
dim_to_contrast	Textual column whose values will be contrasted
meas_to_contrast	Numerical column whose values are contrasted
contrast_type	Type of contrast to present

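Output

Regardless of the selected contrast_type, the output data will have an additional column in which the numeric column meas_to_contrast is processed in a specific way.

Available contrast types (contrast_type)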

Contrast_type	Description
partofwhole	value expressed as a part of the whole (total) related to specific textual values
partofmax	value expressed as a part of the maximum value related to specific textual values
partofmin	value expressed as a part of the minimum value related to specific textual values
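A hedged sketch of what the contrast types could compute if their names are read literally: each value divided by the total, the maximum, or the minimum of the contrasted column. This reading is an assumption and has not been checked against the library; the function name and data are hypothetical, and the base is assumed to be non-zero.

```python
BASES = {"partofwhole": sum, "partofmax": max, "partofmin": min}


def highlight_contrast(rows, meas_to_contrast, contrast_type):
    """Add a column with each value as a fraction of the chosen base."""
    base = BASES[contrast_type](row[meas_to_contrast] for row in rows)
    return [{**row, contrast_type: row[meas_to_contrast] / base} for row in rows]


# Made-up sample data: one row per textual value.
rows = [
    {"party": "A", "seats": 50},
    {"party": "B", "seats": 30},
    {"party": "C", "seats": 20},
]

shares = highlight_contrast(rows, "seats", "partofwhole")
```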

StartBigDrillDown

StartBigDrillDown: data retrieval from multiple hierarchical levels.

This pattern can only be applied to data that has not yet been retrieved into a DataFrame.

Attributes

def StartBigDrillDown(self, cube="", dims=[], meas=[], hierdim_drill_down=[])

Parameter	Description
cube	Cube whose dimensions and measures will be investigated
dims	List of dimensions (from the cube) to include in the investigation
meas	List of measures (from the cube) to include in the investigation
hierdim_drill_down	Hierarchical dimension with the list of hierarchy levels to inspect

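Output

As output, the data is retrieved as a dictionary in which each dataset comes from a different hierarchy level. The list of levels is provided in hierdim_drill_down. The hierarchy levels given in the parameter are automatically sorted from the most general to the most detailed, based on the provided metadata.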
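The automatic ordering can be pictured with the granularity integers from the `dimension_levels` entry of the JSON template. The sketch below assumes that a lower integer means a more general level; that convention, the level names, and the `fetch_level` stand-in for the SPARQL retrieval are all hypothetical.

```python
def order_levels(dimension_levels, requested_levels, most_general_first=True):
    """Sort level keys by the granularity integers taken from the metadata."""
    return sorted(
        requested_levels,
        key=lambda level: dimension_levels[level],
        reverse=not most_general_first,
    )


def drill_down(fetch_level, dimension_levels, requested_levels):
    """Build {level: dataset}, ordered from most general to most detailed."""
    ordered = order_levels(dimension_levels, requested_levels)
    return {level: fetch_level(level) for level in ordered}


# Hypothetical "dimension_levels" entry from the metadata.
region_levels = {"country": 1, "province": 2, "county": 3}

# The lambda stands in for the per-level data retrieval.
datasets = drill_down(
    lambda level: f"rows at {level} level",
    region_levels,
    ["county", "country", "province"],
)
```

Calling `order_levels` with `most_general_first=False` gives the reverse order, from most detailed to most general.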

StartSmallZoomOut

StartSmallZoomOut: data retrieval from multiple hierarchical levels.

This pattern can only be applied to data that has not yet been retrieved into a DataFrame.

Attributes

def StartSmallZoomOut(self, cube="", dims=[], meas=[], hierdim_zoom_out=[])

Parameter	Description
cube	Cube whose dimensions and measures will be investigated
dims	List of dimensions (from the cube) to include in the investigation
meas	List of measures (from the cube) to include in the investigation
hierdim_zoom_out	Hierarchical dimension with the list of hierarchy levels to inspect

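Output

As output, the data is retrieved as a dictionary in which each dataset comes from a different hierarchy level. The list of levels is provided in hierdim_zoom_out. The hierarchy levels given in the parameter are automatically sorted from the most detailed to the most general, based on the provided metadata.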

AnalysisByCategory

AnalysisByCategory: grouping of data based on the values in dim_for_category and analysis of each subset

Attributes

def AnalysisByCategory(self, cube="", dims=[], meas=[], hierdims=[], df=pd.DataFrame(), dim_for_category="", meas_to_analyse="", analysis_type="min"):

Parameter	Description
cube	Cube whose dimensions and measures will be investigated
dims	List of dimensions (from the cube) to include in the investigation
meas	List of measures (from the cube) to include in the investigation
hierdims	Hierarchical dimension with the selected hierarchy level to include in the investigation
df	DataFrame object, if the data has already been retrieved from the endpoint
dim_for_category	Dimension by which the input data will be categorised
meas_to_analyse	Measure that will be analysed
analysis_type	Type of analysis to perform

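Output

As output, the data is decomposed into a dictionary in which each subset contains only the values related to one specific category. Each subset is then analysed according to the analysis_type parameter.

Available analysis types (analysis_type)

Analysis_type	Description
min	Minimum per category
max	Maximum per category
mean	Arithmetic mean per category
sum	Total value per category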
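A standalone sketch of the group-then-analyse idea, again with plain dictionaries rather than the library. For brevity it returns one aggregated number per category, which may differ from the exact structure the pattern returns; the function name and sample data are hypothetical.

```python
from collections import defaultdict
from statistics import mean

ANALYSES = {"min": min, "max": max, "mean": mean, "sum": sum}


def analysis_by_category(rows, dim_for_category, meas_to_analyse, analysis_type="min"):
    """Group values of one measure by category and aggregate each group."""
    categories = defaultdict(list)
    for row in rows:
        categories[row[dim_for_category]].append(row[meas_to_analyse])
    return {cat: ANALYSES[analysis_type](vals) for cat, vals in categories.items()}


# Made-up sample data.
rows = [
    {"sector": "retail", "employees": 12},
    {"sector": "retail", "employees": 30},
    {"sector": "health", "employees": 45},
    {"sector": "health", "employees": 15},
]

per_sector_total = analysis_by_category(rows, "sector", "employees", "sum")
```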

ExploreIntersection

Attributes

def ExploreIntersection(self, dim_to_explore=""):

Parameter	Description
dim_to_explore	Dimension whose presence within the endpoint will be investigated

Output

The pattern returns a series of datasets, where each dataset represents the occurrence of dim_to_explore in one cube.
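To give a feel for the idea, the sketch below looks for a dimension across cubes in a metadata dictionary shaped like the JSON template. It only inspects local metadata, whereas the pattern itself investigates the endpoint; matching by dimension key, the function name, and the sample cubes are assumptions.

```python
def cubes_with_dimension(metadata, dim_to_explore):
    """List the cube keys whose metadata declares the given dimension key."""
    found = []
    for cube_key, cube in metadata.items():
        declared = {
            **cube.get("dimensions", {}),
            **cube.get("hierarchical_dimensions", {}),
        }
        if dim_to_explore in declared:
            found.append(cube_key)
    return found


# Hypothetical, heavily shortened metadata for three cubes.
metadata = {
    "population": {"dimensions": {"sex": {}}, "hierarchical_dimensions": {"region": {}}},
    "employment": {"dimensions": {"sex": {}, "sector": {}}},
    "rainfall": {"hierarchical_dimensions": {"region": {}}},
}

cubes_with_region = cubes_with_dimension(metadata, "region")
```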

NarrChangeOT

Attributes

def NarrChangeOT(self, cube="", dims=[], meas=[], hierdims=[], df=pd.DataFrame(), meas_to_narrate="", narr_type="")

Parameter	Description
cube	Cube whose dimensions and measures will be investigated
dims	List of dimensions (from the cube) to include in the investigation
meas	List of measures (from the cube) to include in the investigation
hierdims	Hierarchical dimension with the selected hierarchy level to include in the investigation
df	DataFrame object, if the data has already been retrieved from the endpoint
meas_to_narrate	Set of 2 measures whose change will be narrated
narr_type	Type of narration to perform

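Output

Regardless of the selected narr_type, the output data will have an additional column in which the numeric values are processed in a specific way.

Available narration types (narr_type)

Narr_type	Description
percchange	Percentage change between first and second property
diffchange	Quantitative change between first and second property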
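The two narration types can be sketched as below. The formulas are assumptions: `diffchange` is taken as `second - first` and `percchange` as that difference relative to the first measure, times 100. The function name and sample data are hypothetical, and the first measure is assumed to be non-zero.

```python
def narr_change(rows, meas_to_narrate, narr_type):
    """Add a column describing the change from the first to the second measure."""
    if narr_type not in ("percchange", "diffchange"):
        raise ValueError(f"unknown narr_type: {narr_type!r}")
    first, second = meas_to_narrate

    def change(row):
        diff = row[second] - row[first]
        return diff if narr_type == "diffchange" else diff / row[first] * 100

    return [{**row, narr_type: change(row)} for row in rows]


# Made-up sample data with two measures per row.
rows = [
    {"county": "A", "pop_2011": 200, "pop_2016": 250},
    {"county": "B", "pop_2011": 80, "pop_2016": 60},
]

percent = narr_change(rows, ("pop_2011", "pop_2016"), "percchange")
absolute = narr_change(rows, ("pop_2011", "pop_2016"), "diffchange")
```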

datastories 0.3.11
