pycspade: Python package on PyPI

C-SPADE Python implementation

Project description

pycspade

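What is this?

This is a Python wrapper around the C++ implementation of the C-SPADE algorithm written by its author, Mohammed J. Zaki. The original code was downloaded from http://www.cs.rpi.edu/~zaki/www-new/pmwiki.php/Software/Software#toc11

Because this is only a wrapper, it is as fast as the C++ code.

How to install?

Compatible with Python 2 and 3. On Windows, the Visual Studio 2015 build tools are also required.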

pip install Cython pycspade

How to use?

Your data needs to be in a specific format, similar to the following:

1 1 3 8 37 42
1 2 4 4 11 37 42
2 1 2 10 73
2 2 1 72
2 3 3 4 24 77
...

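The first number is the sequence index, the second is the event index, and the third is the number of elements, followed by the elements themselves, all space-separated.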
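As a rough illustration of the format described above, the following sketch parses one line into its sequence index, event index, and elements, and checks that the declared element count matches. The function name `parse_event_line` is hypothetical and is not part of pycspade.

```python
def parse_event_line(line):
    """Parse 'seq_id event_id n item_1 ... item_n' into (seq_id, event_id, items)."""
    fields = [int(tok) for tok in line.split()]
    if len(fields) < 3:
        raise ValueError("expected at least 3 fields: %r" % line)
    seq_id, event_id, count = fields[:3]
    items = fields[3:]
    # The third field must equal the number of elements that follow it.
    if count != len(items):
        raise ValueError(
            "declared %d elements, found %d: %r" % (count, len(items), line)
        )
    return seq_id, event_id, items


print(parse_event_line("1 1 3 8 37 42"))  # (1, 1, [8, 37, 42])
```

A check like this can catch malformed lines before the file is handed to the miner.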

Let's call this file data.txt. You would then call cspade as follows (the example below reads tests/zaki.txt):

from pycspade.helpers import spade, print_result

# To get raw SPADE output
result = spade(filename='tests/zaki.txt', support=0.3, parse=False)
print(result['mined'])

1 -- 4 4
2 -- 4 4
4 -- 2 2
6 -- 4 4
4 -> 6 -- 2 2
4 -> 2 -- 2 2
2 -> 1 -- 2 2
4 -> 1 -- 2 2
6 -> 1 -- 2 2
4 -> 6 -> 1 -- 2 2
4 -> 2 -> 1 -- 2 2

print(result['logger'])

CONF 492.7 2.5
args.MINSUPPORT 24
MINMAX 141 SUPP 42 SUPP 44 SUPP 26 SUPP 4
numfreq 4 :   SUMSUP SUMDIFF=00
EXTRARYSZ 2465792
OPENED /tmp/cspade-WWv9bQWBYdDyH85T.idx
OFF 938
Wrote Offt 
BOUNDS 15
WROTE INVERT 
Cleaned up successful: /tmp/cspade-WWv9bQWBYdDyH85T.tpose
Cleaned up successful: /tmp/cspade-WWv9bQWBYdDyH85T.idx
Cleaned up successful: /tmp/cspade-WWv9bQWBYdDyH85T.data
Cleaned up successful: /tmp/cspade-WWv9bQWBYdDyH85T.conf

print(result['summary'])

CONF 492.5 2.7 10140.781025 4
TPOSE SEQ NOF2 /tmp/cspade-WWv9bQWBYdDyH85T.data 0.3 421F1stats=[400]
SPADE /tmp/cspade-WWv9bQWBYdDyH85T.tpose 0.3 2700000 -1 1100100452000000000000000000000000000000000000000000000000000000000000000000000000000000

# To also get other sequence mining measures, incl. lift, support, confidence:
result = spade(filename='tests/zaki.txt', support=0.3, parse=True)

# Pretty print result:
print_result(result)

   Occurs     Accum   Support    Confid      Lift          Sequence
        4        14 1.0000000       N/A       N/A               (1)
        4         6 1.0000000       N/A       N/A               (2)
        2         4 0.5000000 0.5000000 0.5000000          (2)->(1)
        2         2 0.5000000       N/A       N/A               (4)
        2         2 0.5000000 1.0000000 1.0000000          (4)->(1)
        2         2 0.5000000 1.0000000 1.0000000          (4)->(2)
        2         2 0.5000000 1.0000000 1.0000000     (4)->(2)->(1)
        2         2 0.5000000 1.0000000 1.0000000          (4)->(6)
        2         2 0.5000000 1.0000000 1.0000000     (4)->(6)->(1)
        4         6 1.0000000       N/A       N/A               (6)
        2         4 0.5000000 0.5000000 0.5000000          (6)->(1)
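The Support, Confid and Lift values in this output appear consistent with the usual sequence-mining definitions. The sketch below recomputes them from the Occurs counts, assuming support = occurrences / number of sequences, confidence = occurrences of the full sequence / occurrences of the sequence without its last itemset, and lift = confidence / support of the last itemset. These definitions are an assumption inferred from the numbers shown, not taken from the pycspade source, and the helper `measures` is hypothetical.

```python
N_SEQUENCES = 4  # assumption: 4 occurrences correspond to a support of 1.0

# Occurs counts copied from the printed result, keyed by tuple of itemsets.
occurs = {
    ((1,),): 4,
    ((2,),): 4,
    ((4,),): 2,
    ((6,),): 4,
    ((2,), (1,)): 2,
    ((4,), (6,)): 2,
    ((4,), (6,), (1,)): 2,
}


def measures(sequence):
    """Return (support, confidence, lift); the last two are None for single itemsets."""
    support = occurs[sequence] / float(N_SEQUENCES)
    if len(sequence) == 1:
        return support, None, None
    # Confidence relative to the sequence without its last itemset.
    confidence = occurs[sequence] / float(occurs[sequence[:-1]])
    # Lift relative to the support of the last itemset on its own.
    lift = confidence / (occurs[sequence[-1:]] / float(N_SEQUENCES))
    return support, confidence, lift


print(measures(((2,), (1,))))        # (0.5, 0.5, 0.5)
print(measures(((4,), (6,), (1,))))  # (0.5, 1.0, 1.0)
```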

You can also give cspade a list of sequences instead of a file:

data = [
    [1, 10, [3, 4]],
    [1, 15, [1, 2, 3]],
    [1, 20, [1, 2, 6]],
    [1, 25, [1, 3, 4, 6]],
    [2, 15, [1, 2, 6]],
    [2, 20, [5]],
    [3, 10, [1, 2, 6]],
    [4, 10, [4, 7, 8]],
    [4, 20, [2, 6]],
    [4, 25, [1, 7, 8]],
]

result = spade(data=data, support=0.01)
print_result(result)
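If a file is preferred, a list like this one could be written out in the text format described earlier. The following is a minimal sketch, assuming each entry is `[sequence_index, event_index, items]` and that rows should be ordered by sequence and then event index; `to_spade_lines` is a hypothetical helper, not a pycspade function.

```python
def to_spade_lines(data):
    """Render [seq_id, event_id, items] rows as 'seq_id event_id n items...' lines."""
    lines = []
    # Assumption: order rows by sequence index, then event index.
    for seq_id, event_id, items in sorted(data, key=lambda row: (row[0], row[1])):
        fields = [seq_id, event_id, len(items)] + list(items)
        lines.append(" ".join(str(field) for field in fields))
    return lines


sample = [[1, 15, [1, 2, 3]], [1, 10, [3, 4]], [2, 20, [5]]]
print("\n".join(to_spade_lines(sample)))
# 1 10 2 3 4
# 1 15 3 1 2 3
# 2 20 1 5
```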

The mined result is a multi-line string like the following:

22 80 -> 72 -> 42 -> 22 -- 2 2
22 -> 45 71 -> 42 -- 1 1
80 -> 45 71 -> 42 -- 1 1
22 80 -> 45 71 -> 42 -- 1 1

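Let's decipher the first line:

22 80 -> 72 -> 42 -> 22 -- 2 2

It gives the frequent sequence followed by its support (the last two numbers, which are the same in this application). The line reads: itemset (22 80), followed by (72), followed by (42), followed by (22).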
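The line structure described above can be split mechanically. The sketch below assumes the separators are exactly ` -- ` between the sequence and the two trailing numbers, and ` -> ` between itemsets, as in the sample lines; `parse_mined_line` is a hypothetical helper and not part of pycspade.

```python
def parse_mined_line(line):
    """Split 'a b -> c -- s1 s2' into ([[a, b], [c]], [s1, s2])."""
    pattern, counts = line.strip().rsplit(" -- ", 1)
    # Each ' -> ' separated part is one itemset of space-separated items.
    itemsets = [[int(tok) for tok in part.split()] for part in pattern.split(" -> ")]
    support = [int(tok) for tok in counts.split()]
    return itemsets, support


print(parse_mined_line("22 80 -> 72 -> 42 -> 22 -- 2 2"))
# ([[22, 80], [72], [42], [22]], [2, 2])
```

Applied to every line of the raw output, this yields a list of (itemsets, support) pairs that is easier to filter or sort than the raw string.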

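There are many parameters that can be passed to this function. The most important are:

support: the minimum support level; defaults to 0 (nothing is excluded)
max_gap: the maximum number of itemsets that may be skipped in a sequence
min_gap: the minimum number of itemsets that must be skipped in a sequence

Read the original paper and the C++ implementation for more details.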
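To make the two gap parameters concrete, here is a small pure-Python sketch of the constraint as it is worded above: between two consecutive matched itemsets, the number of skipped itemsets must lie between min_gap and max_gap. This only illustrates that description; it is not pycspade's code, the exact semantics in the C++ implementation may differ, and `gaps_ok` is a hypothetical name.

```python
def gaps_ok(positions, min_gap=None, max_gap=None):
    """Check gap constraints for one occurrence of a pattern.

    positions: ascending indexes of the itemsets (within one sequence)
    that the pattern's elements were matched against.
    """
    for prev, cur in zip(positions, positions[1:]):
        skipped = cur - prev - 1  # itemsets lying strictly between the two matches
        if min_gap is not None and skipped < min_gap:
            return False
        if max_gap is not None and skipped > max_gap:
            return False
    return True


print(gaps_ok([0, 1, 2], max_gap=0))  # True: nothing is skipped
print(gaps_ok([0, 2], max_gap=0))     # False: one itemset is skipped
print(gaps_ok([0, 2], min_gap=1))     # True: at least one itemset is skipped
```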

How to contribute?

Fork this repo
Make your changes
Open a pull request

How to recompile for use in an IDE?

rm cspade.cpp; python setup.py build_ext --inplace

pycspade 0.6.2

License
MIT