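Python PlagiarismDetector package - PyPI

Calculates the percentage of n-tuples from file 1 that are also found in file 2.

PlagiarismDetector project description

Calculate the percentage of n-tuples from file 1 that are found in file 2: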

from plagiarismdetector.detector import Detector

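# Arguments: synonyms file, file to evaluate, source file to match against.
# The parenthesized print form also runs on Python 2.7.
print(Detector.detect(synonyms_file_path,
                      eval_file_path,
                      source_file_path,
                      n_tuples_value=3))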

Running

python plagiarismdetector/main.py synonyms_file_path eval_file_path source_file_path 3

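Assumptions and overview

Depends on Python 2.7.
The tokenizer works only for English text because it uses the Penn Treebank tokenizer, which splits strings according to English-specific structures. These rules may fail for other languages such as Hindi, where sentence delimiters and punctuation are entirely different.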
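To illustrate the limitation, here is a minimal sketch in Python 3 that uses two hypothetical English-only rules (splitting contractions and ASCII punctuation) as a stand-in for Treebank-style tokenization. It is not the Penn Treebank tokenizer and not this package's code; `tokenize_english` is an assumed name. The Hindi sentence ends each clause with a danda (।), which the English rules do not recognize, so it stays attached to the preceding word.

```python
import re

# Hypothetical, heavily simplified English rules (NOT the real Treebank rules).
_CONTRACTION = re.compile(r"(\w)(n't|'s|'re|'ve|'ll|'d|'m)\b", re.IGNORECASE)
_ASCII_PUNCT = re.compile(r"([.,!?;:])")


def tokenize_english(text):
    """Split contractions and ASCII punctuation, then split on whitespace."""
    text = _CONTRACTION.sub(r"\1 \2", text)   # "Don't" -> "Do n't"
    text = _ASCII_PUNCT.sub(r" \1 ", text)    # "run."  -> "run . "
    return text.split()


english = tokenize_english("Don't go for a run.")
hindi = tokenize_english("यह ठीक है। चलो।")

print(english)  # ['Do', "n't", 'go', 'for', 'a', 'run', '.']
print(hindi)    # the danda is not split off: ['यह', 'ठीक', 'है।', 'चलो।']
```

Because the sentence delimiter is never separated, n-tuples built from such tokens would treat "है।" and "है" as different words.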
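The module is optimized to be as fast as possible. Some of the optimizations are:
- Only the n-grams of file 2 are generated and stored; the n-tuples of file 1 are generated but not stored.
- Generated n-grams are not held in memory; generators are used instead.
- An n-gram dictionary is built from the file 2 n-grams so that file 1 tuples can be looked up in constant time.
- Keys in the file 2 n-gram dictionary hold the hash of each tuple rather than the tuple itself, to reduce space complexity.
- Because we only care about the percentage of file 1 n-tuples found in file 2, no file 1 tuples need to be stored. So we first generate the n-grams for file 2 and then compute the count for file 1 on the fly, instead of generating all n-tuples of file 1 and cross-referencing them against those of file 2.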
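The optimizations listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's actual code: the names `ngrams` and `percent_in_file2` are hypothetical, tokens are produced by a plain whitespace split for brevity, and a set of hashes stands in for the dictionary keyed by hashes.

```python
from collections import deque


def ngrams(tokens, n):
    """Lazily yield n-tuples from any iterable of tokens (a generator)."""
    window = deque(maxlen=n)          # sliding window of the last n tokens
    for token in tokens:
        window.append(token)
        if len(window) == n:
            yield tuple(window)


def percent_in_file2(file1_tokens, file2_tokens, n=3):
    """Percentage of file 1 n-tuples whose hash also occurs in file 2."""
    # Store only the hashes of the file 2 n-grams, not the tuples themselves.
    file2_hashes = {hash(gram) for gram in ngrams(file2_tokens, n)}

    total = found = 0
    # File 1 n-tuples are generated one at a time and never stored.
    for gram in ngrams(file1_tokens, n):
        total += 1
        if hash(gram) in file2_hashes:   # constant-time lookup on average
            found += 1
    return 100.0 * found / total if total else 0.0


print(percent_in_file2("go for a run".split(), "go for a jog".split()))  # 50.0
```

Storing hashes instead of tuples saves memory, at the cost that two different tuples with the same hash would be counted as a match. The 50.0 result here reflects the absence of any synonym handling in this sketch.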

Tests

python -m unittest discover tests

Help

python plagiarismdetector/main.py -h

Positional arguments

Given in the same order as in the run command above:
- Path to the file to be used for synonyms
- Path to the file to be evaluated
- Path to the file to be used as the source for matching
- Number of N-tuples; optional, defaults to 3

Optional arguments

-h, --help show this help message and exit
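A command line with three required positional arguments and one optional positional argument defaulting to 3 could be declared with `argparse` roughly like this. The argument names are assumptions borrowed from the run command above, and `build_parser` is a hypothetical helper; the package's real `main.py` may differ.

```python
import argparse


def build_parser():
    """Build a parser matching the help text; names are assumed, not confirmed."""
    parser = argparse.ArgumentParser(
        description="Percentage of n-tuples from one file found in another.")
    parser.add_argument("synonyms_file_path",
                        help="Path to file to be used for synonyms")
    parser.add_argument("eval_file_path",
                        help="Path to file to be evaluated")
    parser.add_argument("source_file_path",
                        help="Path to file to be used as source for matching")
    # nargs="?" makes a positional argument optional; default applies if omitted.
    parser.add_argument("n_tuples_value", nargs="?", type=int, default=3,
                        help="Number of N-tuples, optional, defaults to 3")
    return parser


# Hypothetical file names, used only to show the parsing behaviour.
args = build_parser().parse_args(["syns.txt", "eval.txt", "source.txt"])
print(args.n_tuples_value)  # 3
```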

Examples

Returns:          100.0
Evaluation File:  go for a run
Source File:      go for a jog
N-tuples:         3
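One way such an example can reach 100.0 is if the synonyms file treats "run" and "jog" as equivalent. The sketch below assumes exactly that: the synonym groups, the "map every word to the first word of its group" strategy, and all function names are hypothetical illustrations, not the package's documented behaviour or file format.

```python
def build_synonym_map(groups):
    """Map every word in a synonym group to the group's first word."""
    mapping = {}
    for group in groups:
        canonical = group[0]
        for word in group:
            mapping[word] = canonical
    return mapping


def normalize(tokens, mapping):
    """Replace each token by its canonical synonym, if it has one."""
    return [mapping.get(token, token) for token in tokens]


def ngrams(tokens, n):
    """Return all n-tuples of consecutive tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def percent_found(eval_tokens, source_tokens, mapping, n=3):
    """Percentage of evaluation n-tuples present in the source after normalizing."""
    source = set(ngrams(normalize(source_tokens, mapping), n))
    grams = ngrams(normalize(eval_tokens, mapping), n)
    if not grams:
        return 0.0
    hits = sum(1 for gram in grams if gram in source)
    return 100.0 * hits / len(grams)


synonyms = build_synonym_map([["run", "jog", "sprint"]])  # assumed contents
eval_tokens = "go for a run".split()
source_tokens = "go for a jog".split()

print(percent_found(eval_tokens, source_tokens, synonyms))  # 100.0
print(percent_found(eval_tokens, source_tokens, {}))        # 50.0
```

With n = 3 the evaluation text yields two tuples, (go, for, a) and (for, a, run). Only the first matches without synonym handling; both match once "run" and "jog" are normalized to the same word.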

PlagiarismDetector 0.2.2
