tkseem Python package - PyPI


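Detailed description of the tkseem Python project

tkseem (تقسيم) is a tokenization library that encapsulates different approaches for tokenization and preprocessing of Arabic text.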

Documentation

For the full documentation, please visit readthedocs.

Installation

pip install tkseem

Usage

Tokenization

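To make the train / tokenize / encode / decode workflow concrete, here is a minimal pure-Python sketch of a word-level tokenizer. It is not tkseem's implementation: the class `ToyWordTokenizer`, its parameters, and the `<UNK>` token are hypothetical names chosen for illustration, and the training text is an inline string rather than a file.

```python
from collections import Counter


class ToyWordTokenizer:
    """Hypothetical whitespace tokenizer for illustration; not tkseem's code."""

    def __init__(self, vocab_size=1000, unk_token="<UNK>"):
        self.vocab_size = vocab_size
        self.unk_token = unk_token
        self.vocab = [unk_token]
        self.token_to_id = {unk_token: 0}

    def train(self, text):
        # Keep the most frequent words; id 0 is reserved for unknown words.
        counts = Counter(text.split())
        most_common = counts.most_common(self.vocab_size - 1)
        self.vocab = [self.unk_token] + [word for word, _ in most_common]
        self.token_to_id = {tok: i for i, tok in enumerate(self.vocab)}

    def tokenize(self, text):
        # Words outside the vocabulary are mapped to the unknown token.
        return [w if w in self.token_to_id else self.unk_token
                for w in text.split()]

    def encode(self, text):
        return [self.token_to_id[tok] for tok in self.tokenize(text)]

    def decode(self, ids):
        return [self.vocab[i] for i in ids]


tokenizer = ToyWordTokenizer()
tokenizer.train("السلام عليكم ورحمة الله السلام عليكم")
tokens = tokenizer.tokenize("السلام عليكم يا صديقي")
ids = tokenizer.encode("السلام عليكم")
```

In this sketch, words seen during training keep their surface form, and any other word is replaced by the unknown token before it is mapped to an id.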

Caching

with open('data/raw/train.txt', encoding='utf-8') as f:
    tokenizer.tokenize(f.read(), use_cache=True)
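The page does not say how `use_cache` works internally. One plausible reading, offered here only as an assumption, is that per-word results are memoized so repeated words are processed once. The sketch below shows that idea with the standard library; `split_word` and `tokenize` are hypothetical helpers, not tkseem functions.

```python
from functools import lru_cache

calls = 0  # counts how often the expensive step actually runs


@lru_cache(maxsize=None)
def split_word(word):
    """Hypothetical per-word segmentation (here: split into characters)."""
    global calls
    calls += 1
    return tuple(word)


def tokenize(text):
    tokens = []
    for word in text.split():
        # Repeated words are served from the cache instead of recomputed.
        tokens.extend(split_word(word))
    return tokens


tokens = tokenize("من من من")
```

With a corpus that repeats many words, the segmentation step runs once per distinct word rather than once per occurrence.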

Save and Load

import tkseem as tk

tokenizer = tk.WordTokenizer()
tokenizer.train('samples/data.txt')

# save the model
tokenizer.save_model('vocab.pl')

# load the model
tokenizer = tk.WordTokenizer()
tokenizer.load_model('vocab.pl')
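As a rough illustration of what saving and reloading a vocabulary can look like, the sketch below round-trips a small dictionary through `pickle`. That tkseem stores its model this way is an assumption, not something the page states; the vocabulary contents and the temporary path are made up for the example.

```python
import os
import pickle
import tempfile

# Hypothetical vocabulary: token -> id.
vocab = {"<UNK>": 0, "السلام": 1, "عليكم": 2}

# Write to a temporary directory so the example leaves the working tree alone.
path = os.path.join(tempfile.mkdtemp(), "vocab.pl")

with open(path, "wb") as f:
    pickle.dump(vocab, f)

with open(path, "rb") as f:
    restored = pickle.load(f)
```

A fresh tokenizer object that loads `restored` would map tokens to the same ids as the one that saved it, which is the property save/load is meant to preserve.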

Model Agnostic

import tkseem as tk
import time
import seaborn as sns
import pandas as pd

def calc_time(fun):
    start_time = time.time()
    fun().train()
    return time.time() - start_time

running_times = {}
running_times['Word'] = calc_time(tk.WordTokenizer)
running_times['SP'] = calc_time(tk.SentencePieceTokenizer)
running_times['Random'] = calc_time(tk.RandomTokenizer)
running_times['Disjoint'] = calc_time(tk.DisjointLetterTokenizer)
running_times['Char'] = calc_time(tk.CharacterTokenizer)
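The timing helper above works because every tokenizer class exposes the same `train` method. The following self-contained sketch shows the same pattern without tkseem installed. `WhitespaceTokenizer` and `CharTokenizer` are hypothetical stand-ins, and `train` takes the text directly, which is an assumption made to keep the example runnable.

```python
import time


class WhitespaceTokenizer:
    """Hypothetical stand-in: vocabulary of whitespace-separated words."""

    def train(self, text):
        self.vocab = sorted(set(text.split()))


class CharTokenizer:
    """Hypothetical stand-in: vocabulary of single characters."""

    def train(self, text):
        self.vocab = sorted(set(text))


def calc_time(tokenizer_cls, text):
    # Any class with a train(text) method can be timed the same way.
    start = time.perf_counter()
    tokenizer_cls().train(text)
    return time.perf_counter() - start


corpus = "السلام عليكم " * 1000
running_times = {
    "Whitespace": calc_time(WhitespaceTokenizer, corpus),
    "Char": calc_time(CharTokenizer, corpus),
}
```

The resulting dictionary maps each tokenizer name to its training time in seconds, so it can be passed to a plotting or tabulation library for comparison.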

Notebooks

We show how to use tkseem to train some NLP models.

Name	Description	Notebook
Demo	Explain the syntax of all tokenizers.
Sentiment Classification	WordTokenizer for processing sentences and then train a classifier for sentiment classification.
Meter Classification	CharacterTokenizer for meter classification using bidirectional GRUs.
Translation	Seq-to-seq model with attention.


tkseem 0.0.3
