ipapy Python package - PyPI

ipapy is a Python module to work with IPA strings.

ipapy project description

ipapy is a Python module to work with International Phonetic Alphabet (IPA) strings.

Version: 0.0.9
Date: 2019-05-05
Developer: Alberto Pettarin
License: the MIT License (MIT)
Contact: click here
Links: GitHub - PyPI

Installation

$ pip install ipapy

or

$ git clone https://github.com/pettarin/ipapy.git
$ cd ipapy

Usage

As a Python module

###########
# IMPORTS #
###########
from ipapy import UNICODE_TO_IPA
from ipapy import is_valid_ipa
from ipapy.ipachar import IPAConsonant
from ipapy.ipachar import IPAVowel
from ipapy.ipastring import IPAString

###########
# IPAChar #
###########

# Def.: an IPAChar is an IPA letter or diacritic/suprasegmental/tone mark

# create IPAChar from its Unicode representation
c1 = UNICODE_TO_IPA[u"a"]                   # vowel open front unrounded
c2 = UNICODE_TO_IPA[u"e"]                   # vowel close-mid front unrounded
c3 = UNICODE_TO_IPA[u"\u03B2"]              # consonant voiced bilabial non-sibilant-fricative
tS1 = UNICODE_TO_IPA[u"t͡ʃ"]                 # consonant voiceless palato-alveolar sibilant-affricate
tS2 = UNICODE_TO_IPA[u"t͜ʃ"]                 # consonant voiceless palato-alveolar sibilant-affricate
tS3 = UNICODE_TO_IPA[u"tʃ"]                 # consonant voiceless palato-alveolar sibilant-affricate
tS4 = UNICODE_TO_IPA[u"ʧ"]                  # consonant voiceless palato-alveolar sibilant-affricate
tS5 = UNICODE_TO_IPA[u"\u0074\u0361\u0283"] # consonant voiceless palato-alveolar sibilant-affricate
tS6 = UNICODE_TO_IPA[u"\u0074\u035C\u0283"] # consonant voiceless palato-alveolar sibilant-affricate
tS7 = UNICODE_TO_IPA[u"\u0074\u0283"]       # consonant voiceless palato-alveolar sibilant-affricate
tS8 = UNICODE_TO_IPA[u"\u02A7"]             # consonant voiceless palato-alveolar sibilant-affricate
c1 == c2    # False
c1 == c3    # False
c1 == tS1   # False
tS1 == tS2  # True (they both point to the same IPAChar object)
tS1 == tS3  # True (idem)
tS1 == tS4  # True (idem)
tS1 == tS5  # True (idem)
tS1 == tS6  # True (idem)
tS1 == tS7  # True (idem)
tS1 == tS8  # True (idem)

# create custom IPAChars
my_a1 = IPAVowel(name="my_a_1", descriptors=u"open front unrounded", unicode_repr=u"a")
my_a2 = IPAVowel(name="my_a_2", descriptors=[u"open", "front", "unrounded"], unicode_repr=u"a")
my_a3 = IPAVowel(name="my_a_3", height=u"open", backness=u"front", roundness=u"unrounded", unicode_repr=u"a")
my_a4 = IPAVowel(name="my_a_4", descriptors=[u"low", u"fnt", "unr"], unicode_repr=u"a")
my_ee = IPAVowel(name="my_e_1", descriptors=u"close-mid front unrounded", unicode_repr=u"e")
my_b1 = IPAConsonant(name="bilabial fricative", descriptors=u"voiced bilabial non-sibilant-fricative", unicode_repr=u"\u03B2")
my_b2 = IPAConsonant(name="bf", voicing=u"voiced", place=u"bilabial", manner=u"non-sibilant-fricative", unicode_repr=u"\u03B2")
my_tS = IPAConsonant(name="tS", voicing=u"voiceless", place=u"palato-alveolar", manner=u"sibilant-affricate", unicode_repr=u"t͡ʃ")
my_a1 == my_a2                  # False (two different objects)
my_a1 == c1                     # False (two different objects)
my_a1 == UNICODE_TO_IPA["a"]    # False (two different objects)

# associate non-standard Unicode representation
my_aa = IPAVowel(name="a special", descriptors=[u"low", u"fnt", u"unr"], unicode_repr=u"a{*}")
print(my_aa)    # "a{*}"

# equality vs. equivalence
my_tS == tS1                # False (my_tS is a different object than tS1)
my_tS.is_equivalent(tS1)    # True  (my_tS is equivalent to tS1...)
tS1.is_equivalent(my_tS)    # True  (... and vice versa)

# compare IPAChar objects
my_a1.is_equivalent(my_a2)  # True
my_a1.is_equivalent(my_a3)  # True
my_a1.is_equivalent(my_a4)  # True
my_a1.is_equivalent(my_ee)  # False
my_a1.is_equivalent(my_b1)  # False
my_b1.is_equivalent(my_b2)  # True
my_b1.is_equivalent(my_tS)  # False

# compare IPAChar and a Unicode string
my_b1.is_equivalent(u"\u03B2")  # True
my_b1.is_equivalent(u"β")       # True
my_b1.is_equivalent(u"b")       # False
my_tS.is_equivalent(u"tS")      # False
my_tS.is_equivalent(u"tʃ")      # False (missing the combining diacritic)
my_tS.is_equivalent(u"t͡ʃ")      # True (has combining diacritic)

# compare IPAChar and a string listing descriptors
my_a1.is_equivalent(u"open front unrounded")                                # False (missing 'vowel')
my_a1.is_equivalent(u"open front unrounded vowel")                          # True
my_a1.is_equivalent(u"low fnt unr vwl")                                     # True (known abbreviations are good as well)
my_ee.is_equivalent(u"open front unrounded vowel")                          # False
my_b1.is_equivalent(u"voiced bilabial non-sibilant-fricative")              # False (missing 'consonant')
my_b1.is_equivalent(u"voiced bilabial non-sibilant-fricative consonant")    # True
my_b1.is_equivalent(u"consonant non-sibilant-fricative bilabial voiced")    # True (the order does not matter)
my_b1.is_equivalent(u"consonant non-sibilant-fricative bilabial voiceless") # False

# compare IPAChar and list of descriptors
my_a1.is_equivalent([u"open", u"front", u"unrounded"])              # False
my_a1.is_equivalent([u"vowel", u"open", u"front", u"unrounded"])    # True
my_a1.is_equivalent([u"open", u"unrounded", u"vowel", u"front"])    # True
my_a1.is_equivalent([u"low", u"fnt", u"unr", u"vwl"])               # True

#############
# IPAString #
#############

# Def.: an IPAString is a list of IPAChar objects

# check if Unicode string contains only IPA valid characters
s_uni = u"əˈkiːn æˌkænˈθɑ.lə.d͡ʒi"   # Unicode string of the IPA pronunciation for "achene acanthology"
is_valid_ipa(s_uni)                 # True
is_valid_ipa(u"LoL")                # False (uppercase letter L is not IPA valid)

# create IPAString from list of IPAChar objects
new_s_ipa = IPAString(ipa_chars=[c3, c2, tS1, c1])

# create IPAString from Unicode string
s_ipa = IPAString(unicode_string=s_uni)

# IPAString is similar to regular Python string object
print(s_ipa)                            # "əˈkiːn æˌkænˈθɑ.lə.d͡ʒi"
len(s_ipa)                              # 21
s_ipa[0]                                # (first IPA char)
s_ipa[5:8]                              # (6th, 7th, 8th IPA chars)
s_ipa[19:]                              # (IPA chars from the 20th)
s_ipa[-1]                               # (last IPA char)
len(new_s_ipa)                          # 4
new_s_ipa.append(UNICODE_TO_IPA[u"a"])  # (append IPA char "a")
len(new_s_ipa)                          # 5
new_s_ipa.append(UNICODE_TO_IPA[u"t͡ʃ"]) # (append IPA char "t͡ʃ")
len(new_s_ipa)                          # 6
new_s_ipa.extend(s_ipa)                 # (append s_ipa to new_s_ipa)
len(new_s_ipa)                          # 27
double = s_ipa + new_s_ipa              # (concatenate s_ipa and new_s_ipa)
len(double)                             # 48

# new IPAString objects containing only...
print(s_ipa.consonants)                 # "knknθld͡ʒ"                (consonants)
print(s_ipa.vowels)                     # "əiææɑəi"                 (vowels)
print(s_ipa.letters)                    # "əkinækænθɑləd͡ʒi"         (vowels and consonants)
print(s_ipa.cns_vwl)                    # "əkinækænθɑləd͡ʒi"         (vowels and consonants)
print(s_ipa.cns_vwl_pstr)               # "əˈkinækænˈθɑləd͡ʒi"       (  + primary stress marks)
print(s_ipa.cns_vwl_pstr_long)          # "əˈkiːnækænˈθɑləd͡ʒi"      (    + long marks)
print(s_ipa.cns_vwl_str)                # "əˈkinæˌkænˈθɑləd͡ʒi"      (  + stress marks)
print(s_ipa.cns_vwl_str_len)            # "əˈkiːnæˌkænˈθɑləd͡ʒi"     (    + length marks)
print(s_ipa.cns_vwl_str_len_wb)         # "əˈkiːn æˌkænˈθɑləd͡ʒi"    (      + word breaks)
print(s_ipa.cns_vwl_str_len_wb_sb)      # "əˈkiːn æˌkænˈθɑ.lə.d͡ʒi"  (        + syllable breaks)
cns = s_ipa.consonants                  # (store new IPA string)
cns == s_ipa.consonants                 # False (two different objects)
cns.is_equivalent(s_ipa.consonants)     # True
cns.is_equivalent(s_ipa)                # False

# print representation and name of all IPAChar objects in IPAString
for c in s_ipa:
    print(u"%s\t%s" % (c, c.name))
# ə vowel mid central unrounded
# ˈ suprasegmental primary-stress
# k consonant voiceless velar plosive
# i vowel close front unrounded
# ː suprasegmental long
# n consonant voiced alveolar nasal
#   suprasegmental word-break
# æ vowel near-open front unrounded
# ˌ suprasegmental secondary-stress
# k consonant voiceless velar plosive
# æ vowel near-open front unrounded
# n consonant voiced alveolar nasal
# ˈ suprasegmental primary-stress
# θ consonant voiceless dental non-sibilant-fricative
# ɑ vowel open back unrounded
# . suprasegmental syllable-break
# l consonant voiced alveolar lateral-approximant
# ə vowel mid central unrounded
# . suprasegmental syllable-break
# d͡ʒ   consonant voiced palato-alveolar sibilant-affricate
# i vowel close front unrounded

# compare IPAString objects
s_ipa_d = IPAString(unicode_string=u"diff")
s_ipa_1 = IPAString(unicode_string=u"at͡ʃe")
s_ipa_2 = IPAString(unicode_string=u"aʧe")
s_ipa_3 = IPAString(unicode_string=u"at͡ʃe", single_char_parsing=True)
s_ipa_d == s_ipa_1              # False
s_ipa_1 == s_ipa_2              # False (different objects)
s_ipa_1 == s_ipa_3              # False (different objects)
s_ipa_2 == s_ipa_3              # False (different objects)
s_ipa_d.is_equivalent(s_ipa_1)  # False
s_ipa_1.is_equivalent(s_ipa_2)  # True
s_ipa_2.is_equivalent(s_ipa_1)  # True
s_ipa_1.is_equivalent(s_ipa_3)  # True
s_ipa_2.is_equivalent(s_ipa_3)  # True

# compare IPAString and list of IPAChar objects
s_ipa_1.is_equivalent([my_a1, my_tS, my_ee])    # True

# compare IPAString and Unicode string
s_ipa_d.is_equivalent(u"diff")                  # True
s_ipa_1.is_equivalent(u"atse")                  # False
s_ipa_1.is_equivalent(u"atSe")                  # False
s_ipa_1.is_equivalent(u"at͡ʃe")                  # True
s_ipa_1.is_equivalent(u"at͜ʃe")                  # True
s_ipa_1.is_equivalent(u"aʧe")                   # True
s_ipa_1.is_equivalent(u"at͡ʃeLOL", ignore=True)  # True (ignore chars non IPA valid)
s_ipa_1.is_equivalent(u"at͡ʃeLoL", ignore=True)  # False (ignore chars non IPA valid, note extra "o")

########################
# CONVERSION FUNCTIONS #
########################
from ipapy.kirshenbaummapper import KirshenbaumMapper
kmapper = KirshenbaumMapper()                   # mapper to Kirshenbaum ASCII IPA
s_k_ipa = kmapper.map_ipa_string(s_ipa)         # u"@'ki:n#&,k&n'TA#l@#dZi"
s_k_uni = kmapper.map_unicode_string(s_uni)     # u"@'ki:n#&,k&n'TA#l@#dZi"
s_k_ipa == s_k_uni                              # True
s_k_lis = kmapper.map_unicode_string(s_uni, return_as_list=True)    # [u'@', u"'", u'k', u'i', u':', u'n', u'#', u'&', u',', u'k', u'&', u'n', u"'", u'T', u'A', u'#', u'l', u'@', u'#', u'dZ', u'i']

from ipapy.arpabetmapper import ARPABETMapper
amapper = ARPABETMapper()                                                       # mapper to ARPABET ASCII IPA (stress marks not supported yet)
s_a = amapper.map_unicode_string(u"pɹuːf")                                      # error: long suprasegmental not mapped
s_a = amapper.map_unicode_string(u"pɹuːf", ignore=True)                         # u"PRUWF"
s_a = amapper.map_unicode_string(u"pɹuːf", ignore=True, return_as_list=True)    # [u'P', u'R', u'UW', u'F']
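
The difference between == and is_equivalent() in the examples above can be illustrated with a small standalone sketch. The Char class below is hypothetical and is not ipapy's implementation. It only assumes that equivalence means "same set of descriptors, in any order", which is what the examples suggest.

```python
class Char(object):
    """Hypothetical stand-in for an IPA character (not ipapy's IPAChar)."""

    def __init__(self, name, descriptors):
        self.name = name
        # Accept either a space-separated string or a list of descriptors.
        if isinstance(descriptors, str):
            descriptors = descriptors.split()
        self.descriptors = frozenset(descriptors)

    def is_equivalent(self, other):
        """Return True if both sides carry exactly the same descriptors."""
        if isinstance(other, Char):
            return self.descriptors == other.descriptors
        if isinstance(other, str):
            return self.descriptors == frozenset(other.split())
        return self.descriptors == frozenset(other)


a1 = Char("my_a_1", "vowel open front unrounded")
a2 = Char("my_a_2", ["unrounded", "front", "open", "vowel"])

print(a1 == a2)              # False: without __eq__, == compares object identity
print(a1.is_equivalent(a2))  # True: same descriptors, order ignored
```

Keeping == as an identity check and exposing equivalence as a separate method makes the cheaper comparison the default and the descriptor comparison an explicit choice.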

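As a command line tool

ipapy comes with a command line tool to perform operations on a given Unicode UTF-8 encoded string representing an IPA string. It is therefore recommended to run it in a shell that supports UTF-8.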
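
One way to check whether the current environment is likely to print IPA characters correctly is to inspect the encoding of standard output. The helper name is_utf8 below is made up for this sketch and is not part of ipapy. It relies only on the standard library.

```python
import codecs
import sys


def is_utf8(encoding_name):
    """Return True if the given encoding name is an alias of UTF-8."""
    if not encoding_name:
        return False
    try:
        return codecs.lookup(encoding_name).name == "utf-8"
    except LookupError:
        return False


# sys.stdout.encoding may be None in some setups, hence the guard above.
if not is_utf8(sys.stdout.encoding):
    sys.stderr.write("Warning: stdout is not UTF-8; IPA output may be garbled.\n")
```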

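Currently, the supported operations are:

canonize: canonize the Unicode representation of the IPA string
chars: list all IPA characters appearing in the IPA string
check: check if the given Unicode string is IPA valid
clean: remove characters that are not IPA valid
u2a: print the corresponding ARPABET (ASCII IPA) string
u2k: print the corresponding Kirshenbaum (ASCII IPA) string

Run with the --help parameter to list all the available options: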

$ python -m ipapy --help

usage: __main__.py [-h] [-i] [-p] [--separator [SEPARATOR]] [-s] [-u] command string

ipapy perform a command on the given IPA/Unicode string

positional arguments:
  command               [canonize|chars|check|clean|u2a|u2k]
  string                String to canonize, check, clean, or convert

optional arguments:
  -h, --help            show this help message and exit
  -i, --ignore          Ignore Unicode characters that are not IPA valid
  -p, --print-invalid   Print Unicode characters that are not IPA valid
  --separator [SEPARATOR]
                        Print IPA chars separated by this character (default:
                        '')
  -s, --single-char-parsing
                        Perform single character parsing instead of maximal
                        parsing
  -u, --unicode         Print each Unicode character that is not IPA valid
                        with its Unicode codepoint and name
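
When the tool is driven from another Python script, the options listed above can be assembled into an argument list first. The function build_ipapy_argv below is a hypothetical helper and not part of ipapy. It only builds the list and does not run anything, and it places the options after the positional arguments, as the examples in this document do.

```python
import sys

COMMANDS = ("canonize", "chars", "check", "clean", "u2a", "u2k")


def build_ipapy_argv(command, string, ignore=False, print_invalid=False,
                     separator=None, single_char_parsing=False,
                     unicode_info=False):
    """Build the argument list for `python -m ipapy` without executing it."""
    if command not in COMMANDS:
        raise ValueError("unknown command: %r" % (command,))
    argv = [sys.executable, "-m", "ipapy", command, string]
    if ignore:
        argv.append("-i")
    if print_invalid:
        argv.append("-p")
    if separator is not None:
        argv.extend(["--separator", separator])
    if single_char_parsing:
        argv.append("-s")
    if unicode_info:
        argv.append("-u")
    return argv


argv = build_ipapy_argv("canonize", u"e\u02A7iu", separator=" ")
print(argv[1:])
```

If ipapy is installed, the resulting list can be handed to subprocess.run from the standard library. Passing a list rather than a single shell string avoids quoting problems with the IPA characters.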

Examples:

$ python -m ipapy canonize "eʧiu"
et͡ʃiu

$ python -m ipapy canonize "eʧiu" --separator " "
e t͡ʃ i u

$ python -m ipapy chars "eʧiu"
'e' vowel close-mid front unrounded (U+0065)
't͡ʃ'   consonant voiceless palato-alveolar sibilant-affricate (U+0074 U+0361 U+0283)
'i' vowel close front unrounded (U+0069)
'u' vowel close back rounded (U+0075)

$ python -m ipapy chars "et͡ʃiu"
'e' vowel close-mid front unrounded (U+0065)
't͡ʃ'   consonant voiceless palato-alveolar sibilant-affricate (U+0074 U+0361 U+0283)
'i' vowel close front unrounded (U+0069)
'u' vowel close back rounded (U+0075)

$ python -m ipapy chars "et͡ʃiu" -s
'e' vowel close-mid front unrounded (U+0065)
't' consonant voiceless alveolar plosive (U+0074)
'͡' diacritic tie-bar-above (U+0361)
'ʃ' consonant voiceless palato-alveolar sibilant-fricative (U+0283)
'i' vowel close front unrounded (U+0069)
'u' vowel close back rounded (U+0075)
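
The difference between maximal parsing and single character parsing in the chars examples above can be sketched with a greedy longest-match loop. This is an illustration under assumptions, not ipapy's parser: the KNOWN table is a hypothetical excerpt holding only the entries needed here, and the function name parse is made up.

```python
# Hypothetical excerpt of a table of known IPA characters.
KNOWN = {
    u"e": "vowel close-mid front unrounded",
    u"t": "consonant voiceless alveolar plosive",
    u"\u0361": "diacritic tie-bar-above",
    u"\u0283": "consonant voiceless palato-alveolar sibilant-fricative",
    u"t\u0361\u0283": "consonant voiceless palato-alveolar sibilant-affricate",
    u"i": "vowel close front unrounded",
    u"u": "vowel close back rounded",
}
MAX_KEY_LENGTH = max(len(key) for key in KNOWN)


def parse(string, single_char_parsing=False):
    """Split a string into known keys, longest match first by default."""
    longest = 1 if single_char_parsing else MAX_KEY_LENGTH
    result = []
    index = 0
    while index < len(string):
        # Try the longest candidate first, then progressively shorter ones.
        for size in range(min(longest, len(string) - index), 0, -1):
            candidate = string[index:index + size]
            if candidate in KNOWN:
                result.append(candidate)
                index += size
                break
        else:
            raise ValueError("unknown character at index %d" % index)
    return result


word = u"et\u0361\u0283iu"
print(len(parse(word)))                            # 4
print(len(parse(word, single_char_parsing=True)))  # 6
```

With maximal parsing the three code points U+0074 U+0361 U+0283 are grouped into one affricate. With single character parsing they stay separate as plosive, tie bar, and fricative.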

$ python -m ipapy check "eʧiu"
True

$ python -m ipapy check "LoL"
False

$ python -m ipapy check "LoL" -p
False
LL

$ python -m ipapy check "LoLOL" -p -u
False
LLOL
'L' 0x4c    LATIN CAPITAL LETTER L
'O' 0x4f    LATIN CAPITAL LETTER O
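
The code point and name columns printed by the -u option resemble what Python's standard unicodedata module reports. The sketch below uses only the standard library and makes no claim about how ipapy produces its output. The function name describe is hypothetical.

```python
import unicodedata


def describe(char):
    """Return (repr, hex code point, Unicode name) for a single character."""
    return (repr(char), hex(ord(char)), unicodedata.name(char, "UNKNOWN"))


for char in u"LO":
    print("%s %s %s" % describe(char))
# 'L' 0x4c LATIN CAPITAL LETTER L
# 'O' 0x4f LATIN CAPITAL LETTER O
```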

$ python -m ipapy clean "/eʧiu/"
eʧiu

$ python -m ipapy u2k "eʧiu"
etSiu

$ python -m ipapy u2k "eTa"
The given string contains characters not IPA valid. Use the 'ignore' option to ignore them.

$ python -m ipapy u2k "eTa" -i
ea
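
The behaviour of the ignore option in the u2k examples above can be sketched with a table-driven mapper. The TABLE below is a hypothetical excerpt whose pairs are read off the example outputs in this document, not taken from ipapy's data files, and to_ascii is a made-up name.

```python
# Hypothetical excerpt of a Unicode-to-Kirshenbaum table.
TABLE = {
    u"e": u"e",
    u"a": u"a",
    u"i": u"i",
    u"u": u"u",
    u"\u02A7": u"tS",  # LATIN SMALL LETTER TESH DIGRAPH
}


def to_ascii(string, ignore=False):
    """Map each character through TABLE; unknown characters raise unless ignored."""
    mapped = []
    for char in string:
        if char in TABLE:
            mapped.append(TABLE[char])
        elif not ignore:
            raise ValueError("character %r is not in the table" % (char,))
    return u"".join(mapped)


print(to_ascii(u"e\u02A7iu"))         # etSiu
print(to_ascii(u"eTa", ignore=True))  # ea
```

Failing by default and skipping unknown characters only on request keeps silent data loss an explicit choice of the caller.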

$ python -m ipapy u2a "eʧiu" --separator " "
EH CH IH UW

Unit testing

$ python run_all_unit_tests.py

License

ipapy is released under the MIT License.

Acknowledgments

Bram Vanroy provided a setup.py fix for Windows users.
