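Asynchronous wrapper for the AWS Polly API
Detailed description of the aiopolly Python project
Basically
aiopolly is an asynchronous library for the Amazon Polly API. It is written with asyncio and aiohttp and uses pydantic models.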
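Features
- Asynchronous
- Follows PEP 8 (no camelCase parameters or variables)
- Provides an easy way to work with SSML tags and lexicons
- AWS API exceptions are mapped and classified
- Audio conversion support, with a built-in asynchronous Opus converter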
Installation
$ pip install aiopolly
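Getting started
To use AWS Polly you need an AWS account, an IAM user, and that user's credentials; see the AWS instructions for how to obtain them.
You can then initialize the class in one of two ways:
Provide the access key and secret key directly: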
from aiopolly import Polly

polly = Polly(access_key='your_access_key', secret_key='your_secret_key')
Create a ~/.aws/credentials file with the following contents:
[default]
aws_access_key_id = your_access_key
aws_secret_access_key = your_secret_key
and then initialize the class without any auth parameters:
from aiopolly import Polly

polly = Polly()
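
To make the credentials file layout concrete, here is a minimal sketch that parses text in that format with the standard-library `configparser`. It only illustrates the file structure: the helper name `parse_aws_credentials` is hypothetical and is not part of aiopolly, whose own loading logic may differ.

```python
import configparser

# Same layout as the ~/.aws/credentials example above
EXAMPLE = """\
[default]
aws_access_key_id = your_access_key
aws_secret_access_key = your_secret_key
"""


def parse_aws_credentials(text, profile='default'):
    """Return (access_key, secret_key) for one profile.

    Hypothetical helper for illustration only; not part of aiopolly.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser[profile]  # raises KeyError if the profile is missing
    return section['aws_access_key_id'], section['aws_secret_access_key']


access_key, secret_key = parse_aws_credentials(EXAMPLE)
print(access_key, secret_key)
```

Each `[section]` is a named profile, so a file can hold several sets of keys alongside `[default]`.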
Examples
Many voices
import asyncio
import time

from aiopolly import Polly, types


async def main():
    time_start = time.time()

    # Initializing AWS Polly client with default output_format
    polly = Polly(output_format=types.AudioFormat.mp3)

    voices = await polly.describe_voices()

    text = 'Whatever you can do I can override it, got a million ways to synthesize it'

    # Asynchronously synthesizing text with all available voices
    synthesized = await voices.synthesize_speech(text, language_code=types.LanguageCode.en_US)

    # Asynchronously saving each synthesized audio on disk
    await asyncio.gather(
        *(speech.save_on_disc(directory='examples') for speech in synthesized)
    )

    # Counting how many characters were synthesized
    characters_synthesized = sum(speech.request_characters for speech in synthesized)

    print(f'{characters_synthesized} characters are synthesized on {len(synthesized)} speech '
          f'and saved on disc in {time.time() - time_start} seconds!')


loop = asyncio.get_event_loop()
loop.run_until_complete(main())

Managing lexicons
import asyncio

from aiopolly import Polly
from aiopolly.types import Alphabet, AudioFormat, LanguageCode, VoiceID
from aiopolly.utils.lexicon import new_lexicon, new_lexeme


async def main():
    # Creating a new Polly instance with default output format 'mp3'
    polly = Polly(output_format=AudioFormat.mp3)

    text = 'Python is a beautiful programming language which is commonly used for web backend and ML. ' \
           'It also has cool style guides listed in PEP-8, and many community libraries like aiopolly or aiogram.'

    # Creating new lexemes
    python_lexemes = [
        new_lexeme(grapheme='PEP', alias='Python Enhancement Proposals'),
        new_lexeme(grapheme='ML', alias='Machine Learning'),
        new_lexeme(grapheme='aiopolly', phoneme='eɪˈaɪoʊˈpɑli'),
        new_lexeme(grapheme='aiogram', phoneme='eɪˈaɪoʊˌgræm')
    ]

    # Creating a new lexicon with 'ipa' alphabet and 'en_US' language code
    lexicon = new_lexicon(alphabet=Alphabet.ipa, lang=LanguageCode.en_US, lexemes=python_lexemes)

    # Putting lexicon on Amazon server
    lexicon_name = 'PythonML'
    await polly.put_lexicon(lexicon_name=lexicon_name, content=lexicon)

    # Synthesizing speech with lexicon we just created
    # (we don't need to specify required param "output_format", as we using it by default)
    speech = await polly.synthesize_speech(text, voice_id=VoiceID.Matthew, lexicon_names=[lexicon_name])

    # Saving speech on disk with default name
    await speech.save_on_disc()


loop = asyncio.get_event_loop()
loop.run_until_complete(main())
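
Amazon Polly stores lexicons as Pronunciation Lexicon Specification (PLS) XML documents. As a rough sketch of the kind of document the lexicon helpers above stand for, the following builds one by hand using only the standard library. The function names `build_lexeme` and `build_lexicon` are hypothetical, and the exact output of aiopolly's `new_lexicon` is an assumption that may differ in detail.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

PLS_NAMESPACE = 'http://www.w3.org/2005/01/pronunciation-lexicon'


def build_lexeme(grapheme, alias=None, phoneme=None):
    """Render one <lexeme> with either an alias or a phoneme (hypothetical helper)."""
    if (alias is None) == (phoneme is None):
        raise ValueError('provide exactly one of alias or phoneme')
    tag, value = ('alias', alias) if alias is not None else ('phoneme', phoneme)
    return (f'  <lexeme><grapheme>{escape(grapheme)}</grapheme>'
            f'<{tag}>{escape(value)}</{tag}></lexeme>')


def build_lexicon(lexemes, alphabet='ipa', lang='en-US'):
    """Wrap rendered lexemes in a PLS <lexicon> root element (hypothetical helper)."""
    body = '\n'.join(lexemes)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<lexicon version="1.0" xmlns="{PLS_NAMESPACE}" '
        f'alphabet={quoteattr(alphabet)} xml:lang={quoteattr(lang)}>\n'
        f'{body}\n'
        '</lexicon>'
    )


document = build_lexicon([
    build_lexeme('ML', alias='Machine Learning'),
    build_lexeme('aiopolly', phoneme='eɪˈaɪoʊˈpɑli'),
])

# Parsing the result confirms that the document is well-formed XML
root = ET.fromstring(document.encode('utf-8'))
print(document)
```

An alias replaces the written form with other words to be spoken, while a phoneme spells out the pronunciation in the chosen alphabet.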
Using SSML text
There is a built-in SSML text factory that you can use to build SSML text:
import asyncio

from aiopolly import Polly
from aiopolly.types import AudioFormat, VoiceID, TextType
from aiopolly.utils.ssml import ssml_text, prosody
from aiopolly.utils.ssml.params import Volume, Pitch, Rate

super_fast = prosody(
    f'''\
Uh, sama lamaa duma lamaa you assuming I'm a human\
What I gotta do to get it through to you I'm superhuman\
Innovative and I'm made of rubber\
So that anything you say is ricocheting off of me and it'll glue to you\
I'm devastating more than ever demonstrating\
How to give a motherfuckin' audience a feeling like it's levitating\
Never fading and I know that the haters are forever waiting\
For the day that they can say I fell off they'd be celebrating\
'Cause I know the way to get 'em motivated''',
    rate=Rate.x_fast, volume=Volume.x_loud, pitch=Pitch.high
)


async def main():
    # Creating a new Polly instance with default output format 'mp3'
    polly = Polly(output_format=AudioFormat.mp3)

    text = ssml_text(super_fast)
    speech = await polly.synthesize_speech(text, voice_id=VoiceID.Matthew, text_type=TextType.ssml)
    await speech.save_on_disc(directory='speech')


loop = asyncio.get_event_loop()
loop.run_until_complete(main())
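
For orientation, the sketch below shows the kind of SSML markup such a factory produces: a `<prosody>` tag nested inside a `<speak>` root. The helpers `prosody_tag` and `speak` are hypothetical and written from scratch here; they are not aiopolly's implementation, and its actual output may differ.

```python
from xml.sax.saxutils import escape, quoteattr


def prosody_tag(text, **attrs):
    """Wrap text in an SSML <prosody> tag (hypothetical helper, not aiopolly's)."""
    rendered = ''.join(f' {name}={quoteattr(value)}' for name, value in attrs.items())
    # escape() protects &, < and > so the spoken text cannot break the markup
    return f'<prosody{rendered}>{escape(text)}</prosody>'


def speak(*fragments):
    """Join SSML fragments under the required <speak> root element."""
    return '<speak>' + ''.join(fragments) + '</speak>'


ssml = speak(prosody_tag("I'm superhuman", rate='x-fast', volume='x-loud', pitch='high'))
print(ssml)
```

SSML attribute values use hyphens (`x-fast`), which is presumably why the library exposes them as enum members such as `Rate.x_fast`.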
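Using default parameters
You can initialize the Polly client with any default parameters. They are used whenever the same parameter is left empty in an API method call.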
from aiopolly import Polly, types

polly = Polly(
    voice_id=types.VoiceID.Joanna,
    output_format=types.AudioFormat.ogg_vorbis,
    sample_rate='16000',
    speech_mark_types=['ssml'],
    text_type=types.TextType.ssml,
    language_code=types.LanguageCode.en_US,
    lexicon_names=['myLexicon', 'alsoMyLexicon'],
    output_s3_key_prefix='s3_key_prefix',
    output_s3_bucket_name='s3_bucket_name',
    include_additional_language_codes=True,
    **{'other_default_param': 'value'}
)
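
The fallback rule described above can be sketched in a few lines. The class `DefaultsClient` and its `resolve` method are hypothetical stand-ins, assuming that "left empty" means the argument is `None`; they are not how aiopolly is actually implemented.

```python
class DefaultsClient:
    """Hypothetical stand-in showing the fallback rule; not aiopolly's code."""

    def __init__(self, **defaults):
        self._defaults = defaults

    def resolve(self, **params):
        # Explicit, non-None call arguments win; everything else falls back to defaults
        explicit = {name: value for name, value in params.items() if value is not None}
        return {**self._defaults, **explicit}


client = DefaultsClient(voice_id='Joanna', output_format='ogg_vorbis')
resolved = client.resolve(voice_id='Matthew', output_format=None)
print(resolved)
```

Here `voice_id` is overridden by the call, while `output_format` falls back to the value given at construction time.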
Using the built-in OpusConverter
For this to work, you need ffmpeg and libopus installed on your system.
import asyncio

from aiopolly import Polly
from aiopolly.types import AudioFormat, TextType, VoiceID
from aiopolly.utils.converter import OpusConverter
from aiopolly.utils.ssml import ssml_text, pause, Strength


async def main():
    # Creating an instance of OpusConverter
    converter = OpusConverter(auto_convert=True, keep_original=True)

    polly = Polly(output_format=AudioFormat.mp3, converter=converter)

    text = ssml_text(f'''sendVoice
Use this method to send audio files, if you want Telegram clients to display the file as a playable voice message. For this to work, your audio must be in an {pause(Strength.none)}.ogg file encoded with OPUS (other formats may be sent as Audio or Document)''')

    # Synthesizing speech as usual, it will be converted automatically
    speech = await polly.synthesize_speech(text, voice_id=VoiceID.Matthew, text_type=TextType.ssml)

    # Saving the converted speech on disk with default name
    await speech.save_on_disc(directory='speech')
    # Saving the original (unconverted) audio as well
    await speech.save_on_disc(directory='speech', converted=False)


loop = asyncio.get_event_loop()
loop.run_until_complete(main())
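
To show why ffmpeg and libopus are needed, here is a minimal sketch of an asynchronous Opus conversion that shells out to ffmpeg through `asyncio.create_subprocess_exec`. The function names, the file-based interface, and the 64k bitrate are assumptions made for illustration; the built-in `OpusConverter` may work quite differently.

```python
import asyncio


def build_opus_command(source, target, bitrate='64k'):
    """Build an ffmpeg command that encodes `source` to Opus (hypothetical helper)."""
    # -y overwrites the target, -c:a selects the audio codec, -b:a sets the audio bitrate
    return ['ffmpeg', '-y', '-i', source, '-c:a', 'libopus', '-b:a', bitrate, target]


async def convert_to_opus(source, target):
    """Run ffmpeg without blocking the event loop; raise if ffmpeg reports failure."""
    process = await asyncio.create_subprocess_exec(
        *build_opus_command(source, target),
        stdout=asyncio.subprocess.DEVNULL,
        stderr=asyncio.subprocess.PIPE,
    )
    _, stderr = await process.communicate()
    if process.returncode != 0:
        raise RuntimeError(stderr.decode(errors='replace'))
    return target
```

With ffmpeg installed, `asyncio.run(convert_to_opus('speech.mp3', 'speech.ogg'))` would produce an Ogg file with Opus audio next to the original.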
To-do:
- Test synthesis tasks (not tested yet)
- Write tests
- Get rid of botocore (requires a built-in request signer)
- Use the converter API?
- More docs?