Cannot play IBM Watson Text to Speech audio file after synthesis

Posted 2024-05-20 08:36:50

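What I am doing is writing the audio output to a file, waiting until the file exists and its size is not 0, and then playing it. I have tried many different libraries (subprocess, playsound, pygame, vlc, and so on) and many different file types (mp3, wav, and so on), but for some reason I get an error saying the file was not closed or is corrupted. Occasionally it plays once, but as soon as it plays another Watson-generated mp3 it fails again. Does anyone know a solution?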

...
import os
import random

from playsound import playsound
from ibm_watson import TextToSpeechV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
...
authenticator = IAMAuthenticator(ibmApiKey)
textToSpeech = TextToSpeechV1(authenticator=authenticator)
textToSpeech.set_service_url(ibmServiceUrl)
...
# Write the synthesized audio to a randomly named MP3 file.
file = str(int(random.random() * 100000)) + ".mp3"
with open(file, "wb") as audioFile:
    audioFile.write(textToSpeech.synthesize(text, voice="en-GB_JamesV3Voice", accept="audio/mp3").get_result().content)

# Poll until the file exists, record whether it is non-empty, then play and delete it.
fileExists = False

while not fileExists:
    if os.path.isfile(file):
        fileExists = os.stat(file).st_size != 0
        playsound(file)
        os.remove(file)

Error output:

Error 263 for command:
        open temp/77451.mp3
    The specified device is not open or is not recognized by MCI.

    Error 263 for command:
        close temp/77451.mp3
    The specified device is not open or is not recognized by MCI.
Failed to close the file: temp/77451.mp3
Traceback (most recent call last):
  File "main.py", line 457, in <module>
    runMain(name, config.get("main", "callName"), voice);
  File "main.py", line 156, in runMain
    speak("The time is: " + datetime.now().strptime(datetime.now().time().strftime("%H:%M"), "%H:%M").strftime("%I:%M %p"), voice);
  File "main.py", line 123, in speak
    playsound(file);
  File "C:\Users\turtsis\AppData\Local\Programs\Python\Python35-32\lib\site-packages\playsound.py", line 72, in _playsoundWin
    winCommand(u'open {}'.format(sound))
  File "C:\Users\turtsis\AppData\Local\Programs\Python\Python35-32\lib\site-packages\playsound.py", line 64, in winCommand
    raise PlaysoundException(exceptionMessage)
playsound.PlaysoundException:
    Error 263 for command:
        open temp/77451.mp3
    The specified device is not open or is not recognized by MCI.

1 answer
User
Reply #1 · posted 2024-05-20 08:36:50

The bug could be in several different places.

First, I would try this:

import random

from ibm_watson import ApiException

try:
    # Request the audio and write it to a randomly named MP3 file.
    file = str(int(random.random() * 100000)) + ".mp3"
    with open(file, "wb") as audioFile:
        audioFile.write(textToSpeech.synthesize(text, voice="en-GB_JamesV3Voice", accept="audio/mp3").get_result().content)
except ApiException as ex:
    # Report the Watson error instead of letting it abort the program.
    print("Method failed with status code " + str(ex.code) + ": " + ex.message)

If the call to Watson returns an error, the unhandled exception may be what is kicking you out of the runtime.
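
To tell a Watson-side failure from a playback-side one, it can help to check that the bytes about to be written actually look like MP3 data. The helper below is a rough sketch and not part of the Watson SDK or playsound: `looks_like_mp3` is a hypothetical name, and it only inspects the first few bytes (an ID3v2 tag or an MPEG frame sync), so it is a heuristic rather than a full validation.

```python
def looks_like_mp3(data: bytes) -> bool:
    """Heuristic check: does the payload start the way an MP3 stream usually does?"""
    if len(data) < 4:
        return False
    # An ID3v2 tag begins with the ASCII bytes "ID3".
    if data[:3] == b"ID3":
        return True
    # Otherwise expect an MPEG audio frame header, which starts with 11 set bits.
    return data[0] == 0xFF and (data[1] & 0xE0) == 0xE0


if __name__ == "__main__":
    # A JSON error body is not audio.
    print(looks_like_mp3(b'{"error": "Unauthorized"}'))
    # Four bytes shaped like an MPEG frame header, padded with zeros.
    print(looks_like_mp3(b"\xff\xfb\x90\x00" + b"\x00" * 16))
```

If this returns False for the value of `get_result().content`, the problem is more likely upstream of the player than in playsound itself.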

However, if the problem is with playsound, I would suggest the following approach:

import os
import random

import pyttsx3
from ibm_watson import ApiException

engine = pyttsx3.init()
try:
    file = str(int(random.random() * 100000)) + ".mp3"
    with open(file, "wb") as audioFile:
        audioFile.write(textToSpeech.synthesize(text, voice="en-GB_JamesV3Voice", accept="audio/mp3").get_result().content)

    fileExists = False

    while not fileExists:
        if os.path.isfile(file):
            fileExists = os.stat(file).st_size != 0
            # Note: pyttsx3's say() speaks the string it is given, so this
            # speaks the file name; it does not play the MP3's contents.
            engine.say(file)
            os.remove(file)
            engine.runAndWait()

except ApiException as ex:
    print("Method failed with status code " + str(ex.code) + ": " + ex.message)
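
Two details in the snippets may be worth ruling out, although the thread does not confirm that either one causes the MCI error: the name built from `random.random()` can repeat between calls, and the traceback shows the player being handed a relative path (`temp/77451.mp3`). The sketch below uses only the standard library to write the bytes to a uniquely named file and return its absolute path. `save_audio` is a hypothetical helper name, not something from the Watson SDK or playsound.

```python
import os
import tempfile


def save_audio(content: bytes, directory: str, suffix: str = ".mp3") -> str:
    """Write content to a new, uniquely named file and return its absolute path."""
    os.makedirs(directory, exist_ok=True)
    # mkstemp creates the file itself, so two calls never share a name.
    fd, path = tempfile.mkstemp(suffix=suffix, dir=directory)
    with os.fdopen(fd, "wb") as audio_file:
        audio_file.write(content)
    # The file is flushed and closed at this point, before any player opens it.
    return os.path.abspath(path)
```

The returned path could then be passed to whichever player is being tested, and removed with `os.remove` once playback has finished.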

If neither works, I would try using curl to see whether you can reproduce your scenario:

Replace {apikey} and {url} with your API key and URL.


curl -X POST -u "apikey:{apikey}" --header "Content-Type: application/json" --data "{\"text\":\"hello world\"}" --output hello_world.ogg "{url}/v1/synthesize?voice=en-US_AllisonV3Voice"
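
For anyone who would rather reproduce the same request from Python without the SDK, here is a minimal sketch that mirrors the curl call using only the standard library. `build_synthesize_request` is a hypothetical helper name. It assumes, as the `-u "apikey:{apikey}"` flag implies, that the service accepts HTTP Basic authentication with the literal user name `apikey`. It only builds the request and sends nothing.

```python
import base64
import json
import urllib.request


def build_synthesize_request(url, apikey, text, voice="en-US_AllisonV3Voice"):
    """Build (but do not send) a POST request equivalent to the curl command."""
    body = json.dumps({"text": text}).encode("utf-8")
    # Same credentials as curl's -u "apikey:{apikey}", encoded for Basic auth.
    credentials = "apikey:{}".format(apikey).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        "{}/v1/synthesize?voice={}".format(url, voice),
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic {}".format(token),
        },
    )
```

Passing the result to `urllib.request.urlopen` and writing the response bytes to `hello_world.ogg` would correspond to curl's `--output` flag. If that file plays correctly, the service is probably not the source of the problem.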

Good luck.
