Connecting to the Watson Speech to Text API with WebSockets for real-time transcription

from ws4py.client.threadedclient import WebSocketClient
import base64, json, ssl, subprocess, threading, time

class SpeechToTextClient(WebSocketClient):
    def __init__(self):
        ws_url = "wss://stream.watsonplatform.net/speech-to-text/api/v1/recognize"

        username = "your username"
        password = "your password"
        auth_string = "%s:%s" % (username, password)
        base64string = base64.encodestring(auth_string).replace("\n", "")

        self.listening = False

        try:
            WebSocketClient.__init__(self, ws_url,
                headers=[("Authorization", "Basic %s" % base64string)])
            self.connect()
        except:
            print "Failed to open WebSocket."

    def opened(self):
        self.send('{"action": "start", "content-type": "audio/l16;rate=16000"}')
        self.stream_audio_thread = threading.Thread(target=self.stream_audio)
        self.stream_audio_thread.start()

    def received_message(self, message):
        message = json.loads(str(message))
        if "state" in message:
            if message["state"] == "listening":
                self.listening = True
        print "Message received: " + str(message)

    def stream_audio(self):
        while not self.listening:
            time.sleep(0.1)

        reccmd = ["arecord", "-f", "S16_LE", "-r", "16000", "-t", "raw"]
        p = subprocess.Popen(reccmd, stdout=subprocess.PIPE)

        while self.listening:
            data = p.stdout.read(1024)

            try:
                self.send(bytearray(data), binary=True)
            except ssl.SSLError:
                pass

        p.kill()

    def close(self):
        self.listening = False
        self.stream_audio_thread.join()
        WebSocketClient.close(self)

try:
    stt_client = SpeechToTextClient()
    raw_input()
finally:
    stt_client.close()
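The code above is written for Python 2 (`print` statements, `raw_input`, `base64.encodestring`). As a minimal sketch, assuming you run it on Python 3 instead, the Basic auth header could be built roughly as follows. `basic_auth_header` is a hypothetical helper name used only for illustration; it is not part of ws4py or any Watson library.

```python
import base64


def basic_auth_header(username, password):
    """Return an ("Authorization", "Basic ...") tuple for HTTP Basic auth.

    Hypothetical helper for illustration only.
    """
    # b64encode operates on bytes, so encode the credentials first.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    # Decode back to str so it can be concatenated into the header value.
    return ("Authorization", "Basic " + token.decode("ascii"))


print(basic_auth_header("your username", "your password"))
```

The returned tuple has the same shape as the entry passed in the `headers=[...]` list in the question's code.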

2 answers

User

Answer 1 · edited 2024-06-26 04:04:33

For some good examples of how to do this with R, check out these excellent blog posts by Ryan Anderson.

Voice Controlled Music Machine
Python as a tool to help with Continuous Audio: this shows how to use R for the main logic and Python to handle the audio.

Ryan has done a lot of work with R and the Watson APIs, and he shares much of what he has learned on his blog.

User

Answer 2 · edited 2024-06-26 04:04:33

I'm not sure this answer is exactly what you are looking for, but it sounds like a problem with the continuous parameter.

As you can see, there is a Python SDK library in the Watson Developer Cloud.

You can install it with: pip install watson-developer-cloud

import json
from os.path import join, dirname
from watson_developer_cloud import SpeechToTextV1

speech_to_text = SpeechToTextV1(
    username='YOUR SERVICE USERNAME',
    password='YOUR SERVICE PASSWORD',
    x_watson_learning_opt_out=False
)

print(json.dumps(speech_to_text.models(), indent=2))

print(json.dumps(speech_to_text.get_model('en-US_BroadbandModel'), indent=2))

with open(join(dirname(__file__), '../resources/speech.wav'),
          'rb') as audio_file:
    # The recognize call must run inside the with block, while the file is open.
    data = json.dumps(
        speech_to_text.recognize(
            audio_file,
            content_type='audio/wav',
            timestamps=False,
            word_confidence=False,
            continuous=True),
        indent=2)
    print(data)

Note: the service returns an array of results, one per utterance.
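To illustrate working with that array, here is a rough sketch that collects the top transcript for each utterance. It assumes the response is a dict with a `results` list whose entries each hold an `alternatives` list with a `transcript` field. `extract_transcripts` is a hypothetical helper, and the sample data is hand-written for illustration, not real service output.

```python
def extract_transcripts(response):
    """Return the top transcript of each entry in response["results"].

    Hypothetical helper; assumes the response layout described above.
    """
    transcripts = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            # The first alternative is taken as the preferred hypothesis.
            transcripts.append(alternatives[0]["transcript"].strip())
    return transcripts


# Hand-written sample shaped like a recognize response (not real output).
sample = {
    "results": [
        {"alternatives": [{"transcript": "hello world "}], "final": True},
        {"alternatives": [{"transcript": "how are you "}], "final": True},
    ]
}

print(extract_transcripts(sample))
```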

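The line linked as #L44 shows the parameters you can use. For continuous transcription, you need to pass the continuous parameter and set it to true, as in the example above.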

See the Official Documentation on WebSockets for how to keep the connection alive. (Maybe this is what you need.)
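As a sketch of what that might look like on the WebSocket side, the start message sent in `opened()` in the question could carry extra options. The `continuous` key mirrors the parameter discussed above. `interim_results` and `inactivity_timeout` are assumptions about what the WebSocket interface accepts, so check them against the official documentation before relying on them. `build_start_message` and `STOP_MESSAGE` are hypothetical names.

```python
import json


def build_start_message(content_type="audio/l16;rate=16000",
                        continuous=True,
                        interim_results=True,
                        inactivity_timeout=-1):
    """Build the JSON text of a "start" action message.

    Hypothetical helper; the option names other than "action" and
    "content-type" are assumptions, not confirmed by the question.
    """
    return json.dumps({
        "action": "start",
        "content-type": content_type,
        # Assumed: keep transcribing across pauses instead of stopping.
        "continuous": continuous,
        # Assumed: ask for partial hypotheses while audio is streaming.
        "interim_results": interim_results,
        # Assumed: -1 disables the silence timeout.
        "inactivity_timeout": inactivity_timeout,
    })


# Assumed counterpart message signalling the end of the audio stream.
STOP_MESSAGE = json.dumps({"action": "stop"})

print(build_start_message())
print(STOP_MESSAGE)
```

In the question's client, the string returned by `build_start_message()` would take the place of the hard-coded JSON passed to `self.send(...)` in `opened()`.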
