I am using the Google API for speech recognition.
I use 2.5-second audio samples. Below is an example of the output in which the confidence is omitted:
{u'alternative': [{u'transcript': u'if Carol comes tomorrow have a'}, {u'transcript': u'if Carroll comes tomorrow never'}, {u'transcript': u'if Carroll comes tomorrow have a'}, {u'transcript': u'if Carole comes tomorrow have a'}, {u'transcript': u'if care comes tomorrow have a'}, {u'transcript': u'if Carroll comes tomorrow however'}, {u'transcript': u'if girl comes tomorrow have a'}, {u'transcript': u'is Carroll comes tomorrow have a'}, {u'transcript': u'if call comes tomorrow have a'}, {u'transcript': u'Carol comes tomorrow have a'}, {u'transcript': u'if kevin comes tomorrow have a'}, {u'transcript': u'if Carroll comes tomorrow have'}, {u'transcript': u'if korea comes tomorrow have a'}, {u'transcript': u'if Carroll come tomorrow have a'}, {u'transcript': u'if cry comes tomorrow have a'}], u'final': True}
The original sample is cut off at the end, but it clearly says: "If Carol comes tomorrow, have ..."
For another sentence, a confidence value is returned, but only for the first alternative:
{u'alternative': [{u'confidence': 0.91297865, u'transcript': u'by that time perhaps something better can'}, {u'transcript': u'by that time perhaps something better came'}, {u'transcript': u'by that time perhaps something better Kim'}, {u'transcript': u'but that time perhaps something better can'}, {u'transcript': u'by that time perhaps something better come'}], u'final': True}
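Here the sentence is: "By that time, perhaps there can be something better." So the first transcription is quite accurate.
Just in case, here is how I run the evaluation in Python: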
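import speech_recognition as sr
from scipy.io import wavfile

r = sr.Recognizer()
# target0_path: path to the WAV sample (defined elsewhere)
with sr.WavFile(target0_path) as source:  # named sr.AudioFile in newer releases
    audio = r.record(source)
# show_all=True returns the full response with every alternative
result = r.recognize_google(audio, key=None, language="en-US", show_all=True)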
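In the meantime, the missing values can at least be handled defensively on the client side. The following is only a minimal sketch, not a fix for the API behaviour: the helper name `summarize_alternatives` is hypothetical, the two dictionaries are shortened copies of the outputs shown above, and the empty-list case assumes that `recognize_google` with `show_all=True` can return an empty list when nothing is recognised.

```python
def summarize_alternatives(response):
    """Return (transcript, confidence) pairs from a show_all=True response.

    confidence is None whenever the 'confidence' key is absent.
    summarize_alternatives is a hypothetical helper, not part of speech_recognition.
    """
    # Assumption: an empty list (not a dict) may come back when nothing is recognised.
    if not isinstance(response, dict):
        return []
    return [
        (alt.get("transcript", ""), alt.get("confidence"))
        for alt in response.get("alternative", [])
    ]


# Shortened copies of the two responses shown above.
with_conf = {
    "alternative": [
        {"confidence": 0.91297865,
         "transcript": "by that time perhaps something better can"},
        {"transcript": "by that time perhaps something better came"},
    ],
    "final": True,
}
without_conf = {
    "alternative": [
        {"transcript": "if Carol comes tomorrow have a"},
        {"transcript": "if Carroll comes tomorrow never"},
    ],
    "final": True,
}

for label, response in (("with", with_conf), ("without", without_conf)):
    transcript, confidence = summarize_alternatives(response)[0]
    print(label, repr(transcript), confidence)
```

Using `dict.get` keeps the code from raising `KeyError` on samples such as the first one, where no alternative carries a confidence value.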
Do you have any ideas or suggestions? Are there any special settings that can be used to avoid this problem?