I built a very simple voice chat with PyAudio, but the sound is a bit meh — you usually hear noise like in an old movie. This is probably caused by lost voice chunks that I send over UDP. Is it possible to reduce the noise? Also, I want to play a sound when the user moves onto a button, but for some reason it is impossible to merge the two tracks (the sound effect and the voice)!
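A common way to reduce this kind of crackling is a small jitter buffer: hold a few chunks before starting playback, so that one late or lost datagram does not produce an audible gap. A minimal sketch (the class and all names here are my own illustration, not part of the original code):

```python
from collections import deque

CHUNK_BYTES = 4096  # 1024 stereo int16 frames, matching the stream below


class JitterBuffer:
    """Hold `prefill` chunks before playback starts; on underrun,
    repeat the last chunk (a crude form of packet-loss concealment)."""

    def __init__(self, prefill=3):
        self.prefill = prefill
        self.queue = deque()
        self.started = False
        self.last = b"\x00" * CHUNK_BYTES  # one chunk of silence

    def push(self, chunk):
        # called by the UDP receiver when a datagram arrives
        self.queue.append(chunk)

    def pop(self):
        # called by the audio thread once per chunk
        if not self.started:
            if len(self.queue) < self.prefill:
                return self.last  # still buffering: output silence
            self.started = True
        if self.queue:
            self.last = self.queue.popleft()
        return self.last  # on underrun, repeat the last chunk
```

This trades a little latency (prefill × chunk duration) for smoother playback.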
Here is the most important class, Sound. It runs in a thread, so it can run in an endless loop.
import threading
import time
import wave  # the button effects are played from wav files

import numpy as np
import pyaudio


# NOTE: the original base class was not shown; threading.Thread is assumed
# here so that run() can loop forever in its own thread.
class Sound(threading.Thread):
    WIDTH = 2
    CHANNELS = 2
    RATE = 44100

    def __init__(self, parent=None):
        super().__init__(daemon=True)
        self.parent = parent
        self.voiceStreams = []
        self.effectStreams = []
        self.vVolume = 1
        self.eVolume = 0.5
        self.voip = None
        self.p = pyaudio.PyAudio()
        self.stream = self.p.open(format=self.p.get_format_from_width(Sound.WIDTH),
                                  channels=Sound.CHANNELS,
                                  rate=Sound.RATE,
                                  input=True,
                                  output=True,
                                  # stream_callback=self.callback
                                  )
        self.nextSample = b""
        self.lastSample = b""
        self.stream.start_stream()

    def run(self):
        while True:
            self.myCallback()

    def myCallback(self):
        _time = time.perf_counter()  # time.clock() was removed in Python 3.8
        if self.nextSample:
            self.stream.write(self.nextSample)
            self.lastSample = self.nextSample
        elif self.lastSample:
            # Crazy idea: when there is no data (because UDP did not deliver
            # it), play the last chunk again so nobody hears the short gap.
            self.stream.write(self.lastSample)
            self.lastSample = b""
        _time = time.perf_counter()
        # print("{0:d} ---- {1:d} --- timeWrite: {2:.1f}".format(len(self.voiceStreams), self.stream.get_read_available(), (time.perf_counter() - _time) * 1000), end=" ")
        if self.stream.get_read_available() > 1023:
            mic = self.stream.read(1024)
        else:
            mic = b""
        # print("timeRead: {0:.1f}".format((time.perf_counter() - _time) * 1000), end=" ")
        if mic and self.voip:
            self.voip.sendDatagram(mic)  # sends the chunk of sound to my UDP client
        _time = time.perf_counter()
        data = np.zeros(2048, np.int64)
        length = len(self.voiceStreams)  # read the voice data
        l1 = length
        for i in range(length):
            s = self.voiceStreams.pop(0)
            # Merge multiple voices with numpy; each voice is attenuated by
            # the number of voices so the mix does not clip.
            data += (s / length * self.vVolume * 0.4).astype(np.int64)
        length = len(self.effectStreams)
        toPop = []  # indexes of effects that have finished playing
        for i in range(length):
            s = self.effectStreams[i].readframes(1024)
            if not s:  # no more data to play (readframes returns bytes)
                toPop.append(i - len(toPop))
            else:
                d = np.frombuffer(s, np.int16)
                # Sadly, the arrays must have the same length, so when I reach
                # the end of a track whose final chunk has only e.g. 1500
                # frames, I must throw it away: numpy will not let me add it
                # to an array of length 2048.
                if len(d) > 2047:  # again merge with numpy, attenuated
                    data += (d / length * self.eVolume * 0.3).astype(np.int64)
        for i in toPop:  # delete effects that reached the end of the track
            del self.effectStreams[i]
        if np.any(data):  # if there is any data to play
            # prepare the next chunk (should be about 20 ms, but I am not sure)
            self.nextSample = data.astype(np.int16).tobytes()
        else:
            self.nextSample = b""
        # print("timeRest: {0:.1f}".format((time.perf_counter() - _time) * 1000), end=" || ")
        print("HOW MANY CHUNKS OF VOICE I GOT:", l1)
        # It is weird that when I print the read/write times, it usually
        # prints something like: 20 ms, 20 ms, 30 ms, 20 ms, 20 ms, 30 ms, ...

    def close(self):
        # self.timer.stop()  # no timer attribute exists in this version
        self.stream.stop_stream()
        self.stream.close()
        self.p.terminate()
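The length problem noted in the code above (a final effect chunk of, say, 1500 frames being thrown away) can be avoided by zero-padding the short chunk to the mix length instead of discarding it. A sketch, with a hypothetical helper name:

```python
import numpy as np


def add_padded(mix, chunk_bytes, gain, frames=2048):
    """Add a raw int16 chunk into the int64 `mix` array, zero-padding a
    short final chunk with silence instead of throwing it away."""
    d = np.frombuffer(chunk_bytes, dtype=np.int16).astype(np.int64)
    if len(d) < frames:
        # pad the tail of the track with silence up to the mix length
        d = np.concatenate([d, np.zeros(frames - len(d), np.int64)])
    return mix + (d * gain).astype(np.int64)
```

With this, the end of every effect track is played instead of being cut off.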
The UDP server and client are very simple (they work fine, so I am not posting them here). The client just sends all its data to the server, and the server forwards any data it receives to all clients. I do not tell anyone who sent the data. This means that if data arrives too late, I will play two chunks from a single client at the same time (because I assume they came from multiple clients)!
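That overlap could be avoided by tagging each datagram with a small sender id, so the receiver can keep one queue per speaker and play each client's chunks in order. A sketch (the 2-byte header format is my assumption, not part of the original protocol):

```python
import struct

HEADER = ">H"  # 2-byte big-endian client id (assumed framing)


def pack_voice(client_id, chunk):
    # the client tags its chunk before sending it over UDP
    return struct.pack(HEADER, client_id) + chunk


def unpack_voice(datagram):
    # the receiver recovers the sender id and the raw PCM payload,
    # so it can keep a separate stream per client
    (client_id,) = struct.unpack(HEADER, datagram[:2])
    return client_id, datagram[2:]
```

The server forwards datagrams unchanged; only the clients need to agree on the header.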
Here are the wav files: Dropbox repository. I did not create them; I downloaded them from http://www.freesound.org/people/ERH/sounds/31135/ and they are licensed under Attribution.
I also added an output.txt to the Dropbox folder; it shows what Python printed when this example ran between two people (I was receiving voice data from only one user).
Thanks for any advice.