How do I use scipy.signal.resample to downsample a speech signal from 44100 Hz to 8000 Hz?

Posted 2024-10-09 20:27:19


fs, s = wav.read('wave.wav')

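This signal has a sampling rate of 44100 Hz, and I want to call scipy.signal.resample(s, s.size/5.525), but the second argument cannot be a float. How can this function be used to resample the speech signal?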



2 answers

OK then, here is another solution, this time a real scipy one. Exactly what was asked for.

This is the docstring of scipy.signal.resample():

"""
Resample `x` to `num` samples using Fourier method along the given axis.

The resampled signal starts at the same value as `x` but is sampled
with a spacing of ``len(x) / num * (spacing of x)``.  Because a
Fourier method is used, the signal is assumed to be periodic.

Parameters
----------
x : array_like
    The data to be resampled.
num : int
    The number of samples in the resampled signal.
t : array_like, optional
    If `t` is given, it is assumed to be the sample positions
    associated with the signal data in `x`.
axis : int, optional
    The axis of `x` that is resampled.  Default is 0.
window : array_like, callable, string, float, or tuple, optional
    Specifies the window applied to the signal in the Fourier
    domain.  See below for details.

Returns
-------
resampled_x or (resampled_x, resampled_t)
    Either the resampled array, or, if `t` was given, a tuple
    containing the resampled array and the corresponding resampled
    positions.

Notes
-----
The argument `window` controls a Fourier-domain window that tapers
the Fourier spectrum before zero-padding to alleviate ringing in
the resampled values for sampled signals you didn't intend to be
interpreted as band-limited.

If `window` is a function, then it is called with a vector of inputs
indicating the frequency bins (i.e. fftfreq(x.shape[axis]) ).

If `window` is an array of the same length as `x.shape[axis]` it is
assumed to be the window to be applied directly in the Fourier
domain (with dc and low-frequency first).

For any other type of `window`, the function `scipy.signal.get_window`
is called to generate the window.

The first sample of the returned vector is the same as the first
sample of the input vector.  The spacing between samples is changed
from dx to:

    dx * len(x) / num

If `t` is not None, then it represents the old sample positions,
and the new sample positions will be returned as well as the new
samples.

"""
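
As a small illustration of the `t` parameter and the spacing rule `dx * len(x) / num` from the docstring, here is a minimal sketch. The 440 Hz test tone and the variable names are assumptions made for the example, not part of the question.

```python
import numpy as np
from scipy import signal

fs_in, fs_out = 44100, 8000          # rates taken from the question
t = np.arange(fs_in) / fs_in         # one second of sample positions (assumed test data)
x = np.sin(2 * np.pi * 440 * t)      # synthetic 440 Hz tone standing in for speech

# With `t` given, resample returns a tuple: (resampled_x, resampled_t)
y, t_new = signal.resample(x, fs_out, t=t)

dx_new = t_new[1] - t_new[0]         # new spacing, expected to be dx * len(x) / num
print(len(y), dx_new)
```

The new spacing should come out as 1/8000 of a second, which is what an 8000 Hz signal needs.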

Keep in mind that 8000 Hz means one second of signal contains 8000 samples, while 44100 Hz means one second contains 44100 samples.

So just work out how many samples the signal needs at 8000 Hz and pass that number, as an integer, as the second argument to scipy.signal.resample().
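
That calculation can be wrapped in a small helper. This is only a sketch: `resample_to_rate` is a hypothetical name (not a scipy function), and random noise stands in for the real recording.

```python
import numpy as np
from scipy import signal

def resample_to_rate(x, fs_in, fs_out):
    """Resample x from fs_in Hz to fs_out Hz (hypothetical helper, not part of scipy)."""
    # Same duration at the new rate; `num` must be an int, so round and cast
    num = int(round(len(x) * fs_out / fs_in))
    return signal.resample(x, num)

# Two seconds of noise at 44100 Hz as placeholder data
s = np.random.default_rng(0).standard_normal(2 * 44100)
y = resample_to_rate(s, 44100, 8000)
print(len(s), "->", len(y))
```

Rounding before the cast avoids losing a sample when the float product lands just below a whole number.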

You can use the approach Nathan Whitehead took in a resampling function, which I also used in another answer (with scaling),

or go by duration, for example:

import scipy.signal

secs = len(X) / 44100.0          # Number of seconds in signal X
samps = int(round(secs * 8000))  # Number of samples after downsampling (must be an int)
Y = scipy.signal.resample(X, samps)

This is what I picked out of the SWMixer module written by Nathan Whitehead:

import numpy

def resample(smp, scale=1.0):
    """Resample a sound to be a different length
    Sample must be mono.  May take some time for longer sounds
    sampled at 44100 Hz.

    Keyword arguments:
    scale - scale factor for length of sound (2.0 means double length)
    """
    # f*ing cool, numpy can do this with one command
    # calculate new length of sample
    n = int(round(len(smp) * scale))
    # use linear interpolation
    # endpoint keyword means that linspace doesn't go all the way to 1.0
    # If it did, there are some off-by-one errors
    # e.g. scale=2.0, [1,2,3] should go to [1,1.5,2,2.5,3,3]
    # but with endpoint=True, we get [1,1.4,1.8,2.2,2.6,3]
    # Both are OK, but since resampling will often involve
    # exact ratios (i.e. for 44100 to 22050 or vice versa)
    # using endpoint=False gets less noise in the resampled sound
    return numpy.interp(
        numpy.linspace(0.0, 1.0, n, endpoint=False), # where to interpolate
        numpy.linspace(0.0, 1.0, len(smp), endpoint=False), # known positions
        smp, # known data points
        )

So, if you are using scipy, you also have numpy. If scipy is not a "must", use this; it works perfectly.
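
For the 44100 Hz to 8000 Hz case, the scale factor would be 8000/44100. Below is a minimal usage sketch with a condensed copy of the function above; `resample_linear` is a name chosen here for illustration, and the sine wave is placeholder data. Note that linear interpolation applies no low-pass filter, so content above 4000 Hz can alias when downsampling this way.

```python
import numpy as np

def resample_linear(smp, scale=1.0):
    """Condensed copy of the interpolation-based resample shown above (mono input)."""
    n = int(round(len(smp) * scale))
    return np.interp(
        np.linspace(0.0, 1.0, n, endpoint=False),         # where to interpolate
        np.linspace(0.0, 1.0, len(smp), endpoint=False),  # known positions
        smp,                                              # known data points
    )

# One second of a 440 Hz tone at 44100 Hz (placeholder for the speech signal)
s = np.sin(2 * np.pi * 440 * np.arange(44100) / 44100)
y = resample_linear(s, scale=8000 / 44100)
print(len(s), "->", len(y))

# The small case described in the function's comments
doubled = resample_linear(np.array([1.0, 2.0, 3.0]), scale=2.0)
print(doubled)
```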
