Different FFT output lengths for audio clips of the same length

import librosa as lb
import numpy as np

def extractFFT(audioArr):
    fourierArr = []
    for x in range(len(audioArr)):
        y, sr = lb.load(audioArr[x])   # y: samples, sr: sample rate
        fourier = np.fft.fft(y)        # output has the same length as y
        fourier = fourier.real
        fourierArr.append(fourier)
    return fourierArr

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def LDA(frequencyArr):
    splitMark = int(len(frequencyArr) * 0.8)
    trainingData = frequencyArr[:splitMark]
    validationData = frequencyArr[splitMark:]
    labels = [1, 1, 2, 2]
    lda = LinearDiscriminantAnalysis()
    lda.fit(trainingData, labels[:splitMark])
    print(f"prediction: {lda.predict(validationData)}")

1 Answer

User

#1 · Posted on 2024-10-04 05:25:25

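First of all: don't take only the real part of the transform result. It does you no good, because it discards the phase-dependent imaginary part. Use the power (r^2 + i^2) or the magnitude (sqrt(power)) to get the signal strength of each frequency bin.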
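To illustrate the point about the real part, here is a minimal sketch using a synthetic 50 Hz sine instead of a loaded audio file. The helper name `magnitude_and_power` and the test tone are assumptions made for this example, not part of the original code.

```python
import numpy as np

def magnitude_and_power(signal):
    """Return (magnitude, power) for every frequency bin of a 1-D signal."""
    spectrum = np.fft.fft(signal)                    # complex spectrum
    power = spectrum.real ** 2 + spectrum.imag ** 2  # r^2 + i^2
    magnitude = np.sqrt(power)                       # same as np.abs(spectrum)
    return magnitude, power

# Synthetic stand-in for an audio clip: exactly 50 cycles of a sine wave
sr = 1000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 50 * t)

magnitude, power = magnitude_and_power(tone)
real_only = np.fft.fft(tone).real

# The 50 Hz bin is strong in the magnitude but almost zero in the real part
print(magnitude[50], real_only[50])
```

For a pure sine, nearly all of the energy ends up in the imaginary part, so taking `.real` alone would hide the tone completely.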

Does this have something to do with the audio clips? After the transform, some audio clips are of equal lengths, others are not. If someone could explain why these same length audio clips can return different length FFT's, that would be great!

They simply do not have the same length. I bet your clips don't contain exactly the same number of samples.

If you add print('sample count = {}'.format(len(y))) right after y, sr = lb.load(audioArr[x]), you will most likely see different values (as you said yourself).
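The reason is that np.fft.fft returns one output bin per input sample by default. A small sketch with two synthetic arrays (stand-ins for clips that differ by a few samples; the lengths are made up for illustration) shows this:

```python
import numpy as np

# Two synthetic "clips" whose sample counts differ slightly
clip_a = np.random.default_rng(0).standard_normal(22050)
clip_b = np.random.default_rng(1).standard_normal(22047)

# Without an explicit n argument, the FFT output is as long as its input
fft_lengths = [len(np.fft.fft(clip)) for clip in (clip_a, clip_b)]
print(fft_lengths)  # [22050, 22047]
```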

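As you already pointed out, you can of course simply cut all signals at min(len(y)) and then feed them to the FFT. But usually what you want is a discrete STFT, which uses a fixed window size. This ensures that every input to the FFT has the same length. You can use librosa's implementation as an easy starting point. Its documentation also explains how to obtain the magnitude or power.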

So instead of:

y, sr = lb.load(audioArr[x])
fourier = np.fft.fft(y)
fourier = fourier.real
fourierArr.append(fourier)

you want:

y, sr = lb.load(audioArr[x])
# get the magnitudes; lb.stft returns shape (1 + n_fft/2, n_frames)
D = np.abs(lb.stft(y, n_fft=4096))  # use 4096 as window length
fourierArr.append(D[:, 0])          # only use the first frame of the STFT (2049 bins)

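Essentially, if you apply the Fourier transform to inputs of different lengths, you get outputs of different lengths, which LDA will not accept as training data. So you have to make sure your inputs have the same length. The easiest way is to use the STFT (or to simply cut all inputs to the minimum length). IMO there is nothing unclean about this, and it will not affect the results much if you lose a few samples.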
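The "cut everything to the minimum length" option could look roughly like the sketch below. It uses synthetic arrays in place of loaded audio, and the helper name `fixed_length_spectra` is hypothetical. Note that it swaps in np.fft.rfft, which for real input keeps only the non-negative frequency bins, rather than the np.fft.fft call from the question.

```python
import numpy as np

def fixed_length_spectra(signals):
    """Trim every signal to the shortest one and return magnitude spectra as a 2-D array."""
    min_len = min(len(s) for s in signals)
    trimmed = np.stack([np.asarray(s)[:min_len] for s in signals])
    # One row per clip, one column per frequency bin
    return np.abs(np.fft.rfft(trimmed, axis=1))

# Synthetic stand-ins for clips that differ by a few samples
rng = np.random.default_rng(0)
signals = [rng.standard_normal(n) for n in (22050, 22047, 22051, 22049)]

features = fixed_length_spectra(signals)
print(features.shape)  # (4, 11024)
```

Because every row now has the same number of columns, the result can be sliced into training and validation sets and passed to a classifier that expects a rectangular feature matrix.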
