Similarity index between two spectrograms

Posted 2024-10-03 09:08:23


I want to compute a similarity index between two 2D arrays that represent the voices of two singers. I read my mp3 files with pydub.

AudioFunctions.py

import numpy as np
from pydub import AudioSegment
from scipy.fft import fft

class SongData():
    def __init__(self, path):
        # Decode the file and downmix to mono so there is one sample stream.
        self.audio = AudioSegment.from_file(path).set_channels(1)
        self.rate, self.data = self.audio.frame_rate, np.array(self.audio.get_array_of_samples())
        self.length = len(self.data)
        self.duration = self.audio.duration_seconds
        self.time = np.linspace(0, self.duration, self.length)
        # Frequency axis for the positive half of the spectrum (0 .. rate/2).
        self.freq = np.linspace(0, self.rate / 2, self.length // 2)
        self.fftArray = fft(self.data)
        self.fftArrayPositive = self.fftArray[:self.length // 2]
        self.fftArrayNegative = np.flip(self.fftArray[self.length // 2:])
        self.fftArrayAbs = np.abs(self.fftArray)
        # Magnitudes of the positive frequencies only, for plotting.
        self.fftPlotting = self.fftArrayAbs[: self.length // 2]


def song2data(path):
    songClass = SongData(path)
    return songClass


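def getFirstData(songArr, time, rate=44100):
    # Keep the first `time` seconds; `rate` defaults to 44100 Hz, so pass
    # the file's real sample rate if it differs.
    selectedData = songArr[:int(time * rate)]
    return selectedData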

This is my code for getting the data of the two songs and their spectrograms.

main.py

from AudioFunctions import *
from scipy import signal
import matplotlib.pyplot as plt
import librosa
import sklearn
from scipy import spatial
from sklearn.metrics.pairwise import cosine_similarity



songClass1 = song2data("sia1.mp3")
songClass2 = song2data("sia2.mp3")

# print(songClass.data)
# print(songClass.rate)
# print(songClass.duration)
# print(songClass.length)

songArray = getFirstData(songClass1.data, 120)
songArray2 = getFirstData(songClass2.data, 120)


# Use each file's own sample rate rather than assuming 44100 Hz.
frequencies, times, spectrogram = signal.spectrogram(songArray, songClass1.rate)
frequencies2, times2, spectrogram2 = signal.spectrogram(songArray2, songClass2.rate)

# print(frequencies)
spec = spectrogram.flatten()
spec2 = spectrogram2.flatten()


result = 1 - spatial.distance.cosine(spec, spec2)
print(result)
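
One practical detail about the flattening step above: `spatial.distance.cosine` needs both vectors to have the same length, so if either song is shorter than the requested 120 seconds the two flattened spectrograms will not match. A minimal sketch of one way to guard against that, assuming both spectrograms share the same frequency bins; the helper name `crop_to_common_frames` is hypothetical, and the arrays below are stand-ins rather than real audio:

```python
import numpy as np

def crop_to_common_frames(spec_a, spec_b):
    """Trim two (frequency x time) arrays to the same number of time frames."""
    if spec_a.shape[0] != spec_b.shape[0]:
        raise ValueError("spectrograms must have the same number of frequency bins")
    n_frames = min(spec_a.shape[1], spec_b.shape[1])
    return spec_a[:, :n_frames], spec_b[:, :n_frames]

# Stand-in arrays shaped like spectrograms with different durations.
a = np.ones((129, 500))
b = np.ones((129, 420))

a_c, b_c = crop_to_common_frames(a, b)
print(a_c.shape, b_c.shape)
```

After cropping, both arrays flatten to the same length and can be passed to the cosine distance.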

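The printed result is meant to be the similarity index between the two voices. However, when comparing two songs by the same singer (Sia), I get a very low number (0.133).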
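
A low score here may partly come from how the comparison is set up rather than from the voices themselves. Cosine similarity on flattened spectrograms compares the arrays element by element, that is, the same frequency bin at the same time frame, so two different songs are compared note against unrelated note. The sketch below illustrates this with a toy array instead of real audio; the function name `cosine_similarity_flat` and the toy data are assumptions made for the illustration, and averaging over time is shown only as one possible alignment-independent summary, not as a recommended voice-similarity measure.

```python
import numpy as np

def cosine_similarity_flat(a, b):
    """Cosine similarity of two equally shaped arrays after flattening."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy "spectrogram": 4 frequency bins (rows) x 6 time frames (columns).
# All energy sits in frequency bin 1 during frames 0-2.
toy = np.zeros((4, 6))
toy[1, 0:3] = 1.0

# The same content delayed by three frames.
shifted = np.roll(toy, 3, axis=1)

same = cosine_similarity_flat(toy, toy)
flat_shifted = cosine_similarity_flat(toy, shifted)

# Averaging each frequency bin over time removes the dependence on timing.
avg_shifted = cosine_similarity_flat(toy.mean(axis=1), shifted.mean(axis=1))

print(same, flat_shifted, avg_shifted)
```

In this toy case the flattened comparison of the delayed copy drops to 0 even though the spectral content is identical, while the time-averaged comparison stays at 1.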

Song 1: https://drive.google.com/file/d/1svV0Ry_lNaEA9Z8c61t3S25XCSgIv6sW/view

Song 2: https://drive.google.com/file/d/1ToKQo2MERBbxZezqDcEtEE2dhgmH-wus/view

Is there something wrong with my logic? Or is this result reasonable in some cases? Thanks in advance.

