How do I combine spectrogram images with their human-labeled data so they can be processed by a CNN in Python?

Posted 2024-10-02 22:37:29


I am working on my final project at university: estimating pitch from a song using a CNN.

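The input to the CNN is a spectrogram of a song, generated with plt.specgram(), with a size of 334 x 217. The songs come from the MIR-QBSH dataset, which has the following specifications: 8-second duration, mono, 8 kHz sampling rate, 8-bit quantization, frame size = 256, overlap = 0, with the first frame starting at the first sample of the audio file.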

Here is an example spectrogram:
[Image: spectrogram example]

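As far as I understand, I need data labels (in my case, pitch labels) paired with the spectrograms so that the CNN can learn from them. My label data contains 250 pitch labels per song. The pitch labels are in semitones (MIDI note numbers).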
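
As a quick sanity check, the dataset specification above already accounts for the 250 labels: with non-overlapping frames of 256 samples, an 8-second clip at 8 kHz contains exactly 250 frames. A minimal sketch of that arithmetic (the variable names are illustrative, not from the dataset tools):

```python
# Sanity check: how many non-overlapping frames fit in one clip?
# The values come from the dataset description above; the variable
# names are illustrative only.
duration_s = 8        # clip length in seconds
sample_rate = 8000    # samples per second (8 kHz)
frame_size = 256      # samples per frame
overlap = 0           # samples shared between consecutive frames

total_samples = duration_s * sample_rate
hop = frame_size - overlap
num_frames = (total_samples - frame_size) // hop + 1

print(total_samples, num_frames)
```

So each pitch label plausibly corresponds to one analysis frame. Note that plt.specgram() defaults to NFFT=256 but noverlap=128, so the spectrogram would only have one time column per label if noverlap=0 is passed explicitly.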

Here are the pitch labels for the spectrogram above. To simplify the computation, I applied math.floor() to the pitch values from the original file.

Pitch values:  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 48, 50, 50, 52, 52, 53, 54, 54, 53, 53, 53, 53, 54, 54, 54, 54, 54, 53, 0, 0, 54, 54, 54, 54, 54, 54, 53, 0, 0, 0, 0, 46, 46, 46, 47, 48, 48, 48, 48, 48, 49, 49, 49, 50, 50, 50, 50, 0, 0, 0, 50, 50, 50, 50, 50, 50, 50, 49, 0, 0, 51, 47, 47, 47, 47, 47, 47, 47, 47, 48, 49, 50, 0, 0, 0, 0, 0, 0, 0, 0, 0, 58, 57, 58, 58, 58, 58, 58, 57, 57, 57, 57, 57, 57, 57, 58, 58, 57, 57, 57, 57, 56, 55, 55, 56, 56, 56, 56, 56, 55, 56, 56, 56, 56, 55, 55, 55, 56, 56, 54, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 52, 52, 52, 53, 53, 54, 54, 0, 0, 54, 54, 54, 54, 54, 54, 54, 54, 0, 0, 0, 54, 54, 54, 54, 54, 53, 52, 51, 47, 47, 47, 47, 47, 47, 47, 47, 47, 48, 49, 49, 49, 49, 50, 50, 50, 50, 49, 0, 0, 0, 50, 49, 49, 49, 49, 49, 50, 50, 49, 0, 0, 47, 47, 47, 48, 48, 48, 48, 0, 0, 0, 0, 0, 0, 0]

My question is: how should I combine each spectrogram with its pitch labels before the CNN processes them in Python?


1 answer
User
Answer 1 · Posted 2024-10-02 22:37:29

I have solved my problem. It works like this:

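    import os
    import time

    import cv2
    import numpy as np

    # image_path, label_path, resize_recolor_spectrogram() and
    # extract_pitch_label() are not shown here.
    image_data = []
    tm = time.time()  # start time, useful for measuring how long loading takes
    for img_item in os.listdir(image_path):  # for every image in the folder
        img_array = cv2.imread(os.path.join(image_path, img_item))

        # convert the image to grayscale and resize it to 250 x 160
        spectrogram_preprocessing = resize_recolor_spectrogram(img_array)

        # make sure the preprocessed image is a NumPy array
        spectrogram_preprocessing = np.array(spectrogram_preprocessing)

        # read the pitch labels that belong to this spectrogram
        label = extract_pitch_label(os.path.join(label_path, img_item))

        # pair the image with its label
        image_data.append([spectrogram_preprocessing, label])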
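
Once image_data holds [image, label] pairs, a CNN library will typically want the images and labels as two separate arrays. The following is a minimal sketch of that step, not part of the original solution. It assumes each preprocessed spectrogram is a grayscale array of shape (160, 250) and each label is a list of 250 pitch values; the random data only stands in for real files.

```python
import numpy as np

# Stand-in data: 4 fake grayscale spectrograms (160 rows x 250 columns),
# each paired with 250 pitch labels. Shapes are assumptions for illustration.
rng = np.random.default_rng(0)
image_data = [
    [rng.integers(0, 256, size=(160, 250), dtype=np.uint8),
     rng.integers(0, 60, size=250).tolist()]
    for _ in range(4)
]

# Split the pairs into an image array X and a label array y.
X = np.stack([img for img, _ in image_data]).astype(np.float32) / 255.0
X = X[..., np.newaxis]  # add a channel axis: (N, 160, 250, 1)
y = np.array([label for _, label in image_data], dtype=np.int64)  # (N, 250)

print(X.shape, y.shape)
```

Scaling pixel values to the range 0 to 1 and adding a trailing channel axis are common conventions for image input, but the exact layout depends on the framework being used.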
