Conditions on elements of numpy arrays of different sizes

words = [
    {'id': 0, 'word': 'Stu', 'sampleStart': 882, 'sampleEnd': 40571},
    {'id': 0, 'word': ' ', 'sampleStart': 40570, 'sampleEnd': 44540},
    {'id': 0, 'word': 'eyes', 'sampleStart': 44541, 'sampleEnd': 66590},
]

phonemes = [
    {'id': 0, 'phoneme': ' ', 'sampleStart': 0, 'sampleEnd': 881},
    {'id': 1, 'phoneme': 's', 'sampleStart': 882, 'sampleEnd': 7937},
    {'id': 2, 'phoneme': 't', 'sampleStart': 7938, 'sampleEnd': 11906},
    {'id': 3, 'phoneme': 'u', 'sampleStart': 11907, 'sampleEnd': 15433},
    {'id': 3, 'phoneme': ' ', 'sampleStart': 15434, 'sampleEnd': 47627},
    {'id': 3, 'phoneme': 'eye', 'sampleStart': 47628, 'sampleEnd': 57770},
    {'id': 3, 'phoneme': 's', 'sampleStart': 57771, 'sampleEnd': 66590},
]

associatedData = []

for w in words:
    startWord = w['sampleStart']
    endWord = w['sampleEnd']
    word = w['word']
    w_id = w['id']

    for p in phonemes:
        startPhoneme = p['sampleStart']
        endPhoneme = p['sampleEnd']
        phoneme = p['phoneme']
        p_id = p['id']

        if startPhoneme >= startWord and endPhoneme <= endWord:
            # I need to relate this data as it comes from 2 different sources
            # Some computations occur here that are too long to reproduce here,
            # this multiplication is just to give an example
            mult = startPhoneme * startWord
            associatedData.append({'w_id': w_id, 'p_id': p_id, 'word': word, 'phoneme': phoneme, 'someOp': startWord})

# Gather associated data for later use:
print(associatedData)

1 Answer

User

#1 · Posted on 2024-10-02 08:17:08

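Checking every phoneme against every word is wasteful: it does more work than necessary. For any number of words and phonemes, that approach always performs len(words) * len(phonemes) comparisons. Vectorization can speed it up, but it is better to reduce the complexity itself.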
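To make the cost concrete, here is a minimal sketch that counts how many pairs the nested loop inspects. The function name `count_nested_comparisons` and the toy intervals are made up for illustration; they are not the question's data.

```python
def count_nested_comparisons(words, phonemes):
    """Count the (word, phoneme) pairs a nested loop inspects, and how many match."""
    comparisons = 0
    matches = 0
    for w in words:
        for p in phonemes:
            comparisons += 1
            # Same containment test as in the question.
            if p["sampleStart"] >= w["sampleStart"] and p["sampleEnd"] <= w["sampleEnd"]:
                matches += 1
    return comparisons, matches


# Hypothetical toy intervals: 2 words, 4 phonemes.
toy_words = [{"sampleStart": 0, "sampleEnd": 9}, {"sampleStart": 10, "sampleEnd": 19}]
toy_phonemes = [{"sampleStart": s, "sampleEnd": s + 4} for s in (0, 5, 10, 15)]

comparisons, matches = count_nested_comparisons(toy_words, toy_phonemes)
print(comparisons, matches)  # prints: 8 4
```

Every pair is inspected (2 * 4 = 8) even though only 4 pairs match, and the gap grows with the product of the two list lengths.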

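For each word, try to look at only a few candidate phonemes. One solution is to keep a pointer to the current phoneme. For each new word, iterate only over the range of matching phonemes, locally around the current phoneme pointer.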

Pseudocode solution:

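# sort both lists by start sample (skip if already sorted)
words = sorted(words, key=lambda x: x["sampleStart"])
phonemes = sorted(phonemes, key=lambda x: x["sampleStart"])

phoneme_idx = 0
for w in words:

    # go back until the earliest relevant phoneme (needed when words overlap)
    while phoneme_idx > 0 and phonemes[phoneme_idx - 1]["sampleStart"] >= w["sampleStart"]:
        phoneme_idx -= 1

    # skip phonemes that start before this word
    while phoneme_idx < len(phonemes) and phonemes[phoneme_idx]["sampleStart"] < w["sampleStart"]:
        phoneme_idx += 1

    # evaluate all phonemes that start inside this word
    while phoneme_idx < len(phonemes) and phonemes[phoneme_idx]["sampleStart"] <= w["sampleEnd"]:
        # match and compute only if the phoneme also ends inside the word;
        # evaluate() is a placeholder for the per-pair computation
        if phonemes[phoneme_idx]["sampleEnd"] <= w["sampleEnd"]:
            evaluate(phonemes[phoneme_idx], w)
        phoneme_idx += 1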
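For reference, a runnable sketch of the pointer idea applied to the sample data from the question. The function name `associate` and the variable `first` are introduced here for illustration, and the `someOp` field simply copies the word's start sample as in the question, standing in for the real computation that the question leaves out.

```python
def associate(words, phonemes):
    """Pair each word with the phonemes whose sample range lies inside it."""
    words = sorted(words, key=lambda x: x["sampleStart"])
    phonemes = sorted(phonemes, key=lambda x: x["sampleStart"])
    associated = []
    first = 0  # index of the first phoneme that can still match a word
    for w in words:
        # Phonemes starting before this word cannot fit in it or in any later word.
        while first < len(phonemes) and phonemes[first]["sampleStart"] < w["sampleStart"]:
            first += 1
        # Scan forward locally; `first` stays put so overlapping words still work.
        i = first
        while i < len(phonemes) and phonemes[i]["sampleStart"] <= w["sampleEnd"]:
            p = phonemes[i]
            if p["sampleEnd"] <= w["sampleEnd"]:
                associated.append({
                    "w_id": w["id"], "p_id": p["id"],
                    "word": w["word"], "phoneme": p["phoneme"],
                    "someOp": w["sampleStart"],  # placeholder for the real computation
                })
            i += 1
    return associated


words = [
    {"id": 0, "word": "Stu", "sampleStart": 882, "sampleEnd": 40571},
    {"id": 0, "word": " ", "sampleStart": 40570, "sampleEnd": 44540},
    {"id": 0, "word": "eyes", "sampleStart": 44541, "sampleEnd": 66590},
]
phonemes = [
    {"id": 0, "phoneme": " ", "sampleStart": 0, "sampleEnd": 881},
    {"id": 1, "phoneme": "s", "sampleStart": 882, "sampleEnd": 7937},
    {"id": 2, "phoneme": "t", "sampleStart": 7938, "sampleEnd": 11906},
    {"id": 3, "phoneme": "u", "sampleStart": 11907, "sampleEnd": 15433},
    {"id": 3, "phoneme": " ", "sampleStart": 15434, "sampleEnd": 47627},
    {"id": 3, "phoneme": "eye", "sampleStart": 47628, "sampleEnd": 57770},
    {"id": 3, "phoneme": "s", "sampleStart": 57771, "sampleEnd": 66590},
]

pairs = [(d["word"], d["phoneme"]) for d in associate(words, phonemes)]
print(pairs)
```

On this data the sketch yields the same five word/phoneme pairs as the nested loop in the question. Each word only inspects the phonemes that start inside its own range, so the work is roughly proportional to len(words) + len(phonemes) when words barely overlap.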
