Creating sublists from a list of strings with a given maximum word count

q1 = ['blended e learning forumin planning', 'difficulties of learning as forigen language', 'difficulties of grammar']
q2 = ['difficulties of grammar', 'students difficulties in grammar', 'difficulties of english grammar']

qq = ['blended e learning forumin planning', 'difficulties of learning as forigen language', 'difficulties of grammar', 'students difficulties in grammar', 'difficulties of english grammar']
psz = 0
pi = 0
msz = 16
subqq = list()
qq_i = list()
for i in range(len(qq)):
    csz = psz + len(qq[i].split())
    if (csz > msz):
        subqq.append(qq_i.copy())
        qq_i.clear()
        qq_i.append(qq[i-1])
        qq_i.append(qq[i])
        psz = 0
    else:
        qq_i.append(qq[i])
        psz += len(qq[i].split())
subqq.append(qq_i)

3 Answers

User

Answer 1 · edited 2024-09-27 21:28:48

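Since others have already provided working solutions, here is another interesting approach. It relies on a mapping between each element of qq and the cumulative word count up to and including that element.

First, create the mapping dict: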

qq_map = {q: len(" ".join(qq[:n+1]).split()) for n, q in enumerate(qq)}

# {'blended e learning forumin planning': 5, 'difficulties of learning as forigen language': 11,
# 'difficulties of grammar': 14, 'students difficulties in grammar': 18,
# 'difficulties of english grammar': 22}
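
The same running totals can be computed in a single pass with itertools.accumulate, which avoids re-joining the prefix of the list for every element. This is only a minimal sketch, assuming the same qq list as in the question; the name cumulative is introduced here for illustration and is not part of the original answer.

```python
from itertools import accumulate

qq = [
    'blended e learning forumin planning',
    'difficulties of learning as forigen language',
    'difficulties of grammar',
    'students difficulties in grammar',
    'difficulties of english grammar',
]

# Running total of words after each phrase (hypothetical name).
cumulative = list(accumulate(len(q.split()) for q in qq))
print(cumulative)  # [5, 11, 14, 18, 22]
```

Keeping the totals in a list indexed by position, rather than in a dict keyed by phrase, also behaves sensibly if the same phrase appears more than once in the input.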

Then use the mapping to build the list of groups:

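# Group i holds cumulative counts i*16 + 1 .. (i+1)*16, so a total of exactly 16 words stays in the first group
qq = [[q for q in qq if qq_map[q] in range(i*16 + 1, (i+1)*16 + 1)]
      for i in range(-(-qq_map[qq[-1]] // 16))]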

# [['blended e learning forumin planning', 'difficulties of learning as forigen language', 'difficulties of grammar'], 
# ['students difficulties in grammar', 'difficulties of english grammar']]

Note: -(-qq_map[qq[-1]] // 16) is equivalent to math.ceil(qq_map[qq[-1]] / 16). You can use the math.ceil form instead if you prefer a more readable, less 'arithmetic' expression.
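
The equivalence in the note can be checked quickly. The snippet below is a small sanity check rather than part of the answer; the name groups_needed and the total of 22 words (the last cumulative count shown above) are used purely for illustration.

```python
import math

# Ceiling division written as floor division of the negated numerator,
# compared against math.ceil for a range of positive totals.
for total in range(1, 100):
    assert -(-total // 16) == math.ceil(total / 16)

# 22 words in total with at most 16 per group needs 2 groups.
groups_needed = -(-22 // 16)
print(groups_needed)  # 2
```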

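Finally, process the list once more so that the last string of each group is prepended to the next group (the first group, of course, gets nothing prepended):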

qq = [[qq[i-1][-1]] + qq[i] if i != 0 else qq[i] for i in range(len(qq))]

# [['blended e learning forumin planning', 'difficulties of learning as forigen language', 'difficulties of grammar'], 
# ['difficulties of grammar', 'students difficulties in grammar', 'difficulties of english grammar']]

User
Answer 2 · edited 2024-09-27 21:28:48

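This is what I came up with. It is a similar algorithm to yours and to SorousH Bakhtiary's answer, but it should be free of word-count errors, and I think it is easier to read.
It also raises an error when a new sublist is started with the last phrase of the previous sublist and the next phrase still cannot be added without exceeding the word limit. That happens, for example, when two consecutive phrases each have more than 8 words. If you can be sure that will never occur, you can omit that part.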
def count_words(phrase):
    return len(phrase.split())

def sublists_with_max_words(main_list, max_words=16):
    output_sublists = []
    current_sublist = []
    current_sublist_words = 0

    for phrase in main_list:
        words_in_phrase = count_words(phrase)

        if (current_sublist_words + words_in_phrase) > max_words:
            # If we cannot add the phrase to the sublist without breaking
            # the word limit, then add the sublist to the output
            output_sublists.append(current_sublist)

            # Start a new sublist with the last phrase we added
            last_phrase = current_sublist[-1]
            current_sublist = [last_phrase]
            current_sublist_words = count_words(last_phrase)

            # If we cannot add the phrase to the new sublist either, then raise
            # an exception as we cannot continue without breaking the word limit
            if (current_sublist_words + words_in_phrase) > max_words:
                raise ValueError(
                    f"Cannot add '{phrase}' ({words_in_phrase} words) to a new"
                    f" sublist with {current_sublist_words} words"
                )

        # Add the current phrase to the sublist
        current_sublist.append(phrase)
        current_sublist_words += words_in_phrase

    # At the end of the loop, add the working sublist to the output
    output_sublists.append(current_sublist)

    return output_sublists

print(sublists_with_max_words(qq))
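
To see when the error path is taken, here is a condensed restatement of the same greedy idea together with an input that triggers it. This is a sketch under stated assumptions, not the answer's own code: the name split_with_overlap, the sample phrases, and the extra guard for a first phrase that is already too long are all introduced here for illustration.

```python
def split_with_overlap(phrases, max_words=16):
    """Greedy grouping; each new group starts with the previous group's last phrase."""
    groups, current, words = [], [], 0
    for phrase in phrases:
        n = len(phrase.split())
        if words + n > max_words:
            if not current:
                # Extra guard (assumption): a single phrase longer than the limit.
                raise ValueError(f"cannot fit {phrase!r} in an empty group")
            groups.append(current)
            # Carry the last phrase over into the new group.
            current = [current[-1]]
            words = len(current[0].split())
            if words + n > max_words:
                raise ValueError(f"cannot fit {phrase!r} after the carried-over phrase")
        current.append(phrase)
        words += n
    groups.append(current)
    return groups


# Two consecutive 9-word phrases: 9 + 9 > 16, so the second phrase cannot
# share a group with the carried-over first phrase.
long_phrases = [
    'one two three four five six seven eight nine',
    'ten eleven twelve thirteen fourteen fifteen sixteen seventeen eighteen',
]
try:
    split_with_overlap(long_phrases)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```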

User
Answer 3 · edited 2024-09-27 21:28:48

Mine is similar to yours and the algorithm is essentially the same, but I think it should run slightly faster:

def fn(lst, n):
    word_count = 0
    res = []
    temp_lst = []

    for item in lst:
        len_current_item = len(item.split())
        word_count += len_current_item

        if word_count <= n:  # a sublist may hold exactly n words
            temp_lst.append(item)

        else:
            res.append(temp_lst)
            last_item = res[-1][-1]
            temp_lst = [last_item, item]
            word_count = len_current_item + len(last_item.split())

    res.append(temp_lst)

    # Checking the last item's length, as Phydeaux pointed out in the comments.
    if word_count > n:
        res.append([temp_lst.pop()])

    return res

Output:

['blended e learning forumin planning', 'difficulties of learning as forigen language', 'difficulties of grammar']
['difficulties of grammar', 'students difficulties in grammar', 'difficulties of english grammar']

I tried to avoid the copy and clear calls, and made a few other minor changes.
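
Any of the approaches in this thread can be sanity-checked with a small validator. The sketch below assumes the two properties implied by the expected q1 and q2 in the question: no sublist exceeds the word limit, and each sublist after the first starts with the last phrase of the previous one. The name check_sublists is hypothetical.

```python
def check_sublists(sublists, max_words=16):
    """Return True if every sublist fits the limit and overlaps its predecessor by one phrase."""
    for i, sub in enumerate(sublists):
        if not sub:
            return False
        # Total number of words across all phrases in this sublist.
        if sum(len(p.split()) for p in sub) > max_words:
            return False
        # Each later sublist must begin with the previous sublist's last phrase.
        if i > 0 and sub[0] != sublists[i - 1][-1]:
            return False
    return True


expected = [
    ['blended e learning forumin planning',
     'difficulties of learning as forigen language',
     'difficulties of grammar'],
    ['difficulties of grammar',
     'students difficulties in grammar',
     'difficulties of english grammar'],
]
print(check_sublists(expected))  # True
```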
