Dynamically changing the groupby key

Posted 2024-06-25 22:32:24


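I need to split a sorted list of probabilities into groups. The first group contains the probabilities in (0.5, 1), the second those in (0.25, 0.5), and so on.

I have written some code that splits a list of negative powers of two (all less than 1) into two lists: one whose members are greater than or equal to 0.5, and one whose members are less than 0.5.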

from itertools import groupby
import doctest

N = 10

# Negative powers of two: 0.5, 0.25, ..., 2**-10
twos = [2 ** -(i + 1) for i in range(N)]

def split_by_prob(items, cutoff):
    """
    (list of float, float) -> list of (list of float)
    Splits a sorted list into consecutive runs, breaking where x < cutoff changes.
    >>> split_by_prob(twos, 0.5)
    [[0.5], [0.25, 0.125, 0.0625, 0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625]]
    """
    groups = []
    # groupby starts a new group each time the key (x < cutoff) changes
    for _, g in groupby(items, key=lambda x: x < cutoff):
        groups.append(list(g))
    return groups

if __name__ == "__main__":
    doctest.testmod()

Calling this code from the command line does exactly what I expect:

>>> g = split_by_prob(twos, 0.5)
>>> g
[[0.5], [0.25, 0.125, 0.0625, 0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625]]

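My question is: how can I change the cutoff on each iteration? That is, if I pass the function a list of cutoffs (for example cutoffs = [0.5, 0.125, 0.0625]), I want to get back a list of lists, each holding the members of the original list that fall into the corresponding category. In this example, the returned groups would look like [[0.5], [0.25, 0.125], [0.0625], [0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625]]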


2 answers

If I understand correctly, you can loop over a list of cutoffs and use x < i as the key for each i in cutoffs.

cutoffs = [0.5, 0.125, 0.0625]

def split_by_prob(items, cutoffs):
    """
    (list of float, list of float) -> list of (list of float)
    Splits the full list once per cutoff and collects every resulting group.
    """
    groups = []
    for i in cutoffs:
        # Each pass regroups the whole of items around one cutoff
        for _, g in groupby(items, key=lambda x: x < i):
            groups.append(list(g))
    return groups

print(split_by_prob(twos, cutoffs))


[[0.5], [0.25, 0.125, 0.0625, 0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625], [0.5, 0.25, 0.125], [0.0625, 0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625], [0.5, 0.25, 0.125, 0.0625], [0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625]]
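
The output above contains six lists rather than four because every cutoff is applied to the whole list again, so each item shows up once per cutoff. A minimal sketch that makes this visible by printing the group sizes per cutoff; it assumes Python 3 and a descending-sorted input, and `two_way_splits` is a hypothetical helper name, not part of the answer:

```python
from itertools import groupby

def two_way_splits(items, cutoffs):
    """Hypothetical helper: one groupby pass over the full list per cutoff."""
    return [
        [list(g) for _, g in groupby(items, key=lambda x: x < c)]
        for c in cutoffs
    ]

twos = [2 ** -(i + 1) for i in range(10)]
splits = two_way_splits(twos, [0.5, 0.125, 0.0625])

# Each cutoff yields its own (at-or-above, below) pair covering the whole
# list, so every item appears once per cutoff instead of once overall.
for pair in splits:
    print([len(group) for group in pair])
# prints [1, 9], then [3, 7], then [4, 6]
```

To get disjoint groups, each later cutoff has to be applied only to what is left over from the previous split.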

I have figured out what I need to do; the complete code is below. I am not sure how efficient or how good it is, though:

from itertools import groupby
import doctest

N = 10

twos = [2 ** -(i + 1) for i in range(N)]
cutoffs = [0.5, 0.125, 0.03125]

def split_by_prob(items, cutoff):
    """
    (list of float, float) -> list of (list of float)
    Splits a sorted list into consecutive runs, breaking where x < cutoff changes.
    >>> split_by_prob(twos, 0.5)
    [[0.5], [0.25, 0.125, 0.0625, 0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625]]
    """
    groups = []
    for _, g in groupby(items, key=lambda x: x < cutoff):
        groups.append(list(g))
    return groups

def split_into_groups(items, cutoffs):
    """
    (list of float, list of float) -> list of (list of float)
    Splits a sorted list into one group per cutoff plus the remainder.
    >>> split_into_groups(twos, cutoffs)
    [[0.5], [0.25, 0.125], [0.0625, 0.03125], [0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625]]
    """
    remaining = items
    final = []
    for cutoff in cutoffs:
        # Assumes each cutoff splits the remaining items into exactly two runs
        groups = split_by_prob(remaining, cutoff)
        final.append(groups[0])
        remaining = groups.pop()
    final.append(remaining)
    return final

if __name__ == "__main__":
    doctest.testmod()
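
On the efficiency question, one possible alternative is a single groupby pass whose key is the number of cutoffs lying above each item, so the key itself changes as the items cross each cutoff. This is only a sketch, assuming Python 3 and that items is already sorted in descending order; `split_by_cutoffs` and `bucket` are hypothetical names, not from the posts above:

```python
from bisect import bisect_right
from itertools import groupby

def split_by_cutoffs(items, cutoffs):
    """Hypothetical helper: split descending-sorted items at each cutoff."""
    ascending = sorted(cutoffs)

    def bucket(x):
        # bisect_right counts the cutoffs <= x, so the difference is the
        # number of cutoffs strictly greater than x (0 means x >= all of them).
        return len(ascending) - bisect_right(ascending, x)

    # groupby only merges consecutive items, so items must already be sorted.
    return [list(group) for _, group in groupby(items, key=bucket)]

twos = [2 ** -(i + 1) for i in range(10)]
print(split_by_cutoffs(twos, [0.5, 0.125, 0.03125]))
# [[0.5], [0.25, 0.125], [0.0625, 0.03125], [0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625]]
```

Unlike the repeated two-way split, this version never emits an empty or duplicated group when a cutoff happens to separate nothing, because groupby only yields keys that actually occur.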
