Choosing among other iterators, based on a condition, inside a Python iterator

Posted 2024-06-02 22:44:32


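In Python, I have an iterator, called Sampler, that returns an infinite stream of indices within a fixed range [0, N]. In fact I have a list of them, and all they do is return indices within the ranges [0, N_0], [N_0, N_1], ..., [N_{n-1}, N_n].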

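What I want to do now is first pick one of these iterators with probability proportional to the length of its range. So I have a weights list [N_0, N_1 - N_0, ...] and I choose one of them: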

    iterator_idx = random.choices(range(len(weights)), weights=weights/weights.sum())[0]
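For illustration, here is a self-contained sketch of that weighted pick. It assumes the range boundaries live in a plain Python list; the name boundaries and its values are made up for the example. Note that random.choices accepts unnormalized weights, so dividing by the sum is optional.

```python
import random

# Hypothetical range boundaries N_0, N_1, N_2 (made up for illustration).
boundaries = [100, 250, 300]

# Range lengths: [N_0, N_1 - N_0, N_2 - N_1].
weights = [boundaries[0]] + [
    hi - lo for lo, hi in zip(boundaries, boundaries[1:])
]

# random.choices normalizes the weights internally and returns a list,
# hence the [0] to get the single chosen index.
iterator_idx = random.choices(range(len(weights)), weights=weights)[0]
```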

Next, I want to create an iterator that randomly picks one of these iterators and draws a batch of M samples from it:

import random

class BatchSampler:
    def __init__(self, M):
        self.M = M
        self.weights = [weight_list]

        self.samplers = [list_of_iterators]
        self._batch_samplers = [
            self.batch_sampler(sampler) for sampler in self.samplers
        ]

    def batch_sampler(self, sampler):
        batch = []
        for batch_idx in sampler:
            batch.append(batch_idx)
            if len(batch) == self.M:
                yield batch
                batch = []  # start a fresh batch after yielding a full one

        if len(batch) > 0:
            yield batch

    def __iter__(self):
        # First select one of the datasets.
        iterator_idx = random.choices(
            range(len(self.weights)), weights=self.weights / self.weights.sum()
        )[0]
        return self._batch_samplers[iterator_idx]

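The problem is that __iter__() seems to be called only once, so iterator_idx is chosen only the first time. Obviously this is wrong... What is the way to fix this?

This situation can come up in PyTorch when you have multiple datasets but want each batch to be sampled from only one of them.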


1 answer
User
Answer 1 · Posted 2024-06-02 22:44:32

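It seems to me that you want to define your own container type.
I'll try to give examples of a few standard ways to do that
(hopefully without leaving out too many details).
You should be able to reuse one of these simple examples
in your own class.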


Using just __getitem__ (supports indexing and looping):

object.__getitem__

Called to implement evaluation of self[key].


class MyContainer:
  def __init__(self, sequence):
    self.elements = sequence  # Just something to work with.
  
  def __getitem__(self, key):
    # If we're delegating to sequences like built-in list, 
    # invalid indices are handled automatically by them 
    # (throwing IndexError, as per the documentation).
    return self.elements[key]

t = (1, 2, 'a', 'b')
c = MyContainer(t)
elems = [e for e in c]
assert elems == [1, 2, 'a', 'b']
assert c[1:-1] == t[1:-1] == (2, 'a')
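The looping works because, when a class defines no __iter__, Python falls back to calling __getitem__ with 0, 1, 2, ... until an IndexError is raised. A small sketch of that fallback, using a hypothetical Squares class that does not delegate to a built-in sequence and therefore has to raise IndexError itself:

```python
class Squares:
    """Hypothetical container: squares of 0..n-1, computed on demand."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, index):
        # Only non-negative integer indices are handled in this sketch.
        if not 0 <= index < self.n:
            raise IndexError(index)
        return index * index


# With no __iter__ defined, iteration calls __getitem__(0), __getitem__(1), ...
# and stops at the first IndexError.
squares = list(Squares(4))
```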


Using the iterator protocol:

object.__iter__

object.__iter__(self)
This method is called when an iterator is required for a container. This method should return a new iterator object that can iterate over all the objects in the container. For mappings, it should iterate over the keys of the container.
Iterator objects also need to implement this method; they are required to return themselves. For more information on iterator objects, see Iterator Types.

Iterator Types

container.__iter__()
Return an iterator object. The object is required to support the iterator protocol described below.

The iterator objects themselves are required to support the following two methods, which together form the iterator protocol:

iterator.__iter__()
Return the iterator object itself. This is required to allow both containers and iterators to be used with the for and in statements.

iterator.__next__()
Return the next item from the container. If there are no further items, raise the StopIteration exception.

Once an iterator's __next__() method raises StopIteration, it must continue to do so on subsequent calls.


class MyContainer:
  class Iter:
    def __init__(self, container):
      self.cont = container
      self.pos = 0
      self.len = len(container.elements)
    
    def __iter__(self): return self
    def __next__(self):
      if self.pos == self.len: raise StopIteration
      curElem = self.cont.elements[self.pos]
      self.pos += 1
      return curElem
  
  def __init__(self, sequence):
    self.elements = sequence  # Just something to work with.
  
  def __iter__(self):
    return MyContainer.Iter(self)

t = (1, 2, 'a', 'b')
c = MyContainer(t)
elems = [e for e in c]
assert elems == [1, 2, 'a', 'b']
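This also bears on the behaviour described in the question: a for loop calls the container's __iter__ exactly once, at the start, and then only calls __next__ on whatever was returned. The sketch below makes that visible with a hypothetical CountingContainer class (the class and its iter_calls counter are invented for the demonstration):

```python
class CountingContainer:
    """Hypothetical container that counts how often __iter__ is called."""

    def __init__(self, sequence):
        self.elements = sequence
        self.iter_calls = 0

    def __iter__(self):
        self.iter_calls += 1
        return iter(self.elements)


c = CountingContainer((1, 2, 'a', 'b'))
for _ in c:
    pass
calls_after_loop = c.iter_calls  # one loop, one __iter__ call

it = iter(c)
same_object = iter(it) is it     # an iterator returns itself from __iter__
```

Any decision made inside __iter__ is therefore made once per loop, not once per item.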


Using a generator:

Generator Types

Python's generators provide a convenient way to implement the iterator protocol. If a container object's __iter__() method is implemented as a generator, it will automatically return an iterator object (technically, a generator object) supplying the __iter__() and __next__() methods.

generator

A function which returns a generator iterator. It looks like a normal function except that it contains yield expressions for producing a series of values usable in a for-loop or that can be retrieved one at a time with the next() function.
Usually refers to a generator function, but may refer to a generator iterator in some contexts.

generator iterator

An object created by a generator function.

6.2.9. Yield expressions

Using a yield expression in a function's body causes that function to be a generator


class MyContainer:
  def __init__(self, sequence):
    self.elements = sequence  # Just something to work with.
  
  def __iter__(self):
    for e in self.elements: yield e

t = (1, 2, 'a', 'b')
c = MyContainer(t)
elems = [e for e in c]
assert elems == [1, 2, 'a', 'b']
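Applied to the question, the generator approach lets the random choice move inside the loop, so a sampler is drawn again for every batch rather than once per call to __iter__. The following is only a sketch under assumptions, not the asker's original code: the constructor signature (samplers, weights, M) is invented here, and itertools.cycle objects stand in for the real Sampler iterators.

```python
import itertools
import random


class BatchSampler:
    def __init__(self, samplers, weights, M):
        self.samplers = samplers
        self.weights = weights
        self.M = M

    def __iter__(self):
        # One persistent iterator per sampler, so each resumes where it stopped.
        iterators = [iter(s) for s in self.samplers]
        while True:
            # Re-draw the sampler for every batch, not once per loop.
            idx = random.choices(range(len(iterators)), weights=self.weights)[0]
            batch = list(itertools.islice(iterators[idx], self.M))
            if not batch:
                return  # the chosen sampler is exhausted
            yield batch


# Stand-in samplers: endless index streams over [0, 10) and [10, 30).
samplers = [itertools.cycle(range(0, 10)), itertools.cycle(range(10, 30))]
batch_sampler = BatchSampler(samplers, weights=[10, 20], M=4)

for batch in itertools.islice(batch_sampler, 5):
    print(batch)
```

Because the samplers here are infinite, the loop over batch_sampler never ends on its own; the caller limits it, for example with itertools.islice as above.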
