TensorFlow: sampling integers according to a probability distribution

1 answer

User

#1 · Posted 2024-06-24 12:32:26

There are a few issues to consider here.

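If you run this code on a GPU, it is very unlikely to work: GPUs are designed for fast computation rather than for storage, so a GPU typically has less memory available than the host CPU. However, this code can also raise a memory error on the CPU, as it did on my machine. So that is the first problem to overcome.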

Overcoming the CPU memory error:

The line that raises the MemoryError is line 1 itself:

    In [1]: frequency = np.random.choice([i for i in range(10**7)], 16**10,
       ...:                              p=[0.0000001 for i in range(10**7)])
    ---------------------------------------------------------------------------
    MemoryError                               Traceback (most recent call last)

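The reason is that the output of line 1 does not have 10**7 elements but 16**10, roughly 1.1 × 10**12. Since that is what causes the MemoryError, the goal should be to never create an array that large.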
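To make the scale concrete, here is a rough estimate of the memory involved. It assumes 8 bytes per element (NumPy's usual 64-bit integer); the real figure depends on the dtype and platform.

```python
# Back-of-the-envelope memory estimate for the requested sample.
sample_size = 16 ** 10      # number of draws requested
bytes_per_item = 8          # assumption: 64-bit integers

total_bytes = sample_size * bytes_per_item
tib = total_bytes / 2 ** 40

# For comparison: one count per possible value (10**7 of them).
population_bytes = 10 ** 7 * bytes_per_item

print(f"{sample_size:,} draws need about {tib:.0f} TiB")
print(f"a 10**7-element frequency array needs about {population_bytes / 10**6:.0f} MB")
```

Under that assumption the full sample needs terabytes, while the frequency array fits comfortably in ordinary RAM.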

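To do that, we shrink the sample by some factor and loop factor times over the resulting chunks, so that each chunk fits in memory. On my machine a factor of 1000000 works. Once a chunk has been sampled, we use Counter to turn it into a frequency dictionary. The advantage is that the frequency dictionary, when converted to a list or a NumPy array, can never exceed 10**7 entries, which does not cause a memory error.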

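Because some elements may be missing from any given chunk, we do not convert each chunk's Counter straight into a list. Instead, we merge it into the dictionary carried over from the previous iterations, so that the frequency accumulated for every element is preserved.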
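A toy-sized sketch of that accumulation step is shown here. The population size, chunk size and chunk count are illustrative stand-ins, not the values from the question, and the seeded generator is only there to make the run repeatable.

```python
from collections import Counter

import numpy as np

rng = np.random.default_rng(0)  # seeded only for repeatability

total_elements = 10   # toy population, instead of 10**7
chunk_size = 5        # toy chunk, instead of 16**10 / factor
n_chunks = 4

running = Counter()
for _ in range(n_chunks):
    chunk = rng.integers(0, total_elements, size=chunk_size)
    # Counter.update adds to the existing counts; elements missing from
    # this chunk keep the totals accumulated so far.
    running.update(chunk.tolist())

# Every draw from every chunk is accounted for.
print(sum(running.values()))

# Contrast: a plain dict's update() overwrites instead of adding.
plain = {}
plain.update(Counter([3, 3]))
plain.update(Counter([3]))
print(plain[3])  # only the latest chunk's count survives
```

The contrast at the end is the reason the running total should itself be a Counter rather than a plain dict.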

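After the whole loop has finished, we convert the resulting dictionary into an array. I added a progressbar to track progress, because the computation can take a long time. Also, in your particular case there is no need to pass the p argument to np.random.choice(), because the distribution is uniform.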

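import numpy as np
import tensorflow as tf

from click import progressbar
from collections import Counter

def large_uniform_sample_frequencies(factor=1000000, total_elements=10**7, sample_size=16**10):
    # Split the sample into `factor` chunks; the first `remainder` chunks
    # get one extra draw so the sizes add up to sample_size exactly
    chunk_size, remainder = divmod(sample_size, factor)

    # Initialise an empty Counter (not a plain dict) so that
    # update() adds counts instead of overwriting them
    counter = Counter()

    # Initialising progressbar
    with progressbar(range(factor)) as bar:
        for iteration in bar:
            # Generate a uniform random sample of about (16 ** 10) / factor integers
            size = chunk_size + (1 if iteration < remainder else 0)
            frequency = np.random.choice(total_elements, size)

            # Update the running frequencies
            counter.update(frequency.tolist())

    # Dense array indexed by element; elements never drawn stay at 0
    freqs = np.zeros(total_elements, dtype=np.float32)
    keys = np.fromiter(counter.keys(), dtype=np.int64, count=len(counter))
    freqs[keys] = np.fromiter(counter.values(), dtype=np.float32, count=len(counter))
    return freqs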
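As a possible alternative to the Counter-based function above, the same chunked counting can be sketched with np.bincount, which keeps the running totals in a fixed-size array. This is not the answer's method; the helper name `chunked_frequencies` and its parameters are hypothetical, and the sizes in the usage line are deliberately tiny.

```python
import numpy as np

def chunked_frequencies(total_elements, sample_size, n_chunks, seed=None):
    """Hypothetical helper: count uniform draws chunk by chunk."""
    rng = np.random.default_rng(seed)
    counts = np.zeros(total_elements, dtype=np.int64)
    base, extra = divmod(sample_size, n_chunks)
    for i in range(n_chunks):
        # Spread the remainder over the first `extra` chunks.
        size = base + (1 if i < extra else 0)
        chunk = rng.integers(0, total_elements, size=size)
        # minlength keeps the result aligned with `counts`, even when
        # the largest values do not occur in this chunk.
        counts += np.bincount(chunk, minlength=total_elements)
    return counts

freqs = chunked_frequencies(total_elements=100, sample_size=10_007,
                            n_chunks=10, seed=0)
print(freqs.shape, freqs.sum())
```

Memory use stays bounded by one chunk plus one array of `total_elements` counts, which is the same property the Counter version relies on.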

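Using tensorflow-gpu:

Since you mentioned tensorflow-gpu, I assume that you either want to use tensorflow-gpu to get rid of the MemoryError, or want to run this code alongside tensorflow-gpu while the GPU is in use.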

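To address the MemoryError, you could try the tf.multinomial() function (tf.random.categorical() in newer releases), which has much the same effect as np.random.choice(). It is unlikely to help, though, because the problem is storing data of this size, not performing some other computation.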

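If, for example, you want to run this while training a model, you can use distributed TensorFlow to place this part of the computation graph on the CPU as a PS task, using the code given above. Here is the final code: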

# Mention the devices for PS and worker tasks
ps_dev = '/cpu:0'
worker_dev = '/gpu:0'

# Toggle True to place computation on CPU
# and False to place it on the GPU worker device
is_ps_task = True

# Set device for a PS task (TensorFlow 1.x API); otherwise fall back
# to the worker device so that device_setter is always defined
if is_ps_task:
    device_setter = tf.train.replica_device_setter(worker_device=worker_dev,
        ps_device=ps_dev,
        ps_tasks=1)
else:
    device_setter = worker_dev

# Allocate the computation to the chosen device
with tf.device(device_setter):
    freqs = large_uniform_sample_frequencies()
