TensorFlow:如何记录GPU内存（VRAM）利用率？

1条回答

网友

1楼 · 发布于 2024-05-21 14:36:05

更新，可以使用TensorFlow ops查询分配器：

# maximum across all sessions and .run calls so far
sess.run(tf.contrib.memory_stats.MaxBytesInUse())
# current usage
sess.run(tf.contrib.memory_stats.BytesInUse())

此外，您还可以通过查看RunMetadata，获得有关session.run调用的详细信息，包括在run调用期间正在分配的所有内存。像这样的东西

run_metadata = tf.RunMetadata()
sess.run(c, options=tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE, output_partition_graphs=True), run_metadata=run_metadata)

下面是一个端到端的示例--以列向量、行向量为例，将它们相加，得到加法矩阵：

import tensorflow as tf

no_opt = tf.OptimizerOptions(opt_level=tf.OptimizerOptions.L0,
                             do_common_subexpression_elimination=False,
                             do_function_inlining=False,
                             do_constant_folding=False)
config = tf.ConfigProto(graph_options=tf.GraphOptions(optimizer_options=no_opt),
                        log_device_placement=True, allow_soft_placement=False,
                        device_count={"CPU": 3},
                        inter_op_parallelism_threads=3,
                        intra_op_parallelism_threads=1)
sess = tf.Session(config=config)

with tf.device("cpu:0"):
    a = tf.ones((13, 1))
with tf.device("cpu:1"):
    b = tf.ones((1, 13))
with tf.device("cpu:2"):
    c = a+b

sess = tf.Session(config=config)
run_metadata = tf.RunMetadata()
sess.run(c, options=tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE, output_partition_graphs=True), run_metadata=run_metadata)
with open("/tmp/run2.txt", "w") as out:
  out.write(str(run_metadata))

如果打开run.txt，您将看到如下消息：

  node_name: "ones"

      allocation_description {
        requested_bytes: 52
        allocator_name: "cpu"
        ptr: 4322108320
      }
  ....

  node_name: "ones_1"

      allocation_description {
        requested_bytes: 52
        allocator_name: "cpu"
        ptr: 4322092992
      }
  ...
  node_name: "add"
      allocation_description {
        requested_bytes: 676
        allocator_name: "cpu"
        ptr: 4492163840

因此，这里可以看到a和b分别分配了52个字节（13*4），结果分配了676个字节。

相关问题更多 >

编程相关推荐

热门问题

热门文章

TensorFlow:如何记录GPU内存（VRAM）利用率？

相关问题 更多 >

编程相关推荐

热门问题

热门文章

相关问题更多 >