Segmentation fault when accessing the input values of a custom op

    REGISTER_OP("Interface")
        .Input("pointer_to_grid: int32")
        .Output("current_grid_data: float32")
        .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {
          shape_inference::ShapeHandle input_shape;
          // allow only a 1D pointer address stored in an integer
          TF_RETURN_IF_ERROR(c->WithRank(c->input(0), 0, &input_shape));
          return Status::OK();
        });

    class InterfaceGPU : public OpKernel {
     public:
      explicit InterfaceGPU(OpKernelConstruction* context) : OpKernel(context) {}

      void Compute(OpKernelContext* context) override {
        // Grab the input tensor
        const Tensor& input_tensor = context->input(0);
        const auto input = input_tensor.flat<int32>();

        printf("This works %d \n", input);
        printf("This does not %d \n", input(0));  // Segmentation fault is here
        //...
      }
    };

    REGISTER_KERNEL_BUILDER(Name("GridPointerInterface").Device(DEVICE_GPU), InterfaceGPU);

    import tensorflow as tf
    import numpy as np
    import sys

    op_interface = tf.load_op_library('~/tensorflow/bazel-bin/tensorflow/core/user_ops/interface.so')

    with tf.device("/gpu:0"):
        with tf.Session() as sess:
            sess.run(op_interface.interface_gpu(12))

1 Answer

User

Answer #1 · Posted 2024-10-03 04:27:45

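This is expected, because you are trying to access a value that is stored on the GPU from the CPU (so that you can print it). The kernel is registered for DEVICE_GPU, so the buffer behind input_tensor lives in GPU memory, and input(0) dereferences that device address from host code. The first printf does not crash only because it never reads the tensor data itself.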

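The way to manipulate values on the GPU is through Eigen. If you look at the implementations of other kernels in TensorFlow, you will see code like output.flat<float32>().device(ctx->eigen_device<GPUDevice>()) = input.flat<float32>() + .... This tells Eigen to create a CUDA kernel for you.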

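If you want to work with the GPU-resident values directly from host code, you need to synchronize the GPU stream and copy them into CPU memory, which is fairly complicated.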
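The order of operations in that last option (wait for pending GPU work, copy into CPU memory, and only then read) can be illustrated with a CPU-only toy model. This is a sketch under loud assumptions: `FakeDeviceBuffer`, `enqueue_write`, `synchronize`, `copy_to_host` and `read_first_value` are hypothetical names invented for illustration and are not part of TensorFlow, Eigen or CUDA. A real kernel would use the stream synchronization and device-to-host copy facilities of its GPU runtime instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for a buffer in GPU memory. It is NOT a TensorFlow or
// CUDA type; it only models two rules: writes are queued asynchronously, and
// the host may read values only through an explicit copy.
class FakeDeviceBuffer {
 public:
  explicit FakeDeviceBuffer(std::vector<std::int32_t> values)
      : storage_(std::move(values)) {}

  // Models a GPU kernel launch: the write is queued, not applied yet.
  void enqueue_write(std::size_t index, std::int32_t value) {
    pending_.push_back({index, value});
  }

  // Models stream synchronization: apply all queued work before returning.
  void synchronize() {
    for (const PendingWrite& write : pending_) {
      storage_.at(write.index) = write.value;
    }
    pending_.clear();
  }

  // Models a device-to-host copy. Refuses to run while work is still queued.
  void copy_to_host(std::int32_t* dst, std::size_t count) const {
    assert(dst != nullptr);
    if (!pending_.empty()) {
      throw std::logic_error("device work still pending; synchronize first");
    }
    if (count > storage_.size()) {
      throw std::out_of_range("copy exceeds device buffer size");
    }
    std::copy_n(storage_.begin(), count, dst);
  }

 private:
  struct PendingWrite {
    std::size_t index;
    std::int32_t value;
  };
  std::vector<std::int32_t> storage_;
  std::vector<PendingWrite> pending_;
};

// Host-side read in the order the answer describes: synchronize, copy, then
// touch only the host copy.
std::int32_t read_first_value(FakeDeviceBuffer& device_input) {
  device_input.synchronize();
  std::int32_t host_value = 0;
  device_input.copy_to_host(&host_value, 1);
  return host_value;
}
```

In this model, calling `copy_to_host` while a write is still queued throws, which mirrors why the stream has to be synchronized first. Reading through `read_first_value` always sees the latest value because it only ever touches the host copy.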
