Python TensorFlow: creating a TFRecord from multiple array features

Posted 2024-06-28 05:18:38


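I followed the TensorFlow docs to generate a TFRecord from three NumPy arrays, but I get an error when I try to serialize the data. I want the resulting TFRecord to contain three features.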

import numpy as np
import tensorflow as tf
# some random data
x = np.random.randn(85)
y = np.random.randn(85,2128)
z = np.random.choice(range(10),(85,155))

def _float_feature(value):
    """Returns a float_list from a float / double."""
    return tf.train.Feature(float_list=tf.train.FloatList(value=[value]))

def _int64_feature(value):
    """Returns an int64_list from a bool / enum / int / uint."""
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def serialize_example(feature0, feature1, feature2):
    """
    Creates a tf.Example message ready to be written to a file.
    """
    # Create a dictionary mapping the feature name to the tf.Example-compatible
    # data type.
    feature = {
      'feature0': _float_feature(feature0),
      'feature1': _float_feature(feature1),
      'feature2': _int64_feature(feature2)
    }
    # Create a Features message using tf.train.Example.
    example_proto = tf.train.Example(features=tf.train.Features(feature=feature))
    return example_proto.SerializeToString()

features_dataset = tf.data.Dataset.from_tensor_slices((x, y, z))

features_dataset

<TensorSliceDataset shapes: ((), (2128,), (155,)), types: (tf.float64, tf.float32, tf.int64)>

for f0,f1,f2 in features_dataset.take(1):
    print(f0)
    print(f1)
    print(f2)

def tf_serialize_example(f0,f1,f2):
  tf_string = tf.py_function(
    serialize_example,
    (f0,f1,f2),  # pass these args to the above function.
    tf.string)      # the return type is `tf.string`.
  return tf.reshape(tf_string, ()) # The result is a scalar

However, when I try to run tf_serialize_example(f0,f1,f2), I get an error:

InvalidArgumentError: TypeError: <tf.Tensor: shape=(2128,), dtype=float32, numpy=
array([-0.5435242 ,  0.97947884, -0.74457455, ...,  has type tensorflow.python.framework.ops.EagerTensor, but expected one of: int, long, float
Traceback (most recent call last):

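I think the reason is that my features are arrays rather than single numbers. How can I make this code work for features that are arrays rather than scalars?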


1 answer
User
#1 · Posted 2024-06-28 05:18:38

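OK, I finally had time to take a closer look. I noticed that your use of features_dataset and tf_serialize_example comes from the tutorial on the TensorFlow web page. I don't know what the advantage of that approach is, or how to fix the problem within it.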
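
For reference, one way the tutorial-style approach could probably be made to work is sketched here. It assumes TensorFlow 2.x with eager execution. Inside tf.py_function the arguments arrive as EagerTensors, so the sketch converts them with .numpy() and flattens them before building the protobuf lists. The array widths are reduced from the question's sizes just to keep the example quick.

```python
import numpy as np
import tensorflow as tf

# Random data shaped like the question's, with fewer columns (assumption).
x = np.random.randn(85)
y = np.random.randn(85, 16)
z = np.random.choice(range(10), (85, 8))

def _float_feature(value):
    """Returns a float_list from a scalar or an array of floats."""
    # np.ravel turns scalars and arrays alike into a flat 1-D array;
    # tolist() yields plain Python floats, which protobuf accepts.
    return tf.train.Feature(
        float_list=tf.train.FloatList(value=np.ravel(value).tolist()))

def _int64_feature(value):
    """Returns an int64_list from a scalar or an array of ints."""
    return tf.train.Feature(
        int64_list=tf.train.Int64List(value=np.ravel(value).tolist()))

def serialize_example(feature0, feature1, feature2):
    """Serializes one row; called by tf.py_function with EagerTensors."""
    feature = {
        'feature0': _float_feature(feature0.numpy()),
        'feature1': _float_feature(feature1.numpy()),
        'feature2': _int64_feature(feature2.numpy()),
    }
    example_proto = tf.train.Example(
        features=tf.train.Features(feature=feature))
    return example_proto.SerializeToString()

def tf_serialize_example(f0, f1, f2):
    tf_string = tf.py_function(
        serialize_example,
        (f0, f1, f2),   # pass these args to the function above
        tf.string)      # the return type is tf.string
    return tf.reshape(tf_string, ())  # the result is a scalar

features_dataset = tf.data.Dataset.from_tensor_slices((x, y, z))
serialized_dataset = features_dataset.map(tf_serialize_example)

# Collect the serialized rows as Python bytes objects.
serialized = list(serialized_dataset.as_numpy_iterator())
```

Each element of serialized is one serialized tf.train.Example, so the original error about EagerTensor should no longer appear, since the protobuf lists only ever see plain Python numbers.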

But here is a workflow that should work for your code (I reopened the generated tfrecords file, and it is fine):

import numpy as np
import tensorflow as tf

# some random data
x = np.random.randn(85)
y = np.random.randn(85,2128)
z = np.random.choice(range(10),(85,155))

def _float_feature(value):
    """Returns a float_list from a NumPy array of floats (flattened to 1-D)."""
    return tf.train.Feature(float_list=tf.train.FloatList(value=value.flatten()))

def _int64_feature(value):
    """Returns an int64_list from a NumPy array of ints (flattened to 1-D)."""
    return tf.train.Feature(int64_list=tf.train.Int64List(value=value.flatten()))

def serialize_example(feature0, feature1, feature2):
    """
    Creates a tf.Example message ready to be written to a file.
    """
    # Create a dictionary mapping the feature name to the tf.Example-compatible
    # data type.
    feature = {
      'feature0': _float_feature(feature0),
      'feature1': _float_feature(feature1),
      'feature2': _int64_feature(feature2)
    }
    # Create a Features message using tf.train.Example.
    return tf.train.Example(features=tf.train.Features(feature=feature))


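# tf.python_io was removed in TensorFlow 2.x; tf.io.TFRecordWriter replaces it.
# Note: this writes all 85 rows into a single Example.
writer = tf.io.TFRecordWriter('TEST.tfrecords')
example = serialize_example(x,y,z)
writer.write(example.SerializeToString())
writer.close()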

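The main difference in this code is that you feed NumPy arrays, rather than TensorFlow tensors, to serialize_example, and the helper functions pass the flattened array as value instead of wrapping it in a list. Hope this helps.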
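
Note that the code above stores all 85 rows in one tf.train.Example. If one record per row is wanted instead, a minimal sketch might look like the following. It assumes TensorFlow 2.x. The file name rows.tfrecords, the helper names, and the reduced array widths are made up for illustration. It also reads the file back to check the contents.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Random data shaped like the question's, with fewer columns (assumption).
x = np.random.randn(85)
y = np.random.randn(85, 16)
z = np.random.choice(range(10), (85, 8))

def float_feature(values):
    """Wraps a scalar or array of floats in a FloatList feature."""
    flat = np.asarray(values, dtype=np.float64).ravel().tolist()
    return tf.train.Feature(float_list=tf.train.FloatList(value=flat))

def int64_feature(values):
    """Wraps a scalar or array of ints in an Int64List feature."""
    flat = np.asarray(values, dtype=np.int64).ravel().tolist()
    return tf.train.Feature(int64_list=tf.train.Int64List(value=flat))

def serialize_row(f0, f1, f2):
    """Builds one serialized tf.train.Example from one row of each array."""
    feature = {
        'feature0': float_feature(f0),
        'feature1': float_feature(f1),
        'feature2': int64_feature(f2),
    }
    example = tf.train.Example(features=tf.train.Features(feature=feature))
    return example.SerializeToString()

# Hypothetical output location in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'rows.tfrecords')

# Write one record per row.
with tf.io.TFRecordWriter(path) as writer:
    for f0, f1, f2 in zip(x, y, z):
        writer.write(serialize_row(f0, f1, f2))

# Read the file back; the fixed lengths must match what was written.
feature_spec = {
    'feature0': tf.io.FixedLenFeature([], tf.float32),
    'feature1': tf.io.FixedLenFeature([y.shape[1]], tf.float32),
    'feature2': tf.io.FixedLenFeature([z.shape[1]], tf.int64),
}
dataset = tf.data.TFRecordDataset(path).map(
    lambda record: tf.io.parse_single_example(record, feature_spec))
parsed = list(dataset.as_numpy_iterator())
```

Keep in mind that FloatList stores 32-bit floats, so the float64 values from np.random.randn come back as float32 after the round trip.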
