TensorFlow equivalent of PyTorch's transforms.Normalize()

Posted 2024-09-30 20:35:24


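I am trying to run inference on a TFLite model that was originally built in PyTorch. I have been following the PyTorch implementation and need to preprocess the image along its RGB channels. The closest TensorFlow equivalent of transforms.Normalize() that I have found is tf.image.per_image_standardization() (documentation). Although it is a fairly close match, tf.image.per_image_standardization() computes a single mean and std across all channels and applies them to every channel. Here is its full implementation (from here):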

def per_image_standardization(image):
  """Linearly scales `image` to have zero mean and unit norm.
  This op computes `(x - mean) / adjusted_stddev`, where `mean` is the average
  of all values in image, and
  `adjusted_stddev = max(stddev, 1.0/sqrt(image.NumElements()))`.
  `stddev` is the standard deviation of all values in `image`. It is capped
  away from zero to protect against division by 0 when handling uniform images.
  Args:
    image: 3-D tensor of shape `[height, width, channels]`.
  Returns:
    The standardized image with same shape as `image`.
  Raises:
    ValueError: if the shape of 'image' is incompatible with this function.
  """
  image = ops.convert_to_tensor(image, name='image')
  _Check3DImage(image, require_static=False)
  num_pixels = math_ops.reduce_prod(array_ops.shape(image))

  image = math_ops.cast(image, dtype=dtypes.float32)
  image_mean = math_ops.reduce_mean(image)

  variance = (math_ops.reduce_mean(math_ops.square(image)) -
              math_ops.square(image_mean))
  variance = gen_nn_ops.relu(variance)
  stddev = math_ops.sqrt(variance)

  # Apply a minimum normalization that protects us against uniform images.
  min_stddev = math_ops.rsqrt(math_ops.cast(num_pixels, dtypes.float32))
  pixel_value_scale = math_ops.maximum(stddev, min_stddev)
  pixel_value_offset = image_mean

  image = math_ops.subtract(image, pixel_value_offset)
  image = math_ops.div(image, pixel_value_scale)
  return image

PyTorch's transforms.Normalize(), on the other hand, lets us specify the mean and std to apply to each channel, as shown below:

# transformation
pose_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

What is the way to get this functionality in TensorFlow 2.x?

Edit: I put together a quick hack that seems to solve the problem by defining a function like this:

def normalize_image(image, mean, std):
    for channel in range(3):
        image[:,:,channel] = (image[:,:,channel] - mean[channel])/std[channel]
    
    return image

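I am not sure whether this is efficient, but it seems to get the job done. I still need to convert the output to a tensor before feeding it to the model.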


1 answer

Answer 1 · posted 2024-09-30 20:35:24

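The workaround you mention seems fine. However, using a for loop to normalize each RGB channel of a single image can become a bottleneck when you process a large dataset in a data pipeline (a generator or tf.data). It does work, though. Below is a demonstration of your approach, followed by two possible alternatives that may work for you just as easily.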

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
from matplotlib.pyplot import imshow, subplot, title, hist

# load image (RGB)
img = Image.open('/content/9.jpg')

def normalize_image(image, mean, std):
    for channel in range(3):
        image[:,:,channel] = (image[:,:,channel] - mean[channel]) / std[channel]
    return image

OP_approach = normalize_image(np.array(img) / 255.0, 
                            mean=[0.485, 0.456, 0.406], 
                            std=[0.229, 0.224, 0.225])

Now let's look at the properties of the image after the transformation:

plt.figure(figsize=(25,10))
subplot(121); imshow(OP_approach); title(f'Normalized Image \n min-px: \
    {OP_approach.min()} \n max-pix: {OP_approach.max()}')
subplot(122); hist(OP_approach.ravel(), bins=50, density=True); \
    title('Histogram - pixel distribution')


The minimum and maximum pixel values after normalization are (-2.1179039301310043, 2.6399999999999997).
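The per-channel loop can also be replaced by NumPy broadcasting. The following is a minimal sketch, not part of the original answer; the name `normalize_image_vectorized` and the random sample image are made up for illustration. It assumes an image in height x width x channels layout that has already been scaled to [0, 1].

```python
import numpy as np

def normalize_image_vectorized(image, mean, std):
    """Per-channel normalization of an HWC image using broadcasting."""
    image = np.asarray(image, dtype=np.float64)
    mean = np.asarray(mean, dtype=np.float64)  # shape (3,) broadcasts over H and W
    std = np.asarray(std, dtype=np.float64)
    # Builds a new array; the input is not modified in place.
    return (image - mean) / std

mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

# Hypothetical stand-in for a real image: random values in [0, 1).
rng = np.random.default_rng(0)
sample = rng.random((4, 5, 3))

vectorized = normalize_image_vectorized(sample, mean, std)

# Reference result from the loop-based approach, computed on a copy.
looped = sample.copy()
for channel in range(3):
    looped[:, :, channel] = (looped[:, :, channel] - mean[channel]) / std[channel]
```

Unlike the loop version, this one leaves the array passed in unchanged, which matters if the raw image is reused later.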

Option 2

We can use the tf.keras Normalization preprocessing layer to do the same thing. It takes two important arguments, mean and variance (the square of std).

# Assumes TF 2.6+; older 2.x releases expose this layer under
# tensorflow.keras.layers.experimental.preprocessing instead.
from tensorflow.keras.layers import Normalization

input_data = np.array(img)/255
layer = Normalization(mean=[0.485, 0.456, 0.406], 
                      variance=[np.square(0.229), 
                                np.square(0.224), 
                                np.square(0.225)])

plt.figure(figsize=(25,10))
subplot(121); imshow(layer(input_data).numpy()); title(f'Normalized Image \n min-px: \
   {layer(input_data).numpy().min()} \n max-pix: {layer(input_data).numpy().max()}')
subplot(122); hist(layer(input_data).numpy().ravel(), bins=50, density=True);\
   title('Histogram - pixel distribution')


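The minimum and maximum pixel values after normalization are (-2.0357144, 2.64). The minimum reported here equals -0.456/0.224, the green-channel floor, which is what a red-channel std of 0.299 rather than 0.229 produces. With 0.229 the minimum would be -0.485/0.229 ≈ -2.1179, matching the other per-channel results.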
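For a tf.data pipeline, the same arithmetic can be written with plain tensor broadcasting and passed to `Dataset.map`. This is only a sketch under assumptions that are not in the original answer: the function name `normalize_tf` is made up, the input is assumed to be uint8 images in height x width x channels layout, and the random array stands in for real data.

```python
import numpy as np
import tensorflow as tf

IMAGENET_MEAN = tf.constant([0.485, 0.456, 0.406], dtype=tf.float32)
IMAGENET_STD = tf.constant([0.229, 0.224, 0.225], dtype=tf.float32)

def normalize_tf(image):
    """uint8 HWC image -> float32 in [0, 1] -> per-channel normalized."""
    image = tf.cast(image, tf.float32) / 255.0
    # The (3,) constants broadcast over the height and width axes.
    return (image - IMAGENET_MEAN) / IMAGENET_STD

# Hypothetical stand-in data: four random 8x8 RGB images.
images = np.random.randint(0, 256, size=(4, 8, 8, 3), dtype=np.uint8)

dataset = (
    tf.data.Dataset.from_tensor_slices(images)
    .map(normalize_tf)
    .batch(2)
)
first_batch = next(iter(dataset))
```

Note that transforms.ToTensor() returns channels-first tensors, so if the converted TFLite model expects that layout, the output would additionally need a transpose (for example with tf.transpose) before inference. Whether that applies depends on how the model was converted.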

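Option 3

This one is cruder: subtract the average of the three channel means (0.449) and divide by the average of the three channel stds (0.226).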

norm_img = ((tf.cast(np.array(img), tf.float32) / 255.0) - 0.449) / 0.226

plt.figure(figsize=(25,10))
subplot(121); imshow(norm_img.numpy()); title(f'Normalized Image \n min-px: \
{norm_img.numpy().min()} \n max-pix: {norm_img.numpy().max()}')
subplot(122); hist(norm_img.numpy().ravel(), bins=50, density=True); \
title('Histogram - pixel distribution')


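The minimum and maximum pixel values after normalization are (-1.9867257, 2.4380531). Finally, if we compare with the PyTorch approach, there is not much difference between these methods: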

import torchvision.transforms as transforms

transform_norm = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                            std=[0.229, 0.224, 0.225]),
])
norm_pt = transform_norm(img)

plt.figure(figsize=(25,10))
subplot(121); imshow(np.array(norm_pt).transpose(1, 2, 0));\
  title(f'Normalized Image \n min-px: \
  {np.array(norm_pt).min()} \n max-pix: {np.array(norm_pt).max()}')
subplot(122); hist(np.array(norm_pt).ravel(), bins=50, density=True); \
  title('Histogram - pixel distribution')


The minimum and maximum pixel values after normalization are (-2.117904, 2.64).
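To put a number on how far the scalar shortcut of Option 3 drifts from true per-channel normalization, one can evaluate both on the extreme pixel values. This is a small sketch added for illustration; the variable names are made up. Because both transforms are linear in the pixel value, the largest gap within each channel occurs at 0 or 1.

```python
import numpy as np

mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])

# Two single-pixel "images": all-black and all-white, shape (2, 1, 3).
extremes = np.array([[[0.0, 0.0, 0.0]],
                     [[1.0, 1.0, 1.0]]])

# Per-channel normalization, as in transforms.Normalize.
per_channel = (extremes - mean) / std

# Scalar shortcut: average mean (0.449) and average std (0.226).
scalar = (extremes - mean.mean()) / std.mean()

# Largest absolute disagreement between the two, in normalized units.
max_abs_diff = np.abs(per_channel - scalar).max()
print(per_channel.min(), per_channel.max())
print(scalar.min(), scalar.max())
print(max_abs_diff)
```

The gap comes out at roughly 0.2 in normalized units, largest on the blue channel for white pixels. Whether that is negligible depends on the model; a network trained with per-channel statistics is safest fed with per-channel statistics.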
