Converting an SSD object detection model to TFLite and quantizing it from float to uint8 for the EdgeTPU

Posted 2024-09-29 23:28:15


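I am having trouble converting an SSD object detection model to a uint8 TFLite model for the EdgeTPU.

I have searched various forums, Stack Overflow threads and GitHub issues, and as far as I can tell I am following the correct steps. Something must be wrong in my Jupyter notebook, because I cannot get the result I am after.

Below are the steps from my Jupyter notebook, with explanations. I think that makes things clearer.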

#!/usr/bin/env python
# coding: utf-8

Setup

This step clones the repository. If you have already done this once, you can skip it.

import os
import pathlib

# Clone the tensorflow models repository if it doesn't already exist
if "models" in pathlib.Path.cwd().parts:
  while "models" in pathlib.Path.cwd().parts:
    os.chdir('..')
elif not pathlib.Path('models').exists():
  !git clone --depth 1 https://github.com/tensorflow/models

Imports

Required step: this only performs the imports.

import matplotlib
import matplotlib.pyplot as plt
import pathlib
import os
import random
import io
import imageio
import glob
import scipy.misc
import numpy as np
from six import BytesIO
from PIL import Image, ImageDraw, ImageFont
from IPython.display import display, Javascript
from IPython.display import Image as IPyImage

import tensorflow as tf
import tensorflow_datasets as tfds


from object_detection.utils import label_map_util
from object_detection.utils import config_util
from object_detection.utils import visualization_utils as viz_utils
#from object_detection.utils import colab_utils
from object_detection.utils import config_util
from object_detection.builders import model_builder

%matplotlib inline

Download a TFLite-friendly model

SSD networks are recommended for TFLite. I downloaded the following object detection model, which works on 320x320 images.
# Download the checkpoint and put it into models/research/object_detection/test_data/

!wget http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.tar.gz
!tar -xf ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.tar.gz
!if [ -d "models/research/object_detection/test_data/checkpoint" ]; then rm -Rf models/research/object_detection/test_data/checkpoint; fi
!mkdir models/research/object_detection/test_data/checkpoint
!mv ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8/checkpoint models/research/object_detection/test_data/

List of strings used to add the correct label to each box

PATH_TO_LABELS = '/home/jose/codeWorkspace-2.4.1/tf_2.4.1/models/research/object_detection/data/mscoco_label_map.pbtxt'
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)

Export and run with TFLite

Model conversion

In this step I convert the pb saved model to .tflite.

!tflite_convert --saved_model_dir=/home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/saved_model --output_file=/home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/model.tflite

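Model quantization (from float to uint8)

Once the model is converted, I need to quantize it. The original model takes a float tensor as input. Because I want to run it on the Edge TPU, I need the input and output tensors to be uint8.

Generate the calibration dataset.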

def representative_dataset_gen():
    folder = "/home/jose/codeWorkspace-2.4.1/tf_2.4.1/images_ssd_mb2_2"
    image_size = 320
    raw_test_data = []

    files = glob.glob(folder+'/*.jpeg')
    for file in files:
        image = Image.open(file)
        image = image.convert("RGB")
        image = image.resize((image_size, image_size))
        # Normalize the pixel values to the range [-1, 1]
        image = (2.0 / 255.0) * np.float32(image) - 1.0
        #image = np.asarray(image).astype(np.float32)
        image = image[np.newaxis,:,:,:]
        raw_test_data.append(image)

    for data in raw_test_data:
        yield [data]

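(Do not run this.) This is the same step as above, but with random values.

If you have no dataset, you can also feed randomly generated values as if they were images. This is the code I used to do so: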
####THIS IS A RANDOM-GENERATED DATASET#### 
def representative_dataset_gen():
    for _ in range(320):
      data = np.random.rand(1, 320, 320, 3)
      yield [data.astype(np.float32)]

Call the model converter

converter = tf.lite.TFLiteConverter.from_saved_model('/home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/saved_model')
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8, tf.lite.OpsSet.SELECT_TF_OPS]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
converter.allow_custom_ops = True
converter.representative_dataset = representative_dataset_gen
tflite_model = converter.convert()

Warnings:

The conversion step returns these warnings:

WARNING:absl:For model inputs containing unsupported operations which cannot be quantized, the inference_input_type attribute will default to the original type.
WARNING:absl:For model outputs containing unsupported operations which cannot be quantized, the inference_output_type attribute will default to the original type.

This makes me think the conversion is not correct.

Save the model

# Write the quantized model to disk and report where it was saved
output_path = '/home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/model_full_integer_quant.tflite'
with open(output_path, 'wb') as w:
    w.write(tflite_model)
print("tflite convert complete! - {}".format(output_path))

Tests

Test 1: Get the TensorFlow version

I read that using the nightly build is recommended. In my case, the version is 2.6.0.

print(tf.version.VERSION)

Test 2: Get the input/output tensor details

interpreter = tf.lite.Interpreter(model_path="/home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/model_full_integer_quant.tflite")
interpreter.allocate_tensors()

print(interpreter.get_input_details())
print("@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@")
print(interpreter.get_output_details())

Test 2 results:

I get the following information:

[{'name': 'serving_default_input:0', 'index': 0, 'shape': array([ 1, 320, 320, 3], dtype=int32), 'shape_signature': array([ 1, 320, 320, 3], dtype=int32), 'dtype': <class 'numpy.uint8'>, 'quantization': (0.007843137718737125, 127), 'quantization_parameters': {'scales': array([0.00784314], dtype=float32), 'zero_points': array([127], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

[{'name': 'StatefulPartitionedCall:31', 'index': 377, 'shape': array([ 1, 10, 4], dtype=int32), 'shape_signature': array([ 1, 10, 4], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'StatefulPartitionedCall:32', 'index': 378, 'shape': array([ 1, 10], dtype=int32), 'shape_signature': array([ 1, 10], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'StatefulPartitionedCall:33', 'index': 379, 'shape': array([ 1, 10], dtype=int32), 'shape_signature': array([ 1, 10], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'StatefulPartitionedCall:34', 'index': 380, 'shape': array([1], dtype=int32), 'shape_signature': array([1], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]

So I think the model was not quantized properly.

Compile the generated model for the EdgeTPU
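The input details above report a uint8 input with quantization parameters (0.007843137718737125, 127). Under TFLite's affine quantization scheme, real_value = scale * (quantized_value - zero_point), so these parameters cover roughly the range [-1, 1], which matches the normalization used in the representative dataset. A minimal sketch of converting between the two representations, assuming only NumPy; the helper names `quantize` and `dequantize` are hypothetical and are not part of the TFLite API:

```python
import numpy as np

# Values copied from the 'quantization' field of get_input_details() above.
INPUT_SCALE = 0.007843137718737125
INPUT_ZERO_POINT = 127


def quantize(real, scale, zero_point, dtype=np.uint8):
    """Map float values to integers: q = round(real / scale) + zero_point."""
    info = np.iinfo(dtype)
    q = np.round(np.asarray(real, dtype=np.float32) / scale) + zero_point
    # Clip to the representable range before casting to the integer type.
    return np.clip(q, info.min, info.max).astype(dtype)


def dequantize(q, scale, zero_point):
    """Map integers back to floats: real = scale * (q - zero_point)."""
    return scale * (np.asarray(q, dtype=np.float32) - zero_point)


# Pixel values normalized to [-1, 1], as in representative_dataset_gen().
normalized = np.array([-1.0, 0.0, 1.0], dtype=np.float32)
as_uint8 = quantize(normalized, INPUT_SCALE, INPUT_ZERO_POINT)
round_trip = dequantize(as_uint8, INPUT_SCALE, INPUT_ZERO_POINT)
print(as_uint8, round_trip)
```

The same `dequantize` formula applies to any output tensor that reports a non-empty `quantization` entry; the outputs listed above report float32 with scale 0.0, so they would need no such step.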

!edgetpu_compiler -s /home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/model_full_integer_quant.tflite

jose@jose-VirtualBox:~/python-envs$ edgetpu_compiler -s /home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/model_full_integer_quant.tflite
Edge TPU Compiler version 15.0.340273435

Model compiled successfully in 1136 ms.

Input model: /home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/model_full_integer_quant.tflite
Input size: 3.70MiB
Output model: model_full_integer_quant_edgetpu.tflite
Output size: 4.21MiB
On-chip memory used for caching model parameters: 3.42MiB
On-chip memory remaining for caching model parameters: 4.31MiB
Off-chip memory used for streaming uncached model parameters: 0.00B
Number of Edge TPU subgraphs: 1
Total number of operations: 162
Operation log: model_full_integer_quant_edgetpu.log

Model successfully compiled but not all operations are supported by the Edge TPU. A percentage of the model will instead run on the CPU, which is slower. If possible, consider updating your model to use only operations supported by the Edge TPU. For details, visit g.co/coral/model-reqs.
Number of operations that will run on Edge TPU: 112
Number of operations that will run on CPU: 50

Operator            Count   Status

LOGISTIC            1       Operation is otherwise supported, but not mapped due to some unspecified limitation
DEPTHWISE_CONV_2D   14      More than one subgraph is not supported
DEPTHWISE_CONV_2D   37      Mapped to Edge TPU
QUANTIZE            1       Mapped to Edge TPU
QUANTIZE            4       Operation is otherwise supported, but not mapped due to some unspecified limitation
CONV_2D             58      Mapped to Edge TPU
CONV_2D             14      More than one subgraph is not supported
DEQUANTIZE          1       Operation is working on an unsupported data type
DEQUANTIZE          1       Operation is otherwise supported, but not mapped due to some unspecified limitation
CUSTOM              1       Operation is working on an unsupported data type
ADD                 2       More than one subgraph is not supported
ADD                 10      Mapped to Edge TPU
CONCATENATION       1       Operation is otherwise supported, but not mapped due to some unspecified limitation
CONCATENATION       1       More than one subgraph is not supported
RESHAPE             2       Operation is otherwise supported, but not mapped due to some unspecified limitation
RESHAPE             6       Mapped to Edge TPU
RESHAPE             4       More than one subgraph is not supported
PACK                4       Tensor has unsupported rank (up to 3 innermost dimensions mapped)
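The counts in the compiler log above can be tallied to see what share of the model actually runs on the Edge TPU. A small sketch using the numbers transcribed from that log; the variable names are only illustrative:

```python
# (operator, count, mapped_to_edge_tpu) rows transcribed from the compiler log.
rows = [
    ("LOGISTIC", 1, False),
    ("DEPTHWISE_CONV_2D", 14, False),
    ("DEPTHWISE_CONV_2D", 37, True),
    ("QUANTIZE", 1, True),
    ("QUANTIZE", 4, False),
    ("CONV_2D", 58, True),
    ("CONV_2D", 14, False),
    ("DEQUANTIZE", 1, False),
    ("DEQUANTIZE", 1, False),
    ("CUSTOM", 1, False),
    ("ADD", 2, False),
    ("ADD", 10, True),
    ("CONCATENATION", 1, False),
    ("CONCATENATION", 1, False),
    ("RESHAPE", 2, False),
    ("RESHAPE", 6, True),
    ("RESHAPE", 4, False),
    ("PACK", 4, False),
]

# Sum the operation counts by where the compiler placed them.
mapped = sum(count for _, count, on_tpu in rows if on_tpu)
on_cpu = sum(count for _, count, on_tpu in rows if not on_tpu)
total = mapped + on_cpu

print(f"Edge TPU: {mapped}/{total} ops ({100 * mapped / total:.1f}%)")
print(f"CPU:      {on_cpu}/{total} ops ({100 * on_cpu / total:.1f}%)")
```

The totals agree with the summary lines in the log: 112 operations on the Edge TPU and 50 on the CPU, out of 162.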

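The Jupyter notebook I prepared can be found at the following link: https://github.com/jagumiel/Artificial-Intelligence/blob/main/tensorflow-scripts/Step-by-step-explaining-problems.ipynb

Am I missing a step? Why is my conversion not working as expected?

Thank you very much in advance.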


1 answer
User
Answer 1 · posted 2024-09-29 23:28:15

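As @JaesungChung answered, the conversion process itself was done correctly.

My problem was in the application that runs the .tflite model. I had quantized the model output to uint8, so I have to rescale the values I get back to obtain correct results.

That is, I was getting 10 objects because I request every detected object with a score above 0.5. My results were not rescaled, so the score of a detected object could be as high as 104. I had to rescale that number by dividing it by 255.

The same thing happened when drawing my results, so I had to divide those values by the height and width.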
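The rescaling described above can be sketched as follows. This is an illustration only: the helper name `rescale_detections`, the example values, and the assumptions that raw scores arrive on a 0-255 scale and that boxes are [ymin, xmin, ymax, xmax] in pixel units are hypothetical, inferred from the description rather than taken from the original application.

```python
import numpy as np


def rescale_detections(raw_scores, raw_boxes, height, width, threshold=0.5):
    """Rescale raw scores to [0, 1] and boxes to normalized coordinates.

    Assumptions (hypothetical, for illustration only): raw_scores are on a
    0-255 scale and raw_boxes are [ymin, xmin, ymax, xmax] in pixels.
    """
    scores = np.asarray(raw_scores, dtype=np.float32) / 255.0
    boxes = np.asarray(raw_boxes, dtype=np.float32)
    # Divide y coordinates by the height and x coordinates by the width.
    boxes = boxes / np.array([height, width, height, width], dtype=np.float32)
    # Apply the score threshold only after rescaling.
    keep = scores > threshold
    return scores[keep], boxes[keep]


# Made-up example values; 104 is the raw score mentioned in the answer.
raw_scores = [104, 200]
raw_boxes = [[32, 64, 160, 320], [0, 0, 320, 320]]
scores, boxes = rescale_detections(raw_scores, raw_boxes, height=320, width=320)
print(scores, boxes)
```

With this scaling, a raw score of 104 becomes about 0.41 and falls below the 0.5 threshold, whereas comparing the unscaled value against 0.5 would let every detection through.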
