How do I convert JSON label files in a specific format to YOLO txt label files?

Posted 2024-09-27 01:26:25

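I received an image dataset in jpg format with labels in JSON format, and I ran into a problem when trying to train on it with darknet yolov4.

The labels in the JSON file look like this: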

"annotations": [
    {
        "image_id": 0,
        "file_name": "image_47010552850673.jpg",
        "objects": [
            {
                "object_id": 0,
                "class": "person",
                "position": [1480, 151, 1508, 169]
            },
            {
                "object_id": 1,
                "class": "car",
                "position": [792, 123, 843, 246]
            },
            {
                "object_id": 2,
                "class": "person",
                "position": [245, 667, 286, 695]
            }
        ]
    },
    {
        "image_id": 1,
        "file_name": "image_68475401035381.jpg",
        "objects": [
            {
                "object_id": 3,
                "class": "person",
                "position": [1090, 374, 1096, 389]
            },
            {
                "object_id": 4,
                "class": "car",
                "position": [1279, 620, 1346, 655]
            }
        ]
    }, ...

The format of position is as follows:

position = [xmin, ymin, xmax, ymax] <- pixel values

There are six classes in total, and the number assigned to each label class is:

car: 0, truck: 1, bus: 2, etc vehicle: 3, bike: 4, person: 5
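
The class numbering above could be held in a lookup table. This is only a minimal sketch: `CLASS_IDS` and `class_id` are hypothetical names, and it assumes the JSON spells the classes exactly as "car", "truck", "bus", "etc vehicle", "bike" and "person" (only "person" and "car" appear in the sample shown).

```python
# Class name -> YOLO class number, following the numbering listed above.
# The exact spellings of the keys are an assumption about the JSON contents.
CLASS_IDS = {
    "car": 0,
    "truck": 1,
    "bus": 2,
    "etc vehicle": 3,
    "bike": 4,
    "person": 5,
}


def class_id(name):
    """Return the YOLO class number, or None if the label is not in the table."""
    return CLASS_IDS.get(name)


print(class_id("person"))
print(class_id("car"))
```

Returning None for an unknown label makes it possible to skip or log unexpected class names instead of stopping the whole conversion.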

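The image size is 1920x1080.

There are about 100,000 images, about 36 GB in total.

All images are labeled in a single JSON file, which is about 124 MB.

I want to convert the JSON file above into normalized YOLO-format text files.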

Example:

File name: image_name.txt

Contents:

class_number normalized_center_x normalized_center_y normalized_width normalized_height

where

normalized_center_x = (xmin+xmax)÷2÷x_sizeof_image

normalized_center_y = (ymin+ymax)÷2÷y_sizeof_image

normalized_width = (xmax-xmin)÷x_sizeof_image

normalized_height = (ymax-ymin)÷y_sizeof_image

In my case:

normalized_center_x = (position[0]+position[2])÷2÷1920

normalized_center_y = (position[1]+position[3])÷2÷1080

normalized_width = (position[2]-position[0])÷1920

normalized_height = (position[3]-position[1])÷1080
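
The formulas above can be sketched in Python as follows. `to_yolo_box` is a hypothetical helper name, and the 1920x1080 defaults are the image size stated in this question; the sample box is the first "person" object from the JSON excerpt.

```python
def to_yolo_box(position, img_w=1920, img_h=1080):
    """Convert [xmin, ymin, xmax, ymax] in pixels to normalized (cx, cy, w, h)."""
    xmin, ymin, xmax, ymax = position
    cx = (xmin + xmax) / 2 / img_w
    cy = (ymin + ymax) / 2 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return cx, cy, w, h


# First "person" object of image_47010552850673.jpg in the sample JSON.
cx, cy, w, h = to_yolo_box([1480, 151, 1508, 169])
print(f"{cx:.9f} {cy:.9f} {w:.9f} {h:.9f}")
```

Rounded to nine decimals, this gives 0.778125 for the center x and 0.148148148 for the center y of that box.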

For the JSON above, the actual darknet yolov4 txt files should look like this:

File name: image_47010552850673.txt

Contents:

5 0.778125 0.148148148 0.014583333 0.016666667

0 0.42578125 0.170833333 0.0265625 0.113888889

5 0.13828125 0.630555556 0.021354167 0.025925926

File name: image_68475401035381.txt

Contents:

5 0.569270833 0.353240741 0.003125 0.013888889

0 0.68359375 0.590277778 0.034895833 0.032407407

How can I implement this in Python?


1 answer
User
#1 · Posted 2024-09-27 01:26:25

I ended up answering my own question:

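import json
import os

# Class names in YOLO index order: car=0, truck=1, bus=2, etc vehicle=3, bike=4, person=5
classes = ["car", "truck", "bus", "etc vehicle", "bike", "person"]


def convert(size, box):
    """Convert a pixel box [xmin, ymin, xmax, ymax] to normalized (x_center, y_center, w, h)."""
    dw = 1. / size[0]
    dh = 1. / size[1]
    x = (box[0] + box[2]) * dw / 2
    y = (box[1] + box[3]) * dh / 2
    w = (box[2] - box[0]) * dw
    h = (box[3] - box[1]) * dh
    return (x, y, w, h)


def convert_annotation():
    with open('labels.json', 'r') as f:
        datas = json.load(f)
    data = datas["annotations"]
    # Every image in this dataset is 1920x1080
    width = 1920
    height = 1080
    # Create the output directory if it does not exist yet
    os.makedirs('./darknet2', exist_ok=True)
    for item1 in data:
        # Drop the image extension to get the label file's base name
        base_name = os.path.splitext(item1["file_name"])[0]
        objects = item1["objects"]
        # 'w' instead of 'a+' so that re-running the script does not append duplicate labels;
        # this assumes each file_name appears only once in the JSON
        with open('./darknet2/%s.txt' % base_name, 'w') as outfile:
            for item2 in objects:
                cls_id = classes.index(item2["class"])
                bb = convert((width, height), item2["position"])
                outfile.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')


if __name__ == '__main__':
    convert_annotation()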
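
One way to sanity-check the generated files is to convert a label line back to pixel coordinates and compare it with the original position. The following is only a sketch under the same 1920x1080 assumption; `yolo_line_to_pixels` is a hypothetical helper and not part of darknet.

```python
def yolo_line_to_pixels(line, img_w=1920, img_h=1080):
    """Parse 'cls cx cy w h' and return (cls_id, [xmin, ymin, xmax, ymax]) in pixels."""
    cls_id, cx, cy, w, h = line.split()
    cx, cy, w, h = float(cx), float(cy), float(w), float(h)
    # For a box that lies inside the image, every normalized value is in [0, 1].
    if not all(0.0 <= v <= 1.0 for v in (cx, cy, w, h)):
        raise ValueError(f"value out of range in line: {line!r}")
    xmin = (cx - w / 2) * img_w
    ymin = (cy - h / 2) * img_h
    xmax = (cx + w / 2) * img_w
    ymax = (cy + h / 2) * img_h
    # Round to the nearest pixel to undo the loss from the truncated decimals.
    return int(cls_id), [round(v) for v in (xmin, ymin, xmax, ymax)]


print(yolo_line_to_pixels("5 0.778125 0.148148148 0.014583333 0.016666667"))
```

For the first line of image_47010552850673.txt this recovers class 5 and the box [1480, 151, 1508, 169] from the sample JSON.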
