YOLOv5 Polygon Label Conversion: Automatically Converting All JSON Files to TXT Format [Detailed Walkthrough]

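Motivation

Most online tutorials on converting JSON to TXT are terse. Only after going through a lot of material and running the code piece by piece myself did I finally understand how the conversion works.

A Brief Introduction to Labelme

Annotating images with Labelme produces JSON files, whereas YOLOv5 reads its training labels from TXT files. The annotations therefore have to be converted with Python first.

Note: the Labelme desktop program used here is an open-source Python tool inspired by LabelMe, the image annotation tool developed at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL). It can be used to create custom annotation tasks or to annotate images directly.

After annotating an image in Labelme, the window looks like this:

Figure 1: Labelme annotation

This image yields a JSON file in the following format:

The fields in the file are:

'version': the version number

'shapes': holds the data YOLOv5 needs

'label': the class you assigned in Labelme

'points': the coordinates of the points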
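To make the field list concrete, here is a minimal sketch that parses a small hand-written JSON string shaped like a Labelme file. The label names, coordinates, version string and image size are made up for illustration; `imageWidth` and `imageHeight` are the two extra fields that the conversion code later in this post reads.

```python
import json

# A hand-written, hypothetical example shaped like a Labelme annotation file.
SAMPLE = """
{
  "version": "x.y.z",
  "shapes": [
    {"label": "car", "shape_type": "rectangle",
     "points": [[100.0, 50.0], [300.0, 250.0]]},
    {"label": "person", "shape_type": "polygon",
     "points": [[10.0, 20.0], [40.0, 25.0], [30.0, 80.0]]}
  ],
  "imageWidth": 400,
  "imageHeight": 500
}
"""


def summarize(data):
    """Return (label, shape_type, number_of_points) for every shape."""
    return [(s["label"], s["shape_type"], len(s["points"])) for s in data["shapes"]]


data = json.loads(SAMPLE)
summary = summarize(data)
for label, shape_type, n_points in summary:
    print(label, shape_type, n_points)
```

A rectangle is stored as two corner points, while a polygon stores one point per vertex, which is why the two shape types are handled separately during conversion.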

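The labels I use are the classes shown in Figure 1 (the example dictionary below has six entries); they are needed shortly for the JSON-to-TXT conversion.

Create a dictionary that maps these classes to ids so the JSON can be converted.

Example: name2id={'bike':0,'arrow':1,'crossline':2,'building':3,'car':4,'person':5}

A given image may not contain every one of these classes, so build the dictionary from the full set of classes used across your JSON files rather than from a single image.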
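If you would rather not assemble the dictionary by hand, one possible approach is to scan every JSON file and collect the labels that occur. The sketch below is only an illustration: `collect_labels` and `build_name2id` are hypothetical helper names, the UTF-8 encoding is an assumption about how your files were saved, and assigning ids in alphabetical order is an arbitrary choice.

```python
import json
from pathlib import Path


def collect_labels(json_dir):
    """Return the sorted label names used in all *.json files of json_dir."""
    labels = set()
    for path in Path(json_dir).glob("*.json"):
        # Encoding is an assumption; use whatever your files were saved with.
        with open(path, "r", encoding="utf-8") as f:
            data = json.load(f)
        for shape in data.get("shapes", []):
            labels.add(shape["label"])
    return sorted(labels)


def build_name2id(labels):
    """Assign consecutive ids (0, 1, 2, ...) in the order given."""
    return {name: idx for idx, name in enumerate(labels)}


# Usage with an in-memory list of labels:
print(build_name2id(["arrow", "bike", "car"]))
```

Whichever order you choose, keep it fixed for the whole dataset, since the ids written to the TXT files have to mean the same class everywhere.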

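Handling Polygon Labels

YOLOv5's detection labels describe rectangular boxes only, so each polygon has to be converted to a rectangle during preprocessing.

How it works: iterate over all the points of the label and record the largest and smallest x and y values (x_max, y_max, x_min, y_min). These four values define the rectangle that encloses the polygon.

The coordinates are then normalized.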
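The min/max idea can be sketched in a few lines. This is a compact illustration rather than the code used later in this post; `polygon_to_bbox` is a hypothetical name and the triangle is made-up data.

```python
def polygon_to_bbox(points):
    """Return (x_min, y_min, x_max, y_max) enclosing a list of [x, y] points."""
    if not points:
        raise ValueError("polygon has no points")
    xs = [float(p[0]) for p in points]
    ys = [float(p[1]) for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


triangle = [[10.0, 20.0], [40.0, 25.0], [30.0, 80.0]]
print(polygon_to_bbox(triangle))  # (10.0, 20.0, 40.0, 80.0)
```

The result is the smallest axis-aligned rectangle that contains every vertex, so any detail of the polygon's outline is lost and only its extent is kept.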

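The converted TXT format is as follows:

Each line describes one label. The first number is the class: for example, the first number on the first line is 4, and car is also 4 in my name2id, so that line refers to the 'car' label.

The second and third numbers on a line are the (x, y) coordinates of the center of the box.

The fourth and fifth numbers are the width and height of the label's box.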
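Reading a line back into named fields makes the meaning of the five numbers explicit. The sketch below assumes the layout described above; `YoloLabel` and `parse_yolo_line` are hypothetical names, and the sample line is made up.

```python
from typing import NamedTuple


class YoloLabel(NamedTuple):
    class_id: int
    x_center: float
    y_center: float
    width: float
    height: float


def parse_yolo_line(line):
    """Split one label line into class id, box center, width and height."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    return YoloLabel(int(parts[0]), *(float(v) for v in parts[1:]))


name2id = {'bike': 0, 'arrow': 1, 'crossline': 2, 'building': 3, 'car': 4, 'person': 5}
id2name = {idx: name for name, idx in name2id.items()}  # reverse lookup

label = parse_yolo_line("4 0.5 0.3 0.5 0.4")
print(id2name[label.class_id], label.x_center, label.y_center, label.width, label.height)
```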

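Code Implementation

Implementing the Polygon Conversion

# `i` is one entry of data['shapes'] whose shape_type is 'polygon'
x_max = float("-inf")
y_max = float("-inf")
x_min = float("inf")
y_min = float("inf")
for point in i['points']:
    x1 = float(point[0])
    y1 = float(point[1])
    if x_max < x1:
        x_max = x1
    if y_max < y1:
        y_max = y1
    if y_min > y1:
        y_min = y1
    if x_min > x1:
        x_min = x1
# Top-left and bottom-right corners of the enclosing rectangle
bb = (x_min, y_min, x_max, y_max)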

Part of the JSON-to-TXT conversion code is shown below:

import json
import os

# name2id and convert() are defined elsewhere in this post (see the complete code).
def decode_json(json_floder_path, txt_outer_path, json_name, is_convert=True):
    txt_name = os.path.join(txt_outer_path, json_name[:-5] + '.txt')
    json_path = os.path.join(json_floder_path, json_name)
    # gb2312 is the encoding I use; change it if your JSON files were saved differently
    with open(json_path, 'r', encoding='gb2312', errors='ignore') as jf:
        data = json.load(jf)
    img_w = data['imageWidth']
    img_h = data['imageHeight']
    with open(txt_name, 'w') as f:
        for i in data['shapes']:
            label_name = i['label']
            if label_name not in name2id:
                continue  # skip labels that have no class id
            if i['shape_type'] == 'polygon':
                # Enclosing rectangle of the polygon
                x_max = float("-inf")
                y_max = float("-inf")
                x_min = float("inf")
                y_min = float("inf")
                for point in i['points']:
                    x1 = float(point[0])
                    y1 = float(point[1])
                    if x_max < x1:
                        x_max = x1
                    if y_max < y1:
                        y_max = y1
                    if y_min > y1:
                        y_min = y1
                    if x_min > x1:
                        x_min = x1
                bb = (x_min, y_min, x_max, y_max)
            elif i['shape_type'] == 'rectangle':
                # A rectangle is stored as two opposite corners
                x1 = float(i['points'][0][0])
                y1 = float(i['points'][0][1])
                x2 = float(i['points'][1][0])
                y2 = float(i['points'][1][1])
                bb = (x1, y1, x2, y2)
            else:
                continue  # other shape types are not handled
            if is_convert:
                bbox = convert((img_w, img_h), bb)
            else:
                bbox = bb
            f.write(str(name2id[label_name]) + " " + " ".join([str(a) for a in bbox]) + '\n')

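json_floder_path: absolute path of the folder that contains the JSON files

txt_outer_path: absolute path of the folder where the TXT files are saved

json_name: file name of the JSON file

The function reads the coordinates of each label and then normalizes them.

The normalization code is as follows:

It scales the coordinates to the range 0 to 1.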

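def convert(img_size, box):
    # img_size is (width, height); box holds two opposite corners (x1, y1, x2, y2) in pixels
    dw = 1. / (img_size[0])
    dh = 1. / (img_size[1])
    # Center of the box
    x = (box[0] + box[2]) / 2.0
    y = (box[1] + box[3]) / 2.0
    # abs() is used because only the width and height are needed; it avoids negative values
    w = abs(box[2] - box[0])
    h = abs(box[3] - box[1])
    # Scale everything by the image size
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)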
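A worked example helps check the arithmetic. The sketch below also adds one step that is not part of the code in this post: clamping the corners to the image, on the assumption that a point dragged slightly past the image border could otherwise produce values outside the 0 to 1 range. `clamp_box` and `normalize_box` are hypothetical names, and the numbers are made up.

```python
def clamp_box(box, img_w, img_h):
    """Order the corners and clip them to the image rectangle."""
    x1, y1, x2, y2 = box
    left, right = sorted((x1, x2))
    top, bottom = sorted((y1, y2))
    left = min(max(left, 0.0), img_w)
    right = min(max(right, 0.0), img_w)
    top = min(max(top, 0.0), img_h)
    bottom = min(max(bottom, 0.0), img_h)
    return (left, top, right, bottom)


def normalize_box(box, img_w, img_h):
    """Return (x_center, y_center, width, height), each scaled to 0..1."""
    left, top, right, bottom = clamp_box(box, img_w, img_h)
    return (
        (left + right) / 2.0 / img_w,
        (top + bottom) / 2.0 / img_h,
        (right - left) / img_w,
        (bottom - top) / img_h,
    )


# A 400x500 image with a box from (100, 50) to (300, 250):
# center = (200, 150), size = (200, 200)
print(normalize_box((100, 50, 300, 250), 400, 500))
```

For this box the center is (200, 150) and the size is 200 by 200 pixels, so dividing by the image width 400 and height 500 gives roughly 0.5, 0.3, 0.5 and 0.4.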

Finally, here is my complete code:

import json
import os

name2id = {'bike': 0, 'arrow': 1, 'crossline': 2, 'building': 3, 'car': 4, 'person': 5}


def convert(img_size, box):
    # img_size is (width, height); box holds two opposite corners (x1, y1, x2, y2) in pixels
    dw = 1. / (img_size[0])
    dh = 1. / (img_size[1])
    x = (box[0] + box[2]) / 2.0
    y = (box[1] + box[3]) / 2.0
    w = abs(box[2] - box[0])
    h = abs(box[3] - box[1])
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)


def decode_json(json_floder_path, txt_outer_path, json_name, is_convert=True):
    # Ignore anything that is not a JSON file
    if not json_name.endswith(".json"):
        return
    base_name = os.path.splitext(os.path.basename(json_name))[0]
    txt_name = os.path.join(txt_outer_path, base_name + '.txt')

    json_path = os.path.join(json_floder_path, json_name)
    # gb2312 is the encoding I use; change it if your JSON files were saved differently
    with open(json_path, 'r', encoding='gb2312', errors='ignore') as jf:
        data = json.load(jf)
    img_w = data['imageWidth']
    img_h = data['imageHeight']

    with open(txt_name, 'w') as f:
        for i in data['shapes']:
            label_name = i['label']
            if label_name not in name2id:
                continue  # skip labels that have no class id
            if i['shape_type'] == 'polygon':
                # Enclosing rectangle of the polygon
                x_max = float("-inf")
                y_max = float("-inf")
                x_min = float("inf")
                y_min = float("inf")
                for point in i['points']:
                    x1 = float(point[0])
                    y1 = float(point[1])
                    if x_max < x1:
                        x_max = x1
                    if y_max < y1:
                        y_max = y1
                    if y_min > y1:
                        y_min = y1
                    if x_min > x1:
                        x_min = x1
                bb = (x_min, y_min, x_max, y_max)
            elif i['shape_type'] == 'rectangle':
                # A rectangle is stored as two opposite corners
                x1 = float(i['points'][0][0])
                y1 = float(i['points'][0][1])
                x2 = float(i['points'][1][0])
                y2 = float(i['points'][1][1])
                bb = (x1, y1, x2, y2)
            else:
                continue  # other shape types are not handled
            if is_convert:
                bbox = convert((img_w, img_h), bb)
            else:
                bbox = bb
            f.write(str(name2id[label_name]) + " " + " ".join([str(a) for a in bbox]) + '\n')


if __name__ == "__main__":
    json_floder_path = r'/home/xx/gitlab/demo/data1/'  # put the JSON files in this folder
    txt_outer_path = r'/home/xx/gitlab/demo/data1/'
    json_names = [name for name in os.listdir(json_floder_path) if name.endswith(".json")]
    print("{} files to convert".format(len(json_names)))
    flagcount = 0
    for json_name in json_names:
        # is_convert controls scaling: True writes coordinates scaled to 0-1, False keeps pixel coordinates
        decode_json(json_floder_path, txt_outer_path, json_name, is_convert=True)
        flagcount += 1
        print("{} files left to convert".format(len(json_names) - flagcount))
    print('All files converted')

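To check that the converted TXT files display correctly, you can run the code below. Before doing so, make sure the call in the code above reads

decode_json(json_floder_path, txt_outer_path, json_name, is_convert=False), that is, is_convert is set to False (the coordinates are not scaled to the 0-1 range), because the viewer expects pixel coordinates.

To display a single image, just call display_single_image(image_path, label_folder).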

import os
import cv2

# Label colors in BGR order; add more if needed (colors are reused when there are more classes)
label_colors = [(0, 0, 255), (0, 255, 0), (255, 0, 0), (0, 255, 255)]


def display_single_image(image_path, label_folder):
    # Read the image
    image = cv2.imread(image_path)
    if image is None:
        print(f"Could not read image {image_path}.")
        return

    # Find the matching label file
    label_file = os.path.splitext(os.path.basename(image_path))[0] + '.txt'
    label_path = os.path.join(label_folder, label_file)

    # Check that the label file exists
    if not os.path.exists(label_path):
        print(f"Label file {label_file} does not exist.")
        return

    with open(label_path, 'r') as file:
        lines = file.readlines()

    for line in lines:
        parts = line.strip().split()
        if len(parts) != 5:
            continue  # skip empty or malformed lines
        label_number = int(parts[0])
        coordinates = [float(x) for x in parts[1:]]
        # The labels must hold pixel corner coordinates (is_convert=False), not normalized values
        xmin, ymin, xmax, ymax = map(int, coordinates)
        color = label_colors[label_number % len(label_colors)]
        cv2.rectangle(image, (xmin, ymin), (xmax, ymax), color=color, thickness=2)

    # Show the result
    cv2.imshow("Single Image", image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()


def display_all_images(image_folder, label_folder):
    # List the files in the image folder
    image_files = os.listdir(image_folder)

    for image_file in image_files:
        image_path = os.path.join(image_folder, image_file)
        display_single_image(image_path, label_folder)


if __name__ == "__main__":
    # Paths of the image and label folders
    image_folder = "/home/zhangchang/gitlab/demo/images"
    label_folder = "/home/zhangchang/gitlab/demo/labels"

    # Show all images
    # display_all_images(image_folder, label_folder)

    # File name of the single image to show (test.jpg here)
    image_to_display = "test.jpg"
    image_path = os.path.join(image_folder, image_to_display)

    # Show a single image
    display_single_image(image_path, label_folder)
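The viewer above expects pixel corner coordinates. If your TXT files already hold normalized values, one alternative is to convert each box back to pixels before drawing. The sketch below shows only that conversion and leaves OpenCV out so it can run on its own; `yolo_to_corners` is a hypothetical helper name, and the image size would come from the loaded image (with OpenCV, `image.shape[:2]` gives height and width).

```python
def yolo_to_corners(x_center, y_center, width, height, img_w, img_h):
    """Convert a normalized (center, size) box to integer pixel corners."""
    xmin = round((x_center - width / 2.0) * img_w)
    ymin = round((y_center - height / 2.0) * img_h)
    xmax = round((x_center + width / 2.0) * img_w)
    ymax = round((y_center + height / 2.0) * img_h)
    return (xmin, ymin, xmax, ymax)


# A box centered at (0.5, 0.3) covering half the width and 0.4 of the height
# of a 400x500 image:
print(yolo_to_corners(0.5, 0.3, 0.5, 0.4, 400, 500))  # (100, 50, 300, 250)
```

The four returned values can be passed to the drawing call in place of the numbers read directly from the file.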

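os.listdir returns every entry in the folder, so it is simplest to keep the JSON files in a folder of their own.

Remember to change the parameters after if __name__ == "__main__" first!

json_floder_path: change it to the path of your own JSON folder

txt_outer_path: change it to the path of your own output folder

If you do not need the coordinates scaled, change is_convert=True to is_convert=False.

The functionality above has been integrated into the Data-Craft-works data conversion factory. Everyone is welcome to help build it!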
