YOLOv5: Splitting Your Own Dataset and Converting JSON Files (Rectangle Annotations) into YOLO-Readable TXT Files (Full Walkthrough, Set Up on a Server)

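1. Splitting the dataset
Randomly split the image dataset into training : validation : test = 6 : 2 : 2, and split the corresponding JSON annotation directory the same way. Without further ado, here is the code: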

import os
import random
import shutil

# Source data directories
images_dir = "images"
json_dir = "Annotations"

# Output directories
output_dir_images = "../../datasets/fish/images"
output_dir_json = "../../datasets/fish/annotations"

# Create the train, val, and test subdirectories (parent directories are created as needed)
subdirectories = ["train", "val", "test"]

for subdir in subdirectories:
    os.makedirs(os.path.join(output_dir_images, subdir), exist_ok=True)
    os.makedirs(os.path.join(output_dir_json, subdir), exist_ok=True)

# List the image files and the JSON files
image_files = [f for f in os.listdir(images_dir) if f.endswith(".jpg")]
json_files = [f for f in os.listdir(json_dir) if f.endswith(".json")]

# Shuffle the image order
random.shuffle(image_files)

# Compute the split sizes; the test set takes whatever remains after train and val
total_samples = len(image_files)
train_size = int(total_samples * 0.6)
val_size = int(total_samples * 0.2)

# Split the dataset
splits = {
    "train": image_files[:train_size],
    "val": image_files[train_size:train_size + val_size],
    "test": image_files[train_size + val_size:],
}

# Copy each image into its subdirectory (the originals stay in place)
for subdir, files in splits.items():
    for image_file in files:
        src_path_images = os.path.join(images_dir, image_file)
        dest_path_images = os.path.join(output_dir_images, subdir, image_file)
        shutil.copy(src_path_images, dest_path_images)

# Copy each JSON file into the subdirectory of the image with the same base name
split_of_image = {f: subdir for subdir, files in splits.items() for f in files}

for json_file in json_files:
    base_name = os.path.splitext(json_file)[0]
    subdir = split_of_image.get(f"{base_name}.jpg")
    if subdir is not None:
        src_path_json = os.path.join(json_dir, json_file)
        dest_path_json = os.path.join(output_dir_json, subdir, json_file)
        shutil.copy(src_path_json, dest_path_json)

print("Dataset split complete.")
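
The split sizes are computed with int(), which truncates, so the test set absorbs the remainder when the file count is not a multiple of 5. The sketch below isolates that arithmetic so it can be checked without touching any files. The helper name split_files and its seed parameter are illustrative additions, not part of the script above; the seed is there only to make a shuffle repeatable.

```python
import random


def split_files(files, train_ratio=0.6, val_ratio=0.2, seed=None):
    """Shuffle a copy of `files` and cut it into train/val/test lists.

    Hypothetical helper: it mirrors the int() truncation used in the
    script, so the test split receives whatever is left after train and val.
    """
    shuffled = list(files)
    # seed=None gives a different order on each run; an integer seed repeats it.
    random.Random(seed).shuffle(shuffled)
    train_size = int(len(shuffled) * train_ratio)
    val_size = int(len(shuffled) * val_ratio)
    return {
        "train": shuffled[:train_size],
        "val": shuffled[train_size:train_size + val_size],
        "test": shuffled[train_size + val_size:],
    }


if __name__ == "__main__":
    demo = split_files([f"{i}.jpg" for i in range(7)], seed=0)
    print({name: len(items) for name, items in demo.items()})
```

With 7 files the sizes come out as 4 / 1 / 2 rather than an exact 6 : 2 : 2, which is worth keeping in mind for very small datasets.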

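All paths are relative. If relative paths are unfamiliar, think of them as starting from the directory you run the script from, which here is the directory containing the script. For example, my images directory sits in the same directory as the split script, so the image directory is given simply by its name.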

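Concretely, the split script and the images directory both sit under the same directory, VOCData, so the input image directory is just images_dir = "images".
In the output paths, ../ uses .. (two dots) to mean the parent directory. I ran the split on a Linux server.
For reference, the layout used here is: VOCData holds the split script, images, and Annotations, and the output is written to datasets/fish, two levels above VOCData.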
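
To make the .. resolution concrete, here is a minimal sketch that joins the script's relative paths onto a working directory and collapses the parent-directory segments. The base directory /home/user/yolov5/VOCData is a made-up example, not the author's actual path; posixpath is used so the result is the same on any operating system.

```python
import posixpath


def resolve(base_dir, relative_path):
    """Join a relative path onto base_dir and collapse any '..' segments."""
    return posixpath.normpath(posixpath.join(base_dir, relative_path))


# Hypothetical working directory; substitute the directory you run the script from.
base = "/home/user/yolov5/VOCData"

print(resolve(base, "images"))
print(resolve(base, "../../datasets/fish/images"))
```

Under that assumed base, "images" stays inside VOCData, while each .. in "../../datasets/fish/images" climbs one level, so the output lands in /home/user/datasets/fish/images.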

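After running the script, both datasets (images and JSON) are split into train, val, and test subdirectories, and the files in corresponding subdirectories match one to one.
That completes the dataset split.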
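
The one-to-one correspondence can be checked rather than assumed, for example if some images were never annotated. The sketch below compares base names per split and lists any that appear on only one side. The helper name unmatched_stems is hypothetical, and it assumes .jpg images and .json annotations as in the script.

```python
import os


def unmatched_stems(images_root, json_root, splits=("train", "val", "test")):
    """Return, per split, the base names present on only one side.

    Hypothetical helper; assumes .jpg images and .json annotations,
    laid out as <root>/<split>/<file>.
    """
    report = {}
    for split in splits:
        image_stems = {
            os.path.splitext(name)[0]
            for name in os.listdir(os.path.join(images_root, split))
            if name.endswith(".jpg")
        }
        json_stems = {
            os.path.splitext(name)[0]
            for name in os.listdir(os.path.join(json_root, split))
            if name.endswith(".json")
        }
        # Symmetric difference: names missing their image or their annotation.
        report[split] = sorted(image_stems ^ json_stems)
    return report
```

Calling unmatched_stems("../../datasets/fish/images", "../../datasets/fish/annotations") should return an empty list for every split when the two trees line up.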
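2. Converting JSON to TXT
Since the previous step split the JSON folder into train, val, and test subdirectories, the next step is to convert the JSON files in each of these three subdirectories into TXT files. For this, see my previous article: Batch-converting multiple JSON files in a folder into YOLO-readable TXT files with Python.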
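
For readers without the previous article at hand, here is a rough sketch of what such a conversion can look like. It is not the code from that article. It assumes the JSON files follow the LabelMe layout (a "shapes" list whose rectangle entries hold two corner "points", plus "imageWidth" and "imageHeight"), and the class list CLASS_NAMES is a placeholder to replace with your own labels. Each YOLO label line has the form: class id, x center, y center, width, height, with the four box values normalized to the range 0 to 1.

```python
import json
import os

# Hypothetical class list; the index in this list becomes the YOLO class id.
CLASS_NAMES = ["fish"]


def labelme_to_yolo_lines(data, class_names=CLASS_NAMES):
    """Convert one LabelMe-style annotation dict to YOLO label lines.

    Assumes rectangle shapes whose "points" hold two opposite corners, and
    "imageWidth"/"imageHeight" fields giving the image size in pixels.
    """
    img_w, img_h = data["imageWidth"], data["imageHeight"]
    lines = []
    for shape in data["shapes"]:
        if shape.get("shape_type", "rectangle") != "rectangle":
            continue  # only rectangle annotations are handled in this sketch
        (x1, y1), (x2, y2) = shape["points"]
        x_center = (x1 + x2) / 2 / img_w
        y_center = (y1 + y2) / 2 / img_h
        width = abs(x2 - x1) / img_w
        height = abs(y2 - y1) / img_h
        class_id = class_names.index(shape["label"])
        lines.append(
            f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
        )
    return lines


def convert_dir(json_dir, txt_dir, class_names=CLASS_NAMES):
    """Write one .txt per .json in json_dir, keeping the same base name."""
    os.makedirs(txt_dir, exist_ok=True)
    for name in os.listdir(json_dir):
        if not name.endswith(".json"):
            continue
        with open(os.path.join(json_dir, name), encoding="utf-8") as f:
            data = json.load(f)
        txt_path = os.path.join(txt_dir, os.path.splitext(name)[0] + ".txt")
        with open(txt_path, "w", encoding="utf-8") as f:
            f.write("\n".join(labelme_to_yolo_lines(data, class_names)))
```

One way to use it is to call convert_dir once per split, for example convert_dir("../../datasets/fish/annotations/train", "../../datasets/fish/labels/train"). The labels target directory is an assumption here, chosen because YOLOv5 by default looks for label files in a labels directory parallel to images.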


