Train a custom COCO dataset using YOLOX

Posted on 2021-10-15. Edited on 2025-10-02. Category: Leveling Up (提高姿势水平)

yolox, funny

Environment

Windows 10
GPU: NVIDIA GeForce RTX 3090
Anaconda3
PyCharm

Create a virtual environment

Launch Anaconda Prompt

conda create -n yolox python=3.8
conda activate yolox

Create the project in PyCharm

Launch PyCharm
Create a new project and choose the interpreter from the conda environment created above
Create the project
Open the YOLOX directory downloaded from GitHub

Install PyTorch

In PyCharm's Terminal:

conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch -c conda-forge

Install YOLOX

cd YOLOX
pip install -U pip && pip install -r requirements.txt
pip install -v -e .  # or  python setup.py develop

Install pycocotools

pip install cython
pip install pycocotools

Install apex

git clone https://github.com/NVIDIA/apex
cd apex
python setup.py install
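
Before moving on, it can help to confirm that the packages installed above are visible to the interpreter of the yolox environment. The following is a minimal sketch using only the standard library; the module names in the list are assumed from the install steps above, and `check_packages` is a hypothetical helper, not part of YOLOX.

```python
import importlib.util


def check_packages(names):
    """Return a dict mapping each top-level module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Module names assumed from the install steps in this article.
    wanted = ["torch", "torchvision", "yolox", "pycocotools", "apex"]
    for name, found in check_packages(wanted).items():
        print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

Run it from PyCharm's Terminal with the yolox environment active. A module reported as MISSING usually means the corresponding install step ran in a different environment.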

Validate the installation

Download the pretrained model (yolox_s.pth) from the YOLOX GitHub repository and save it in a folder named weights.

Run the demo on the sample image:

python tools/demo.py image -n yolox-s -c weights/yolox_s.pth --path assets/dog.jpg --conf 0.25 --nms 0.45 --tsize 640 --save_result --device gpu

Train a custom COCO dataset using YOLOX

Build dataset

Use labelImg to label your own images, and export the label information in YOLO format (one .txt file per image).

Convert the YOLO txt labels to the COCO dataset format with the following script:

import os
import json
from PIL import Image

coco_format_save_path='C:\\Users\\11734\\Desktop\\YOLOX-main\\mydata\\val\\'                      # folder where the COCO-format annotation file is written
yolo_format_classes_path='C:\\Users\\11734\\Desktop\\YOLOX-main\\mydata\\yy.names'                   # class names file, one class per line
yolo_format_annotation_path='C:\\Users\\11734\\Desktop\\YOLOX-main\\mydata\\val\\labels\\'        # folder containing the YOLO-format label files
img_pathDir='C:\\Users\\11734\\Desktop\\YOLOX-main\\mydata\\val\\images\\'                        # folder containing the images

with open(yolo_format_classes_path,'r') as fr:                                                      # open and read the class names file
    lines1=fr.readlines()
categories=[]                                                                 # list of category entries
for j,label in enumerate(lines1):
    label=label.strip()
    categories.append({'id':j+1,'name':label,'supercategory':'None'})         # COCO category ids start at 1

write_json_context=dict()                                                      # top-level dictionary written to the .json file
write_json_context['info']= {'description': '', 'url': '', 'version': '', 'year': 2021, 'contributor': '', 'date_created': '2021-10-13'}
write_json_context['licenses']=[{'id':1,'name':None,'url':None}]
write_json_context['categories']=categories
write_json_context['images']=[]
write_json_context['annotations']=[]

# The code below fills in the 'images' and 'annotations' entries
imageFileList=os.listdir(img_pathDir)                                           # list every file name in the image folder
for i,imageFile in enumerate(imageFileList):
    imagePath = os.path.join(img_pathDir,imageFile)                             # full path of the image
    image = Image.open(imagePath)                                               # open the image to read its width and height
    W, H = image.size

    img_context={}                                                              # dictionary describing this image
    img_context['file_name']=imageFile
    img_context['height']=H
    img_context['width']=W
    img_context['date_captured']='2021-07-25'
    img_context['id']=i                                                         # id of this image
    img_context['license']=1
    img_context['coco_url']=''                                                  # the COCO field is 'coco_url' (the original had 'color_url')
    img_context['flickr_url']=''
    write_json_context['images'].append(img_context)                            # add this image to the 'images' list

    # Label file with the same base name as the image. The original sliced a
    # fixed 21 characters, which only works for one file-name length.
    txtFile=os.path.splitext(imageFile)[0]+'.txt'

    with open(os.path.join(yolo_format_annotation_path,txtFile),'r') as fr:
        lines=fr.readlines()                                                   # each line holds one bounding box of this image
    for j,line in enumerate(lines):
        if not line.strip():                                                    # skip blank lines, e.g. a trailing newline
            continue

        bbox_dict = {}                                                          # dictionary describing one bounding box

        class_id,x,y,w,h=line.strip().split()                                             # class id, normalized center x/y, normalized width/height
        class_id,x, y, w, h = int(class_id), float(x), float(y), float(w), float(h)       # convert the strings to int and float

        xmin=(x-w/2)*W                                                                    # normalized center format -> pixel corner coordinates
        ymin=(y-h/2)*H
        xmax=(x+w/2)*W
        ymax=(y+h/2)*H
        w=w*W
        h=h*H

        bbox_dict['id']=i*10000+j                                                         # unique annotation id (assumes fewer than 10000 boxes per image)
        bbox_dict['image_id']=i
        bbox_dict['category_id']=class_id+1                                               # YOLO class ids start at 0, the category ids above start at 1
        bbox_dict['iscrowd']=0
        height,width=abs(ymax-ymin),abs(xmax-xmin)
        bbox_dict['area']=height*width
        bbox_dict['bbox']=[xmin,ymin,w,h]                                                 # COCO bbox format: [xmin, ymin, width, height]
        bbox_dict['segmentation']=[[xmin,ymin,xmax,ymin,xmax,ymax,xmin,ymax]]
        write_json_context['annotations'].append(bbox_dict)                               # add this box to the 'annotations' list

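# The paths above point at the validation split, so the file is named accordingly.
# Run the script again with the training paths and 'instances_train2017.json'.
name = os.path.join(coco_format_save_path,'instances_val2017.json')
with open(name,'w') as fw:                                                                # write the dictionary to the .json file
    json.dump(write_json_context,fw,indent=2)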
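
The coordinate conversion in the middle of the script is the part that is easiest to get wrong, so here it is in isolation as a small sketch. YOLO labels store a normalized center point and size, while a COCO `bbox` is `[xmin, ymin, width, height]` in pixels. The function name `yolo_to_coco_bbox` is introduced here for illustration only.

```python
def yolo_to_coco_bbox(x_center, y_center, width, height, img_w, img_h):
    """Convert a normalized YOLO box to a COCO [xmin, ymin, w, h] box in pixels."""
    box_w = width * img_w
    box_h = height * img_h
    xmin = (x_center - width / 2) * img_w
    ymin = (y_center - height / 2) * img_h
    return [xmin, ymin, box_w, box_h]


# A box centred in a 640x480 image that covers half of each dimension.
print(yolo_to_coco_bbox(0.5, 0.5, 0.5, 0.5, 640, 480))
# -> [160.0, 120.0, 320.0, 240.0]
```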

Run the script once for the training set and once for the validation set, and name the results instances_train2017.json and instances_val2017.json. These files contain the image and label information in COCO dataset format.
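
A quick consistency check on the generated files can catch problems before training starts. The sketch below is not part of YOLOX. It assumes only the fields written by the conversion script (`images`, `categories`, `annotations`, `bbox`), and `find_annotation_problems` is a hypothetical helper name.

```python
def find_annotation_problems(coco):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = []
    image_ids = {img["id"] for img in coco.get("images", [])}
    category_ids = {cat["id"] for cat in coco.get("categories", [])}
    seen_ann_ids = set()
    for ann in coco.get("annotations", []):
        if ann["id"] in seen_ann_ids:
            problems.append(f"duplicate annotation id {ann['id']}")
        seen_ann_ids.add(ann["id"])
        if ann["image_id"] not in image_ids:
            problems.append(f"annotation {ann['id']}: unknown image_id {ann['image_id']}")
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id {ann['category_id']}")
        _, _, w, h = ann["bbox"]
        if w <= 0 or h <= 0:
            problems.append(f"annotation {ann['id']}: empty bbox {ann['bbox']}")
    return problems
```

To use it, load a file with `json.load` and print the returned list. An empty list means none of these particular checks failed, not that the labels themselves are correct.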

Build the custom COCO dataset inside the YOLOX directory, following the COCO naming convention for folders and annotation files:

Animals_Coco
   ├─annotations
   │    ├─instances_train2017.json
   │    └─instances_val2017.json
   ├─train2017
   └─val2017

Put the training images into train2017 and the validation images into val2017.
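
The layout above can be checked with a few lines of Python before training. This is a minimal sketch: the expected entries are taken from the directory tree in this article, and `missing_entries` is a hypothetical helper name.

```python
import os

# Paths relative to the dataset root, as shown in the directory tree above.
EXPECTED_ENTRIES = [
    os.path.join("annotations", "instances_train2017.json"),
    os.path.join("annotations", "instances_val2017.json"),
    "train2017",
    "val2017",
]


def missing_entries(dataset_root):
    """Return the expected files and folders that do not exist under dataset_root."""
    return [
        rel for rel in EXPECTED_ENTRIES
        if not os.path.exists(os.path.join(dataset_root, rel))
    ]
```

Calling `missing_entries("Animals_Coco")` from the directory that contains the dataset returns an empty list when the layout matches.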

Modify the class names

In yolox/data/datasets/coco_classes.py
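
In the YOLOX source, this file defines a tuple named `COCO_CLASSES`, and the custom class names replace its contents. As a sketch, the tuple can be generated from the same names file used by the conversion script so that the order stays consistent. The class names below are placeholders, and `format_coco_classes` is a hypothetical helper.

```python
def format_coco_classes(names):
    """Render class names as Python source for a COCO_CLASSES tuple."""
    cleaned = [name.strip() for name in names if name.strip()]
    lines = ["COCO_CLASSES = ("]
    lines += [f'    "{name}",' for name in cleaned]
    lines.append(")")
    return "\n".join(lines)


# Placeholder names; in practice pass the lines read from your own .names file.
print(format_coco_classes(["cat", "dog\n", ""]))
```

The printed text can be pasted over the existing definition. The order of the names must match the class ids used in the YOLO label files.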

Modify yolox_base.py

In yolox/exp/yolox_base.py

Modify yolox_s.py

In exps/example/custom/yolox_s.py
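
The exact edits depend on the YOLOX version. As far as I know, the experiment class exposes attributes such as `num_classes`, `data_dir`, `train_ann` and `val_ann`, but treat these names as assumptions and check them against your copy of the file. The sketch below only derives the values from the generated annotation data; `suggest_exp_settings` is a hypothetical helper and does not touch YOLOX itself.

```python
def suggest_exp_settings(coco, dataset_dir,
                         train_ann="instances_train2017.json",
                         val_ann="instances_val2017.json"):
    """Collect the dataset-dependent values to copy into the experiment file.

    The keys mirror attribute names assumed to exist on the YOLOX Exp class.
    """
    return {
        "data_dir": dataset_dir,
        "train_ann": train_ann,
        "val_ann": val_ann,
        "num_classes": len(coco["categories"]),
    }


# Example with two placeholder categories.
example = {"categories": [{"id": 1, "name": "cat"}, {"id": 2, "name": "dog"}]}
print(suggest_exp_settings(example, "Animals_Coco"))
```

The important value is `num_classes`: it has to equal the number of entries in the class names file, otherwise the detection head and the labels disagree.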

Modify train.py

In tools/train.py

import argparse

def make_parser():
    parser = argparse.ArgumentParser("YOLOX train parser")
    parser.add_argument("-expn", "--experiment-name", type=str, default="Animals_Coco")
    # default changed from None to the experiment name
    parser.add_argument("-n", "--name", type=str, default="yolox-s", help="model name")
    # default changed from None to the model name

    parser.add_argument("-b", "--batch-size", type=int, default=8, help="batch size")
    # batch size lowered from 64 to 8
    parser.add_argument("-d", "--devices", default=0, type=int, help="device for training")
    # default set to 0 since there is only one GPU

    parser.add_argument(
        "-f",
        "--exp_file",
        default="C:\\Users\\11734\\Desktop\\YOLOX\\exps\\example\\custom\\yolox_s.py",
        type=str,
    )
    # path to yolox_s.py

    parser.add_argument("-c", "--ckpt", default="C:\\Users\\11734\\Desktop\\YOLOX\\weights\\yolox_s.pth", type=str, help="checkpoint file")
    # path to the pretrained weights

    # ... the remaining arguments of the original make_parser() stay unchanged ...
    return parser

Train

Start training by running tools/train.py. Choose a batch size that fits in GPU memory; if training fails with an out-of-memory error, lower it.
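
Instead of editing the defaults in train.py, the same values can be passed on the command line, since the parser shown above already defines `-f`, `-d`, `-b`, `-c` and `-expn`. The sketch below only assembles such a command from the values used in this article; `build_train_command` is a hypothetical helper, and `shlex.join` uses POSIX-style quoting, which matters only if a path contains spaces.

```python
import shlex


def build_train_command(exp_file, ckpt, batch_size, devices, experiment_name=None):
    """Assemble a tools/train.py command line from the flags shown in make_parser()."""
    cmd = [
        "python", "tools/train.py",
        "-f", exp_file,
        "-d", str(devices),
        "-b", str(batch_size),
        "-c", ckpt,
    ]
    if experiment_name:
        cmd += ["-expn", experiment_name]
    return cmd


# Values taken from this article: batch size 8, devices 0.
print(shlex.join(build_train_command(
    "exps/example/custom/yolox_s.py", "weights/yolox_s.pth",
    batch_size=8, devices=0, experiment_name="Animals_Coco",
)))
```

Keeping the settings on the command line leaves tools/train.py untouched, which makes it easier to update YOLOX later.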

Test

python tools/demo.py video -n yolox-s -c tools/YOLOX_outputs/Animals_Coco/last_epoch_ckpt.pth --path qq.avi --conf 0.25 --nms 0.45 --tsize 640 --save_result --device gpu

Well done!