yolov5 Detection Box Displays Chinese Labels#
Note: This guide applies to version 5.0 of yolov5.#
1. Dataset with Chinese Labels
2. Modify yolov5 Code to Support Chinese Labels
Introduction#
Many people find that a trained yolov5 object detection model only displays English labels on its detection boxes (the stock plotting code draws text with OpenCV's cv2.putText, which cannot render Chinese characters). How can we train a model that detects objects and displays Chinese labels? Let's go through it step by step.
1. Dataset with Chinese Labels#
Many public datasets have English labels and are annotated in VOC format, with the annotations stored as XML files. The XML format is easy to read: it shows the image size, the box positions, and the annotation categories directly. Although yolov5 trains on a YOLO-format dataset, annotating in VOC format is more convenient and clearer, and code is available for one-click conversion before training the yolov5 model. Code blog: Object Detection - Dataset Format Conversion and Training/Validation Set Division.
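To make the XML structure concrete, here is a minimal sketch that reads the image size, box positions, and categories from one VOC annotation. The sample XML is hand-written for illustration (the file name and numbers are made up), and `read_voc` is a hypothetical helper name, not part of any referenced blog code.

```python
import xml.etree.ElementTree as ET

# A hand-written sample annotation; file name, size, and box are made up.
SAMPLE_XML = """<annotation>
    <filename>0001.jpg</filename>
    <size><width>640</width><height>480</height><depth>3</depth></size>
    <object>
        <name>ball</name>
        <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>200</xmax><ymax>220</ymax></bndbox>
    </object>
</annotation>"""


def read_voc(xml_text):
    """Return (width, height) and a list of (label, xmin, ymin, xmax, ymax)."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.find("width").text)
    height = int(size.find("height").text)
    objects = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = tuple(int(box.find(k).text) for k in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((obj.find("name").text,) + coords)
    return (width, height), objects


size, objects = read_voc(SAMPLE_XML)
print(size, objects)
```

The `name` element inside each `object` is the only field that has to change when switching between English and Chinese labels.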
If you want to create a dataset with Chinese labels, you can use the labelimg tool to annotate the dataset and simply change the labels to Chinese. For a detailed tutorial, refer to the blog: Object Detection - Using labelimg to Create Your Own Deep Learning Object Detection Dataset.
If you have a VOC format dataset with English labels, you can use the following code to convert the English labels in the dataset to Chinese labels.
# encoding:utf-8
import os
import xml.etree.ElementTree as ET

count = 0      # number of labels that were replaced
list_xml = []  # every distinct label seen in the dataset
# English label -> Chinese label (renamed from `dict` so the builtin is not shadowed)
label_map = {
    "ball": "足球",
    "messi": "梅西",
}
# os.path.join avoids the invalid "\V" and "\A" escapes in the backslash paths
openPath = os.path.join("VOCdevkit", "VOC2007", "Annotations")
savePath = os.path.join("VOCdevkit", "VOC2007", "Annotations1")
os.makedirs(savePath, exist_ok=True)  # tree.write fails if the folder is missing
fileList = os.listdir(openPath)  # All file names in the annotation directory
for fileName in fileList:
    if fileName.endswith(".xml"):  # Only look at xml files
        print("filename=:", fileName)
        tree = ET.parse(os.path.join(openPath, fileName))
        root = tree.getroot()
        print("root-tag=:", root.tag)
        for child in root:  # First level parsing
            if child.tag == "object":  # Find the object tag
                print(child.tag)
                for sub in child:
                    if sub.tag == "name":
                        print("Label name:", sub.tag, "; Text content:", sub.text)
                        if sub.text not in list_xml:
                            list_xml.append(sub.text)
                        if sub.text in label_map:
                            sub.text = label_map[sub.text]
                            print(sub.text)
                            count = count + 1
        tree.write(os.path.join(savePath, fileName), encoding='utf-8')
        print("=" * 20)
print(count)
for i in list_xml:
    print(i)
This code can also convert a VOC format dataset with Chinese labels into one with English labels. As shown in the figure below, simply swap the positions of the Chinese and English labels in the dictionary.
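The swap does not have to be done by retyping the dictionary. The following is a minimal sketch, assuming the mapping is one-to-one; `invert_mapping` and `translate_labels` are hypothetical helper names introduced here, and the XML string is a stripped-down stand-in for a real annotation file.

```python
import xml.etree.ElementTree as ET


def invert_mapping(mapping):
    """Swap keys and values; refuse mappings where two keys share a value."""
    inverted = {value: key for key, value in mapping.items()}
    if len(inverted) != len(mapping):
        raise ValueError("mapping is not one-to-one, cannot invert it safely")
    return inverted


def translate_labels(xml_text, mapping):
    """Rewrite every <object><name> found in mapping; return (new_xml, n_changed)."""
    root = ET.fromstring(xml_text)
    changed = 0
    for name in root.findall("./object/name"):
        if name.text in mapping:
            name.text = mapping[name.text]
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


en_to_zh = {"ball": "足球", "messi": "梅西"}
zh_to_en = invert_mapping(en_to_zh)

# Stripped-down stand-in for a Chinese-labelled annotation.
xml_zh = "<annotation><object><name>足球</name></object></annotation>"
xml_en, n = translate_labels(xml_zh, zh_to_en)
print(n, xml_en)
```

Keeping a single source dictionary and deriving the reverse one avoids the two directions drifting apart when a category is added.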
For a VOC format dataset with Chinese labels, you can use the code in the blog Object Detection - Dataset Format Conversion and Training/Validation Set Division to convert the VOC format dataset to a YOLO dataset and divide it into training and validation sets.
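The referenced conversion code is not reproduced here, but the core arithmetic is short. As a sketch of the usual YOLO label layout (class index followed by the normalized box center, width, and height), assuming the class index is the position of the label in the class list; `voc_box_to_yolo` is a hypothetical helper name and the numbers are made up:

```python
def voc_box_to_yolo(box, image_size, label, class_names):
    """Convert (xmin, ymin, xmax, ymax) in pixels to one YOLO label line."""
    xmin, ymin, xmax, ymax = box
    width, height = image_size
    class_id = class_names.index(label)  # raises ValueError for an unknown label
    x_center = (xmin + xmax) / 2 / width
    y_center = (ymin + ymax) / 2 / height
    box_w = (xmax - xmin) / width
    box_h = (ymax - ymin) / height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {box_w:.6f} {box_h:.6f}"


# The class list order decides the class index written to the label file.
class_names = ["足球", "梅西"]
line = voc_box_to_yolo((100, 120, 200, 220), (640, 480), "梅西", class_names)
print(line)  # 1 0.234375 0.354167 0.156250 0.208333
```

Note that the Chinese label itself never appears in the YOLO label file; only its index does, which is why the class list order must stay fixed.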
Thus, the dataset with Chinese labels is ready.
2. Modify yolov5 Code to Support Chinese Labels#
It is particularly important to note that this blog uses version 5.0 of yolov5.
(1) Modify the train.py file. In line 63 of the py file, change the code as follows:
with open(opt.data, encoding='UTF-8') as f:
(2) Modify the test.py file. In line 73 of the py file, change the code as follows:
with open(opt.data, encoding='UTF-8') as f:
(3) Modify the utils/general.py file. Import the following package in this code:
from PIL import Image, ImageDraw, ImageFont
(4) Modify the utils/plots.py file. Around line 64, in the plot_one_box function, replace the code inside the if label block with:
tf = max(tl - 1, 1) # font thickness
t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0]
font_size = t_size[1] # reuse the OpenCV text height as the font size
font = ImageFont.truetype('msyh.ttc', font_size) # the font must contain Chinese glyphs
t_size = font.getsize(label) # removed in Pillow 10; use font.getbbox(label) there
c2 = c1[0] + t_size[0], c1[1] - t_size[1]
cv2.rectangle(img, c1, c2, color, -1, cv2.LINE_AA) # filled
img_PIL = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB)) # OpenCV BGR array -> PIL RGB image
draw = ImageDraw.Draw(img_PIL)
draw.text((c1[0], c2[1] - 2), label, fill=(255, 255, 255), font=font)
return cv2.cvtColor(np.array(img_PIL), cv2.COLOR_RGB2BGR) # back to an OpenCV BGR array
(5) Modify the utils/plots.py file. In line 144 of the py file, modify the plot_images function:
mosaic = plot_one_box(box, mosaic, label=label, color=color, line_thickness=tl)
(6) Modify the detect.py file. In line 178 of the py file, change the code as follows:
im0 = plot_one_box(xyxy, im0, label=label, color=colors[int(cls)], line_thickness=3)
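Note that the im0 = assignment is essential. The modified plot_one_box returns a new image instead of drawing the label in place, so if your image shows the box but no label text, this assignment is most likely missing. The variable on the left must be the same im0 that is passed in as plot_one_box(xyxy, im0, ...).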
Thus, the code to support Chinese labels has been modified.
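Why the return value in steps (5) and (6) matters can be shown without OpenCV at all. The sketch below is only an analogy: a bytearray stands in for the image, `draw_in_place` stands in for cv2.rectangle (which writes into the caller's array), and `draw_via_copy` stands in for the PIL round trip (which produces a new object). All names are hypothetical.

```python
def draw_in_place(pixels):
    """Stand-in for cv2.rectangle: mutates the caller's buffer."""
    pixels[0] = 255


def draw_via_copy(pixels):
    """Stand-in for the PIL round trip: the result lives in a new buffer."""
    converted = bytearray(pixels)  # like Image.fromarray(...) creating a new image
    converted[1] = 255             # like draw.text(...) writing the label
    return converted


frame = bytearray(4)
draw_in_place(frame)          # the "box" shows up in frame
draw_via_copy(frame)          # result discarded: the "label" is lost
lost = bytes(frame)

frame = draw_via_copy(frame)  # result kept, as in `im0 = plot_one_box(...)`
kept = bytes(frame)
print(lost, kept)
```

This mirrors the symptom described above: the box is drawn because that part still happens in place, while the label only exists in the returned image.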
If you need to refer to the model training and inference testing work later, you can check the blog: Object Detection - Teaching You to Train Your Own Object Detection Model with yolov5.
It is particularly important to note that when modifying the yaml file in the data directory, the labels must correspond to the annotated Chinese categories.
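A quick consistency check can catch a mismatch before training starts. This is a sketch under assumptions: it takes the `names` list and `nc` value from the data yaml as plain Python values (loading the yaml is left out), and `check_names` is a hypothetical helper, not part of yolov5.

```python
def check_names(names, nc, annotated_labels):
    """Return a list of problems; an empty list means the config looks consistent."""
    problems = []
    if len(names) != nc:
        problems.append(f"nc is {nc} but names has {len(names)} entries")
    missing = sorted(set(annotated_labels) - set(names))
    if missing:
        problems.append(f"labels missing from names: {missing}")
    return problems


# Labels as they appear in the annotations (duplicates are fine).
annotated = ["足球", "梅西", "足球"]

print(check_names(["足球", "梅西"], 2, annotated))   # consistent config
print(check_names(["ball", "messi"], 2, annotated))  # yaml still uses English names
```

The order of `names` also has to match the class indices used when the dataset was converted to YOLO format; this check does not verify that.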
Reference link:
YoloV5 Implementation of Chinese Label Object Detection - Zhihu (zhihu.com)
YoloV5 Implementation of Chinese Label Object Detection#
1. Copy the yolov5 Project Locally#
https://github.com/ultralytics/yolov5
2. Check the Font#
The default font is loaded from C:\Windows\Fonts. Find a suitable one; for example, I used simhei.ttf this time.
3. train.py File#
with open(opt.data) as f:
Change to
with open(opt.data, encoding='UTF-8') as f:
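The reason for the explicit encoding is that open() without an encoding argument falls back to the platform default, which on Chinese-locale Windows is typically GBK rather than UTF-8. The sketch below uses a made-up two-line yaml fragment and a temporary file to show the difference; it reads the raw text only and does not parse the yaml.

```python
import os
import tempfile

# Made-up fragment resembling the names entry of a data yaml.
yaml_text = "nc: 2\nnames: ['足球', '梅西']\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.yaml")
    with open(path, "w", encoding="utf-8") as f:
        f.write(yaml_text)

    # Explicit UTF-8: the text round-trips regardless of the OS default.
    with open(path, encoding="utf-8") as f:
        good = f.read()

    # Simulate a GBK default; errors="replace" keeps the demo from raising.
    with open(path, encoding="gbk", errors="replace") as f:
        bad = f.read()

print(good == yaml_text, bad == yaml_text)
```

With the wrong codec the file either fails to decode or yields garbled class names, which then end up on the detection boxes.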
4. test.py File#
with open(data) as f:
Change to
with open(data, encoding='UTF-8') as f:
5. utils/general.py File#
Import the package
from PIL import Image, ImageDraw, ImageFont
6. utils/plots.py File#
Modify the plot_one_box function. Change the code after if label to
    if label:
        tf = max(tl - 1, 1) # font thickness
        t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0]
        font_size = t_size[1] # reuse the OpenCV text height as the font size
        font = ImageFont.truetype('MSYH.TTC', font_size) # or the font chosen in step 2, e.g. simhei.ttf
        t_size = font.getsize(label) # removed in Pillow 10; use font.getbbox(label) there
        c2 = c1[0] + t_size[0], c1[1] - t_size[1]
        cv2.rectangle(img, c1, c2, color, -1, cv2.LINE_AA) # filled
        img_PIL = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB)) # OpenCV BGR array -> PIL RGB image
        draw = ImageDraw.Draw(img_PIL)
        draw.text((c1[0], c2[1] - 2), label, fill=(255, 255, 255), font=font)
        return cv2.cvtColor(np.array(img_PIL), cv2.COLOR_RGB2BGR)
    return img # no label: hand back the image so callers that reassign it still work
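One portability caveat: `font.getsize` was deprecated in Pillow 9.2 and removed in Pillow 10, so the snippet above only runs on older Pillow versions. A possible workaround, sketched here with hypothetical helper names (`text_size`, `label_corner`), is to measure the text through `getbbox` when the font object offers it. Returning the right and bottom edges of the bounding box is an assumption chosen to approximate what `getsize` used to return.

```python
def text_size(font, label):
    """Return (width, height) for label using whichever API the font offers."""
    if hasattr(font, "getbbox"):
        # Newer Pillow: bounding box measured from the drawing origin.
        left, top, right, bottom = font.getbbox(label)
        return right, bottom
    return font.getsize(label)  # older Pillow only


def label_corner(c1, size):
    """Opposite corner of the filled label box that sits above the point c1."""
    return c1[0] + size[0], c1[1] - size[1]


# Inside plot_one_box the two measurement lines would then read:
#     t_size = text_size(font, label)
#     c2 = label_corner(c1, t_size)
```

Because the helper only relies on the presence of `getbbox` or `getsize`, it works with any font object returned by `ImageFont.truetype`, whichever Pillow version is installed.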
7. utils/plots.py File#
In the plot_images function
plot_one_box(box, mosaic, label=label, color=color, line_thickness=tl)
Change to
mosaic = plot_one_box(box, mosaic, label=label, color=color, line_thickness=tl)
8. detect.py File#
plot_one_box(xyxy, im0, label=label, color=colors[int(cls)], line_thickness=3)
Change to
im0 = plot_one_box(xyxy, im0, label=label, color=colors[int(cls)], line_thickness=3)