GitHub - zjykzj/YOLO11Face: [ultralytics v8.3.75][yolov8/yolo11-pose][WIDER FACE]Upgrade YOLO5Face to YOLO8Face and YOLO11Face

«YOLO11Face» combines YOLO5Face and YOLOv8/YOLO11 for face and keypoint detection.

This repository trains two kinds of model architecture. The first uses only the yolov5/yolov8/yolo11 detection model architecture, trained and validated on the WIDERFACE dataset.

Repository	ARCH	GFLOPs	Easy	Medium	Hard
zjykzj/YOLO11Face	yolov5nu	7.1	93.86	91.70	80.37
zjykzj/YOLO11Face	yolov5su	23.8	95.13	93.47	84.33
zjykzj/YOLO11Face	yolov8s	28.4	95.77	94.18	84.54
zjykzj/YOLO11Face	yolo11s	21.3	95.55	93.91	84.85

The second uses the Ultralytics pose model architecture to train face and keypoint detection jointly. Evaluation then measures only face detection performance on the validation set, in the same way as for the detection models.

Note that the facial keypoint annotations come from RetinaFace, which annotated keypoints only on the original WIDERFACE training set. Therefore, when training the pose model, the original WIDERFACE training set is split into training and validation subsets at an 8:2 ratio, and the original val dataset is evaluated after training is complete.
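The 8:2 split described above can be sketched as follows. This is a minimal illustration, not the repository's own script: the source does not say whether the split is random or which seed is used, so the shuffling, the `seed` parameter, and the function name `split_train_val` are all assumptions.

```python
import random
from typing import List, Tuple


def split_train_val(
    items: List[str], val_ratio: float = 0.2, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Shuffle items reproducibly and split them into (train, val)."""
    shuffled = list(items)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    n_val = int(round(len(shuffled) * val_ratio))
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical file names, for illustration only.
images = [f"img_{i:05d}.jpg" for i in range(10)]
train, val = split_train_val(images)
print(len(train), len(val))
```

Fixing the seed keeps the split stable across runs, so results from different pose models are compared on the same held-out images.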

Repository	ARCH	GFLOPs	Easy	Medium	Hard
zjykzj/YOLO5Face	yolov5n-v7.0	4.2	93.25	91.11	80.33
zjykzj/YOLO5Face	yolov5s-v7.0	15.8	94.84	93.28	84.67

zjykzj/YOLO11Face	yolov8n-pose	8.3	94.61	92.46	80.98
zjykzj/YOLO11Face	yolov8s-pose	29.4	95.50	93.95	84.65

zjykzj/YOLO11Face	yolo11n-pose	6.6	94.62	92.56	81.02
zjykzj/YOLO11Face	yolo11s-pose	22.3	95.72	94.19	85.24

During evaluation, input images are at VGA resolution: the longer edge of the input image is scaled to 640, and the shorter edge is scaled proportionally.
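As a small worked example of that scaling rule, the sketch below computes the resized dimensions for a given image size. The function name `scale_longer_edge` is hypothetical, and the sketch covers only the proportional resize; any extra padding applied by the inference pipeline (for example, to a stride multiple) is not modeled here.

```python
from typing import Tuple


def scale_longer_edge(width: int, height: int, target: int = 640) -> Tuple[int, int]:
    """Scale (width, height) so the longer edge equals `target`, keeping aspect ratio."""
    scale = target / max(width, height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    return new_w, new_h


# A 1024x768 landscape image and its portrait counterpart.
print(scale_longer_edge(1024, 768))
print(scale_longer_edge(768, 1024))
```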

News🚀

2025/03/01: Training and evaluation of WIDERFACE using the detection and pose model architecture of yolov5/yolov8/yolo11.
2025/02/21: Upgrade the baseline version of the repository to ultralytics v8.3.75.
2025/02/15: Train a face and landmarks detector based on YOLOv8-pose and the WIDERFACE dataset.
2025/02/03: Train a face detector based on YOLOv8 and the WIDERFACE dataset.
2025/01/09: Initialize this repository using ultralytics v8.2.103.

Background🏷

YOLO5Face adds a landmarks head to YOLOv5 so that faces and keypoints are detected together. YOLOv8 and YOLO11 are upgraded versions of YOLOv5, so combining the YOLO5Face approach with YOLOv8/YOLO11 is a natural way to improve face and keypoint detection.

Experiments showed that YOLOv8-pose/YOLO11-pose can detect faces and facial keypoints simultaneously. Thanks to Ultralytics!
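To make the pose-model approach concrete, the sketch below writes one face as a pose-style label line: a class index, a normalized center/width/height box, then five keypoints each stored as `x y visibility`. It is only an illustration under stated assumptions: the layout with five keypoints of three values each, the visibility values (2 for labeled, 0 for missing), and the helper name `to_pose_label` are all assumed here, and the repository's own converter (`pose_widerface2yolo.py`) may differ.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]


def to_pose_label(
    box_xywh: Tuple[float, float, float, float],
    landmarks: Optional[Sequence[Point]],
    img_w: int,
    img_h: int,
    cls: int = 0,
) -> str:
    """Build one label line: class, normalized box, then 5 x (x, y, visibility)."""
    x, y, w, h = box_xywh  # top-left corner plus size, in pixels
    box = [(x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h]
    parts = [str(cls)] + [f"{v:.6f}" for v in box]
    if landmarks is None:
        # Face without keypoint annotation: zero coordinates, visibility 0.
        parts += ["0.000000", "0.000000", "0"] * 5
    else:
        for px, py in landmarks:
            parts += [f"{px / img_w:.6f}", f"{py / img_h:.6f}", "2"]
    return " ".join(parts)


# Hypothetical face in a 400x200 image, with and without landmarks.
five_points = [(150, 80), (250, 80), (200, 100), (160, 130), (240, 130)]
print(to_pose_label((100, 50, 200, 100), five_points, 400, 200))
print(to_pose_label((100, 50, 200, 100), None, 400, 200))
```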

Note: the latest implementation of YOLO11Face in this repository is based entirely on ultralytics/ultralytics v8.3.75.

Installation

See INSTALL.md

Usage✨

Train⭐

$ python3 pose_train.py --model yolo11s-pose.pt --data ./yolo11face/cfg/datasets/widerface-landmarks.yaml --epochs 300 --imgsz 800 --batch 8 --device 0

Eval⭐

# python pose_widerface.py --model yolo11s-pose_widerface.pt --source ../datasets/widerface/images/val/ --folder_pict ../datasets/widerface/wider_face_split/wider_face_val_bbx_gt.txt --save_txt true --imgsz 640 --conf 0.001 --iou 0.6 --max_det 1000 --batch 1 --device 7
args: Namespace(data=None, device=[7], folder_pict='../datasets/widerface/wider_face_split/wider_face_val_bbx_gt.txt', model='yolo11s-pose_widerface.pt', source='../datasets/widerface/images/val/') - unknown: ['--save_txt', 'true', '--imgsz', '640', '--conf', '0.001', '--iou', '0.6', '--max_det', '1000', '--batch', '1']
{'model': 'yolo11s-pose_widerface.pt', 'data': None, 'device': [7], 'source': '../datasets/widerface/images/val/', 'folder_pict': '../datasets/widerface/wider_face_split/wider_face_val_bbx_gt.txt', 'save_txt': True, 'imgsz': 640, 'conf': 0.001, 'iou': 0.6, 'max_det': 1000, 'batch': 1, 'mode': 'predict'}
3226

Ultralytics 8.3.75 🚀 Python-3.8.19 torch-1.12.1+cu113 CUDA:7 (NVIDIA GeForce RTX 3090, 24268MiB)
YOLO11s-pose summary (fused): 257 layers, 9,700,560 parameters, 0 gradients, 22.3 GFLOPs
...
...
Speed: 2.0ms preprocess, 14.4ms inference, 1.4ms postprocess per image at shape (1, 3, 640, 448)
Results saved to /data/zj/YOLO11Face/runs/detect/predict3
0 label saved to /data/zj/YOLO11Face/runs/detect/predict3/labels
# cd widerface_evaluate/
# python3 evaluation.py -p ../runs/detect/predict3/labels/ -g ./ground_truth/
Reading Predictions : 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 61/61 [00:00<00:00, 115.26it/s]
Processing easy: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 61/61 [00:19<00:00,  3.20it/s]
Processing medium: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 61/61 [00:18<00:00,  3.22it/s]
Processing hard: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 61/61 [00:18<00:00,  3.21it/s]
==================== Results ====================
Easy   Val AP: 0.9572097672239526
Medium Val AP: 0.9419027051471077
Hard   Val AP: 0.8523522955677869
=================================================
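In the log above, flags the script does not declare itself (such as `--save_txt` and `--imgsz`) are reported as `unknown` and then show up, with converted types, in the overrides dictionary. One way such a pattern could be implemented is sketched below using `argparse.parse_known_args`. The helpers `parse_value` and `unknown_to_overrides` are hypothetical, and the repository's actual parsing code may work differently.

```python
import argparse
import ast
from typing import Any, Dict, List


def parse_value(text: str) -> Any:
    """Convert a CLI string to bool/int/float when possible, else keep the string."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def unknown_to_overrides(unknown: List[str]) -> Dict[str, Any]:
    """Turn ['--key', 'value', ...] into {'key': value, ...}."""
    overrides: Dict[str, Any] = {}
    i = 0
    while i < len(unknown):
        token = unknown[i]
        if not token.startswith("--"):
            raise ValueError(f"unexpected argument: {token}")
        key = token[2:]
        if i + 1 < len(unknown) and not unknown[i + 1].startswith("--"):
            overrides[key] = parse_value(unknown[i + 1])
            i += 2
        else:
            overrides[key] = True  # bare flag with no value
            i += 1
    return overrides


parser = argparse.ArgumentParser()
parser.add_argument("--model", type=str)
parser.add_argument("--device", type=int, nargs="+")
args, unknown = parser.parse_known_args(
    ["--model", "yolo11s-pose_widerface.pt", "--device", "7",
     "--save_txt", "true", "--imgsz", "640", "--conf", "0.001"]
)
# Declared arguments first, then the pass-through overrides.
overrides = {**vars(args), **unknown_to_overrides(unknown)}
print(overrides)
```

Passing undeclared flags through this way lets a thin wrapper script accept any option the underlying library understands without listing each one.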

Predict⭐

# python3 pose_predict.py --model yolo11s-pose_widerface.pt --source ./yolo11face/assets/widerface_val/ --imgsz 640 --device 0
args: Namespace(data=None, device=[0], model='yolo11s-pose_widerface.pt', source='./yolo11face/assets/widerface_val/') - unknown: ['--imgsz', '640']

Ultralytics 8.3.75 🚀 Python-3.8.19 torch-1.12.1+cu113 CUDA:0 (NVIDIA GeForce RTX 3090, 24268MiB)
YOLO11s-pose summary (fused): 257 layers, 9,700,560 parameters, 0 gradients, 22.3 GFLOPs
image 1/2 /data/zj/YOLO11Face/yolo11face/assets/widerface_val/39_Ice_Skating_iceskiing_39_351.jpg: 640x640 3 faces, 22.8ms
image 2/2 /data/zj/YOLO11Face/yolo11face/assets/widerface_val/9_Press_Conference_Press_Conference_9_632.jpg: 640x640 1 face, 22.8ms
Speed: 3.1ms preprocess, 22.8ms inference, 1.8ms postprocess per image at shape (2, 3, 640, 640)
Results saved to /data/zj/YOLO11Face/runs/detect/predict10

