boxmot 10.0.84
BoxMOT: pluggable SOTA tracking modules for segmentation, object detection and pose estimation models
Introduction
This repo contains a collection of pluggable state-of-the-art multi-object trackers for segmentation, object detection and pose estimation models. For the methods that use appearance description, both heavy (CLIPReID) and lightweight state-of-the-art ReID models (LightMBN, OSNet and more) are available for automatic download. We provide examples of how to use this package together with popular object detection models such as YOLOv8, YOLOv9 and YOLOv10.
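| Tracker | Status | HOTA↑ | MOTA↑ | IDF1↑ |
| --- | --- | --- | --- | --- |
| Strongsort | ✅ | 48.153 | 38.936 | 56.834 |
| Ocsort | ✅ | 46.841 | 39.395 | 54.692 |
| Deepocsort | ✅ | 46.517 | 39.647 | 54.104 |
| Bytetrack | ✅ | 45.757 | 38.753 | 53.284 |
| Botsort | ✅ | 45.128 | 39.323 | 51.563 |
| Imprassoc | ✅ | 44.664 | 39.571 | 50.264 |
| Hybridsort | ❌ | | | |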
NOTES: evaluated on the second half of the MOT17 training set. The detector used is YOLOv8x (imgsz 1280, conf 0.2). Each tracker is configured with the original parameters found in its official repository.
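To compare the trackers metric by metric, the rows of the table can be ranked programmatically. The following is a minimal sketch that uses only the numbers listed above; `RESULTS` and `best_per_metric` are illustrative names and not part of BoxMOT.

```python
# Scores copied from the results table: (HOTA, MOTA, IDF1) per tracker.
RESULTS = {
    "strongsort": (48.153, 38.936, 56.834),
    "ocsort": (46.841, 39.395, 54.692),
    "deepocsort": (46.517, 39.647, 54.104),
    "bytetrack": (45.757, 38.753, 53.284),
    "botsort": (45.128, 39.323, 51.563),
    "imprassoc": (44.664, 39.571, 50.264),
}
METRICS = ("HOTA", "MOTA", "IDF1")


def best_per_metric(results):
    """Return {metric: name of the tracker with the highest score}."""
    return {
        metric: max(results, key=lambda name: results[name][i])
        for i, metric in enumerate(METRICS)
    }


best = best_per_metric(RESULTS)
print(best)
```

On these numbers StrongSORT leads on HOTA and IDF1 while DeepOCSORT has the highest MOTA, so the preferred tracker depends on which metric matters most for the application.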
3rd party resources
ROS wrapper for boxmot (link to external repository)
Yolov10 Integration with BoxMOT (link to external Notebook)
Yolov8 training (link to external repository)
ReID model training (link to external repository)
ReID model inference acceleration with Nebullvm (link to external Notebook)
Experiments
In inverse chronological order:
Evaluation of the params evolved for first half of MOT17 on the complete MOT17
Segmentation model vs object detection model on MOT metrics
Effect of masking objects before feature extraction
conf-thres vs HOTA, MOTA and IDF1
Effect of KF updates ahead for tracks with no associations on MOT17
Effect of full images vs 1280 input to StrongSORT on MOT17
Effect of different OSNet architectures on MOT16
Yolov5 StrongSORT vs BoTSORT vs OCSORT
Yolov5 BoTSORT branch: https://github.com/mikel-brostrom/Yolov5_StrongSORT_OSNet/tree/botsort
Yolov5 StrongSORT OSNet vs other trackers MOT17
StrongSORT MOT16 ablation study
Yolov5 StrongSORT OSNet vs other trackers MOT16 (deprecated)
News
Added YOLOv9 and YOLOv10 support
Hyperparameter search for all trackers using RayTune (July 2024)
Added Improved Association Pipeline Tracker by rolson24 in #1527 (July 2024)
Why BoxMOT?
Today's multi-object tracking options are heavily dependent on the computational capabilities of the underlying hardware. BoxMOT provides a wide variety of tracking methods that meet different hardware limitations, all the way from CPU-only setups to larger GPUs. Moreover, we provide scripts for ultra-fast experimentation: detections and embeddings are saved once and can then be loaded into any tracking algorithm, which avoids the overhead of repeatedly generating this data.
Installation
Start with a Python>=3.9 environment.
If you want to run the YOLOv8, YOLOv9 or YOLOv10 examples:
git clone https://github.com/mikel-brostrom/boxmot.git
cd boxmot
pip install poetry
poetry install --with yolo # installs boxmot + yolo dependencies
poetry shell # activates the newly created environment with the installed dependencies
If you only want to import the tracking modules, you can simply run:
pip install boxmot
YOLOv8 | YOLOv9 | YOLOv10 examples
Tracking
Yolo models
$ python tracking/track.py --yolo-model yolov10n # bboxes only
$ python tracking/track.py --yolo-model yolov9s # bboxes only
$ python tracking/track.py --yolo-model yolov8n # bboxes only
$ python tracking/track.py --yolo-model yolov8n-seg # bboxes + segmentation masks
$ python tracking/track.py --yolo-model yolov8n-pose # bboxes + pose estimation
Tracking methods
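$ python tracking/track.py --tracking-method deepocsort
$ python tracking/track.py --tracking-method strongsort
$ python tracking/track.py --tracking-method ocsort
$ python tracking/track.py --tracking-method bytetrack
$ python tracking/track.py --tracking-method botsort
$ python tracking/track.py --tracking-method imprassoc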
Tracking sources
Tracking can be run on most video formats
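$ python tracking/track.py --source 0 # webcam
$ python tracking/track.py --source img.jpg # image
$ python tracking/track.py --source vid.mp4 # video
$ python tracking/track.py --source path/ # directory
$ python tracking/track.py --source path/*.jpg # glob
$ python tracking/track.py --source 'https://youtu.be/Zgi9g1ksQHc' # YouTube
$ python tracking/track.py --source 'rtsp://example.com/media.mp4' # RTSP, RTMP, HTTP stream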
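The `--source` value can refer to quite different kinds of input. The sketch below shows one way such a string could be categorized. `classify_source` and the extension sets are hypothetical, written only for illustration; this is not how `tracking/track.py` or the underlying detector library resolves sources.

```python
from pathlib import Path

# Assumed extension lists for this illustration only.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}
VIDEO_EXTS = {".mp4", ".avi", ".mov", ".mkv"}
STREAM_PREFIXES = ("rtsp://", "rtmp://", "http://", "https://")


def classify_source(source: str) -> str:
    """Guess which kind of input a --source string refers to."""
    if source.isdigit():
        return "webcam"  # e.g. "0" is a camera index
    if source.lower().startswith(STREAM_PREFIXES):
        return "stream"  # checked before extensions: URLs may end in .mp4
    if "*" in source:
        return "glob"
    suffix = Path(source).suffix.lower()
    if suffix in IMAGE_EXTS:
        return "image"
    if suffix in VIDEO_EXTS:
        return "video"
    return "directory"


for src in ["0", "img.jpg", "vid.mp4", "path/", "path/*.jpg",
            "rtsp://example.com/media.mp4"]:
    print(f"{src!r:35} -> {classify_source(src)}")
```

The URL check runs before the extension check so that a stream address ending in `.mp4` is not mistaken for a local video file.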
Select ReID model
Some tracking methods combine appearance description and motion in the process of tracking. For those that use appearance, you can choose a ReID model based on your needs from this ReID model zoo. These models can be further optimized for your needs with the reid_export.py script.
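$ python tracking/track.py --source 0 --reid-model lmbn_n_cuhk03_d.pt # lightweight
$ python tracking/track.py --source 0 --reid-model osnet_x0_25_market1501.pt
$ python tracking/track.py --source 0 --reid-model mobilenetv2_x1_4_msmt17.engine
$ python tracking/track.py --source 0 --reid-model resnet50_msmt17.onnx
$ python tracking/track.py --source 0 --reid-model osnet_x1_0_msmt17.pt
$ python tracking/track.py --source 0 --reid-model clip_market1501.pt # heavy
$ python tracking/track.py --source 0 --reid-model clip_vehicleid.pt
...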
Filter tracked classes
By default the tracker tracks all MS COCO classes.
If you want to track a subset of the classes that your model predicts, add their corresponding indices after the --classes flag:
python tracking/track.py --source 0 --yolo-model yolov8s.pt --classes 15 16 # COCO YOLOv8 model; track only cats (15) and dogs (16)
Here is a list of all the possible objects that a YOLOv8 model trained on MS COCO can detect. Notice that the indexing for the classes in this repo starts at zero.
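Because the indexing is zero-based, it is easy to be off by one when picking class indices. The sketch below looks indices up by name instead. Only the first 18 names of the 80-class MS COCO label order are included, and `class_indices` is a hypothetical helper, not a BoxMOT function.

```python
# First 18 class names of the 80-class MS COCO label set, in index order.
COCO_NAMES_HEAD = [
    "person", "bicycle", "car", "motorcycle", "airplane", "bus", "train",
    "truck", "boat", "traffic light", "fire hydrant", "stop sign",
    "parking meter", "bench", "bird", "cat", "dog", "horse",
]


def class_indices(names):
    """Map class names to zero-based indices suitable for --classes."""
    return [COCO_NAMES_HEAD.index(name) for name in names]


print(class_indices(["cat", "dog"]))  # [15, 16]
```

For a custom-trained model the label order differs, so the list would have to come from that model's own class names.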
Evaluation
Evaluate a combination of detector, tracking method and ReID model on a standard MOT dataset or on your own custom one with:
$ python3 tracking/val.py --benchmark MOT17-mini --yolo-model yolov8n.pt --reid-model osnet_x0_25_msmt17.pt --tracking-method deepocsort --verbose --source ./assets/MOT17-mini/train
$ python3 tracking/val.py --benchmark MOT17 --yolo-model yolov8n.pt --reid-model osnet_x0_25_msmt17.pt --tracking-method ocsort --verbose --source ./tracking/val_utils/MOT17/train
Detections and embeddings are stored for the selected YOLO and ReID model respectively, and can then be loaded into any tracking algorithm. This avoids the overhead of repeatedly generating this data.
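The idea of generating detections once and reusing them can be sketched as a small load-or-compute cache. This is only an illustration of the pattern: the file name, the `.npy` format and the `load_or_compute` helper are assumptions and do not describe the layout BoxMOT actually writes to disk.

```python
import tempfile
from pathlib import Path

import numpy as np


def load_or_compute(cache_file: Path, compute):
    """Load an array from cache_file, or compute it once and save it."""
    if cache_file.exists():
        return np.load(cache_file)
    array = compute()
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    np.save(cache_file, array)
    return array


calls = []


def fake_detector():
    """Stand-in for an expensive detector run: N x (x, y, x, y, conf, cls)."""
    calls.append(1)
    return np.array([[144, 212, 578, 480, 0.82, 0]], dtype=np.float32)


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "dets" / "seq01.npy"  # hypothetical cache location
    first = load_or_compute(cache, fake_detector)   # runs the detector
    second = load_or_compute(cache, fake_detector)  # served from disk

print(len(calls))  # 1: the detector ran only once
```

Any number of trackers can then be evaluated against the same cached arrays without re-running the detector or the ReID model.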
Evolution
We use a fast and elitist multiobjective genetic algorithm for tracker hyperparameter tuning. By default the objectives are HOTA, MOTA and IDF1. Run it with:
# saves dets and embs under ./runs/dets_n_embs separately for each selected yolo and reid model
$ python tracking/generate_dets_n_embs.py --source ./assets/MOT17-mini/train --yolo-model yolov8n.pt yolov8s.pt --reid-model weights/osnet_x0_25_msmt17.pt
# evolve parameters for specified tracking method using the selected detections and embeddings generated in the previous step
$ python tracking/evolve.py --benchmark MOT17-mini --dets yolov8n --embs osnet_x0_25_msmt17 --n-trials 9 --tracking-method botsort
The set of hyperparameters leading to the best HOTA result is written to the tracker's config file.
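With several objectives, a single candidate rarely wins on all of them, so multi-objective methods keep the non-dominated (Pareto-optimal) candidates. The sketch below shows that selection step together with the best-HOTA pick. It is not the code behind `tracking/evolve.py`; `dominates` and `pareto_front` are illustrative helpers, and the score triples reuse the results table from this README as stand-ins for the scores of individual tuning trials.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, score in candidates.items()
        if not any(dominates(other, score) for other in candidates.values())
    ]


# (HOTA, MOTA, IDF1), all to be maximized. Stand-in values, see lead-in.
trials = {
    "strongsort": (48.153, 38.936, 56.834),
    "ocsort": (46.841, 39.395, 54.692),
    "deepocsort": (46.517, 39.647, 54.104),
    "bytetrack": (45.757, 38.753, 53.284),
    "botsort": (45.128, 39.323, 51.563),
    "imprassoc": (44.664, 39.571, 50.264),
}

front = pareto_front(trials)
best_hota = max(trials, key=lambda name: trials[name][0])
print(front, best_hota)
```

Three of the six candidates are non-dominated here. Picking the best-HOTA entry, as described above, reduces that front to a single configuration.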
Export
We support ReID model export to ONNX, OpenVINO, TorchScript and TensorRT
# export to ONNX
$ python3 boxmot/appearance/reid_export.py --include onnx --device cpu
# export to OpenVINO
$ python3 boxmot/appearance/reid_export.py --include openvino --device cpu
# export to TensorRT with dynamic input
$ python3 boxmot/appearance/reid_export.py --include engine --device 0 --dynamic
Custom tracking examples
Detection
import cv2
import numpy as np
from pathlib import Path

from boxmot import DeepOCSORT

tracker = DeepOCSORT(
    model_weights=Path('osnet_x0_25_msmt17.pt'),  # which ReID model to use
    device='cuda:0',
    fp16=False,
)

vid = cv2.VideoCapture(0)

while True:
    ret, im = vid.read()
    if not ret:  # stop when no frame could be read
        break

    # substitute by your object detector, output has to be N X (x, y, x, y, conf, cls)
    dets = np.array([[144, 212, 578, 480, 0.82, 0],
                     [425, 281, 576, 472, 0.56, 65]])

    # Check if there are any detections
    if dets.size > 0:
        tracker.update(dets, im)  # --> M X (x, y, x, y, id, conf, cls, ind)
    # If no detections, make prediction ahead
    else:
        dets = np.empty((0, 6))  # empty N X (x, y, x, y, conf, cls)
        tracker.update(dets, im)  # --> M X (x, y, x, y, id, conf, cls, ind)
    tracker.plot_results(im, show_trajectories=True)

    # break on pressing q or space
    cv2.imshow('BoxMOT detection', im)
    key = cv2.waitKey(1) & 0xFF
    if key == ord(' ') or key == ord('q'):
        break

vid.release()
cv2.destroyAllWindows()
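Detectors often return boxes in a different layout than the N x (x, y, x, y, conf, cls) array the tracker expects. The sketch below converts center-format boxes (cx, cy, w, h) into that layout and returns an empty (0, 6) array when there are no detections. `to_tracker_dets` is a hypothetical helper and the center format is an assumed detector output; adapt it to whatever your detector produces.

```python
import numpy as np


def to_tracker_dets(boxes_cxcywh, confs, classes):
    """Build an N x 6 array (x1, y1, x2, y2, conf, cls) from center boxes."""
    boxes = np.asarray(boxes_cxcywh, dtype=np.float32).reshape(-1, 4)
    if boxes.shape[0] == 0:
        return np.empty((0, 6), dtype=np.float32)  # no detections this frame
    half_wh = boxes[:, 2:4] / 2
    xyxy = np.hstack([boxes[:, :2] - half_wh, boxes[:, :2] + half_wh])
    confs = np.asarray(confs, dtype=np.float32).reshape(-1, 1)
    classes = np.asarray(classes, dtype=np.float32).reshape(-1, 1)
    return np.hstack([xyxy, confs, classes])


# Center (361, 346) with size 434 x 268 is the box (144, 212, 578, 480).
dets = to_tracker_dets([[361, 346, 434, 268]], [0.82], [0])
empty = to_tracker_dets([], [], [])
print(dets, empty.shape)
```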
Pose & segmentation
import cv2
import numpy as np
from pathlib import Path

from boxmot import DeepOCSORT

tracker = DeepOCSORT(
    model_weights=Path('osnet_x0_25_msmt17.pt'),  # which ReID model to use
    device='cuda:0',
    fp16=True,
)

vid = cv2.VideoCapture(0)

while True:
    ret, im = vid.read()
    if not ret:  # stop when no frame could be read
        break

    # dummy per-detection outputs, substitute by your pose / segmentation model
    keypoints = np.random.rand(2, 17, 3)
    masks = np.random.rand(2, 480, 640)

    # substitute by your object detector, input to tracker has to be N X (x, y, x, y, conf, cls)
    dets = np.array([[144, 212, 578, 480, 0.82, 0],
                     [425, 281, 576, 472, 0.56, 65]])

    tracks = tracker.update(dets, im)  # --> M x (x, y, x, y, id, conf, cls, ind)

    if tracks.size > 0:  # skip frames for which no track is returned
        # xyxys = tracks[:, 0:4].astype('int')  # float64 to int
        # ids = tracks[:, 4].astype('int')  # float64 to int
        # confs = tracks[:, 5]
        # clss = tracks[:, 6].astype('int')  # float64 to int
        inds = tracks[:, 7].astype('int')  # float64 to int

        # in case you have segmentations or poses alongside your detections you can use
        # the inds variable to identify which track is associated with each seg or pose by:
        # masks = masks[inds]
        # keypoints = keypoints[inds]
        # such that you can then: zip(tracks, masks) or zip(tracks, keypoints)

    # break on pressing q or space
    cv2.imshow('BoxMOT segmentation | pose', im)
    key = cv2.waitKey(1) & 0xFF
    if key == ord(' ') or key == ord('q'):
        break

vid.release()
cv2.destroyAllWindows()
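How the ind column re-aligns per-detection data with the returned tracks can be shown without a camera or a tracker. In the sketch below the `tracks` array is hand-written to mimic the M x (x, y, x, y, id, conf, cls, ind) layout; its values are made up for illustration.

```python
import numpy as np

# Per-detection extras, in the same order as the detections given to update().
keypoints = np.arange(3 * 17 * 3, dtype=np.float32).reshape(3, 17, 3)

# Hand-written stand-in for a tracker output:
# M x (x, y, x, y, id, conf, cls, ind).
# Detection 2 became track 7 and detection 0 became track 4;
# detection 1 produced no track.
tracks = np.array([
    [10, 10, 50, 80, 7, 0.9, 0, 2],
    [60, 20, 90, 70, 4, 0.8, 0, 0],
], dtype=np.float64)

inds = tracks[:, 7].astype(int)
track_keypoints = keypoints[inds]  # row i now belongs to tracks[i]

for track, kpts in zip(tracks, track_keypoints):
    print(f"track id {int(track[4])}: keypoints of shape {kpts.shape}")
```

Indexing with `inds` both reorders the extras and drops those of detections that did not yield a track, which is why the result can be zipped with `tracks` directly.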
Tiled inference
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction
import cv2
import numpy as np
from pathlib import Path

from boxmot import DeepOCSORT

tracker = DeepOCSORT(
    model_weights=Path('osnet_x0_25_msmt17.pt'),  # which ReID model to use
    device='cpu',
    fp16=False,
)

detection_model = AutoDetectionModel.from_pretrained(
    model_type='yolov8',
    model_path='yolov8n.pt',
    confidence_threshold=0.5,
    device="cpu",  # or 'cuda:0'
)

vid = cv2.VideoCapture(0)
color = (0, 0, 255)  # BGR
thickness = 2
fontscale = 0.5

while True:
    ret, im = vid.read()
    if not ret:  # stop when no frame could be read
        break

    # get sliced predictions
    result = get_sliced_prediction(
        im,
        detection_model,
        slice_height=256,
        slice_width=256,
        overlap_height_ratio=0.2,
        overlap_width_ratio=0.2
    )

    # convert the predictions to N X (x, y, x, y, conf, cls)
    num_predictions = len(result.object_prediction_list)
    dets = np.zeros([num_predictions, 6], dtype=np.float32)
    for ind, object_prediction in enumerate(result.object_prediction_list):
        dets[ind, :4] = np.array(object_prediction.bbox.to_xyxy(), dtype=np.float32)
        dets[ind, 4] = object_prediction.score.value
        dets[ind, 5] = object_prediction.category.id

    tracks = tracker.update(dets, im)  # --> (x, y, x, y, id, conf, cls, ind)
    tracker.plot_results(im, show_trajectories=True)

    # break on pressing q or space
    cv2.imshow('BoxMOT tiled inference', im)
    key = cv2.waitKey(1) & 0xFF
    if key == ord(' ') or key == ord('q'):
        break

vid.release()
cv2.destroyAllWindows()
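To get a feeling for what the slice size and overlap ratios mean, the sketch below computes overlapping tile windows for a frame. It approximates the general idea of sliced inference and is not SAHI's implementation; `tile_starts` and `tile_boxes` are hypothetical helpers, and the 640 x 480 frame size is an assumption.

```python
def tile_starts(length, tile, overlap_ratio):
    """Start offsets of tiles along one axis, the last tile clamped to the edge."""
    stride = tile - int(tile * overlap_ratio)
    if stride <= 0:
        raise ValueError("overlap_ratio must leave a positive stride")
    if length <= tile:
        return [0]
    starts, pos = [], 0
    while pos + tile < length:
        starts.append(pos)
        pos += stride
    starts.append(length - tile)  # final tile ends exactly at the border
    return starts


def tile_boxes(width, height, tile_w, tile_h, overlap_w, overlap_h):
    """All tiles as (x1, y1, x2, y2), row by row."""
    return [
        (x, y, min(x + tile_w, width), min(y + tile_h, height))
        for y in tile_starts(height, tile_h, overlap_h)
        for x in tile_starts(width, tile_w, overlap_w)
    ]


tiles = tile_boxes(640, 480, 256, 256, 0.2, 0.2)
print(len(tiles), tiles[0], tiles[-1])
```

Smaller slices or larger overlaps increase the number of tiles, and with it the number of detector forward passes per frame, so these settings trade small-object recall against speed.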
Contributors
Contact
For Yolo tracking bugs and feature requests please visit GitHub Issues.
For business inquiries or professional support requests please send an email to: [email protected]
For personal and professional use. You cannot resell or redistribute these repositories in their original state.