
The problem:

I am trying to train a YOLO v8 model on a custom dataset to detect (and track) a mouse in a video, but with poor results. Can you help me improve the performance of my model?

PS: Training the model requires quite some time, so I'm asking for tips to improve performance so I don't waste too much time changing or optimising parameters that have little or no effect on the overall performance of the model.

Essential details:

I'm a researcher, and I'm completely new to computer vision. I am running an experiment where I need to track a mouse's movements inside a cage from a camera (fixed angle). I am trying to train a YOLO v8 model using the fiftyone.zoo dataset "open-images-v7"; however, this is just my approach as a novice in the field, so I'm happy to follow better suggestions:

import fiftyone as fo
from ultralytics import YOLO
from pathlib import Path
from tqdm import tqdm
import shutil

# Load the FiftyOne dataset
dataset = fo.zoo.load_zoo_dataset(
    "open-images-v7",
    split="train",
    label_types=["detections"],
    classes=["Mouse"],
    max_samples=100,
)

# Convert FiftyOne dataset to YOLO format
output_dir = Path("yolo_dataset")
output_dir.mkdir(exist_ok=True)

for sample in tqdm(dataset):
    img_path = sample.filepath
    img_filename = Path(img_path).name
    yolo_labels_path = output_dir / (Path(img_filename).stem + ".txt")

    with open(yolo_labels_path, "w") as f:
        for detection in sample.ground_truth.detections:
            if detection.label == "Mouse":
                # FiftyOne boxes are [top-left x, top-left y, width, height], normalized to [0, 1]
                bbox = detection.bounding_box
                x, y, width, height = bbox[0], bbox[1], bbox[2], bbox[3]
                x_center = x + width / 2
                y_center = y + height / 2
                yolo_label = f"0 {x_center} {y_center} {width} {height}\n"
                f.write(yolo_label)

    # Copy image file to the YOLO dataset folder
    shutil.copy(img_path, output_dir / img_filename)

# Load a model
model = YOLO('yolov8n.pt')

# Train the model with the YOLO dataset
model.train(data='config.yaml', epochs=100, device='mps')

# Track with the model
results = model.track(source="catmouse.mov", show=True)

my config.yaml file is:

path: /home/path/to/code/folder 

train: yolo_dataset # train images (relative to 'path')
val: yolo_dataset # val images (relative to 'path')

# Classes
names:
    0: Mouse

As for the video: catmouse.mov in this example is just an extract of this video from YouTube: https://youtu.be/6pbreU5ChmA. Feel free to use any other video with a mouse/mice.

  • Do you have to use Yolo or are you flexible? If yes, can you show 2-3 frames from the video feed? I will decide whether to provide my solution based on that. – tintin98 Jul 29 '23 at 06:57
  • Hi @tintin98 I don't have to use Yolo necessarily. I can use any tech. I cannot provide the frames of the original video but it's something like this: https://img.huffingtonpost.com/asset/57753e01150000ed026c90ee.jpeg?ops=scalefit_720_noupscale&format=webp – Fabio Magarelli Jul 30 '23 at 06:34

1 Answer


Obtain more data; 100 examples are most likely not enough for the model to generalize relevant features.

It will be most useful if you can take some frames from your experiment, label them, and add them to the dataset (a rough frame-sampling sketch is below); examples from Open Images can be very different from your real data. If you cannot do this, just take more examples from the dataset.
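
As a minimal sketch (the video filename, output folder, and frame step are placeholder assumptions, not from the original post), you could sample frames from your own footage with OpenCV and then label them with any annotation tool:

import cv2
from pathlib import Path

video_path = "experiment.mov"        # placeholder: your own fixed-camera recording
out_dir = Path("frames_to_label")    # placeholder output folder for annotation
out_dir.mkdir(exist_ok=True)

cap = cv2.VideoCapture(video_path)
frame_idx = 0
step = 30  # keep roughly one frame per second for a 30 fps video

while True:
    ok, frame = cap.read()
    if not ok:
        break
    if frame_idx % step == 0:
        cv2.imwrite(str(out_dir / f"frame_{frame_idx:06d}.jpg"), frame)
    frame_idx += 1

cap.release()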

It can also be useful to enable YOLO data augmentation during training to make the model more robust to nuisance variation such as viewing angle, object size, and color.
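
A minimal sketch, assuming the standard Ultralytics augmentation hyperparameters documented in the link below (the values shown are illustrative, not tuned):

from ultralytics import YOLO

model = YOLO("yolov8n.pt")
model.train(
    data="config.yaml",
    epochs=100,
    device="mps",
    degrees=10.0,   # random rotation, +/- degrees
    scale=0.5,      # random scaling gain
    fliplr=0.5,     # probability of horizontal flip
    hsv_h=0.015,    # hue jitter
    hsv_s=0.7,      # saturation jitter
    hsv_v=0.4,      # brightness (value) jitter
)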

If you have enough resources, you can try larger models than YOLOv8n, for example YOLOv8s or even YOLOv8m.
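
A sketch of that change, assuming the rest of the training call stays as in the question:

from ultralytics import YOLO

model = YOLO("yolov8s.pt")  # or "yolov8m.pt" if your hardware can handle it
model.train(data="config.yaml", epochs=100, device="mps")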

Tips for best training results: https://docs.ultralytics.com/yolov5/tutorials/tips_for_best_training_results/?h=best

Data augmentation: https://docs.ultralytics.com/usage/cfg/#augmentation

hanna_liavoshka