I would like to train a YOLO model with the COCO dataset. As it has 80 classes, how can I filter it? I only need the classes person and car.
4 Answers
For a quick and simple approach, follow these steps:
- Modify (or copy for backup) the coco.names file in darknet\data\coco.names
- Delete all other classes except person and car
- Modify your cfg file (e.g. yolov3.cfg): change the 3 classes entries on lines 610, 696, 783 from 80 to 2
- Change the 3 filters entries in the cfg file on lines 603, 689, 776 from 255 to (classes+5)x3 = 21
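- Run the detector. Note that once classes and filters are changed, the output layers no longer match the shapes stored in yolov3.weights, so the weights need to be retrained on the two classes first: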
./darknet detector test cfg/coco.data cfg/yolov3.cfg yolov3.weights data/person.jpg
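The filters arithmetic in the steps above can be checked with a few lines of Python. This is only a sketch: `yolo_filters` is a hypothetical helper name, and it assumes the YOLOv3 layout of 3 anchors per detection scale, where each anchor predicts 4 box coordinates, 1 objectness score and one score per class.

```python
def yolo_filters(num_classes, anchors_per_scale=3):
    """Filters for the conv layer right before each [yolo] layer.

    Each anchor predicts 4 box coordinates + 1 objectness score
    + one score per class, hence (classes + 5) per anchor.
    """
    return (num_classes + 5) * anchors_per_scale


# Stock COCO config with 80 classes
print(yolo_filters(80))
# Reduced config with only person and car
print(yolo_filters(2))
```

If you keep a different number of classes, the same formula gives the value to write into the three filters entries.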
For a more advanced approach, you can use this repo to create YOLO datasets based on VOC, COCO or Open Images: https://github.com/holger-prause/yolo_utils
Also refer to this: How can I download a specific part of Coco Dataset?

You can use the PyCoco API (the pycocotools package) to work with the COCO dataset. With this library, filtering classes from the dataset is easy.
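import numpy as np
import skimage.io as io
from pycocotools.coco import COCO
# Placeholder paths: point these at your local copy of COCO
dataDir = 'path/to/coco'
dataType = 'val2017'
# Load the instances annotation file (it must contain a 'categories' section)
coco = COCO('{}/annotations/instances_{}.json'.format(dataDir, dataType))
# Define the classes (out of the 80) which you want to see. Others will not be shown.
filterClasses = ['person', 'dog']
# Fetch class IDs only corresponding to the filterClasses
catIds = coco.getCatIds(catNms=filterClasses)
# Get all images that contain every one of the above category IDs
imgIds = coco.getImgIds(catIds=catIds)
print("Number of images containing all the classes:", len(imgIds))
# Load a random image and scale its pixel values to [0, 1]
img = coco.loadImgs(imgIds[np.random.randint(0, len(imgIds))])[0]
I = io.imread('{}/images/{}/{}'.format(dataDir, dataType, img['file_name'])) / 255.0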
I have recently written an entire post on exploring and manipulating the COCO dataset. Do have a look.
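If you would rather not depend on pycocotools, the same filtering idea can be sketched directly on the annotation JSON. This is a minimal sketch, not part of any library: `filter_coco_categories` is a hypothetical helper, and it assumes the standard COCO instances layout with `images`, `annotations` and `categories` keys. Unlike `getImgIds` above, it keeps images that contain any of the requested classes, not only images that contain all of them.

```python
import json


def filter_coco_categories(coco_dict, keep_names):
    """Return a copy of a COCO instances dict restricted to keep_names."""
    keep_names = set(keep_names)
    categories = [c for c in coco_dict["categories"] if c["name"] in keep_names]
    keep_cat_ids = {c["id"] for c in categories}
    # Keep only annotations of the wanted categories
    annotations = [a for a in coco_dict["annotations"]
                   if a["category_id"] in keep_cat_ids]
    # Keep only images that still have at least one annotation
    keep_img_ids = {a["image_id"] for a in annotations}
    images = [im for im in coco_dict["images"] if im["id"] in keep_img_ids]
    filtered = dict(coco_dict)
    filtered.update(images=images, annotations=annotations, categories=categories)
    return filtered


# Tiny made-up example in the COCO instances layout
example = {
    "images": [{"id": 1, "file_name": "a.jpg"}, {"id": 2, "file_name": "b.jpg"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [0, 0, 10, 10]},
        {"id": 11, "image_id": 1, "category_id": 18, "bbox": [5, 5, 20, 20]},
        {"id": 12, "image_id": 2, "category_id": 18, "bbox": [1, 1, 8, 8]},
    ],
    "categories": [
        {"id": 1, "name": "person"},
        {"id": 3, "name": "car"},
        {"id": 18, "name": "dog"},
    ],
}

filtered = filter_coco_categories(example, ["person", "car"])
print(json.dumps(filtered, indent=2))
```

For a real file you would `json.load` the instances annotation file first and `json.dump` the result to a new file.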

- How do you import pycocotools? – Rylan Schaeffer Oct 21 '21 at 20:35
- I just tried `filterClasses = ['person', 'dog'] catIds = coco.getCatIds(catNms=filterClasses)` and got `KeyError: 'categories'` – Rylan Schaeffer Oct 21 '21 at 20:42
The easiest way to do this these days is to use fiftyone, which is recommended on the COCO website to download, visualize, and evaluate the dataset, including any subset of classes.
import fiftyone as fo
import fiftyone.zoo as foz
#
# Only the required images will be downloaded (if necessary).
# By default, only detections are loaded
#
dataset = foz.load_zoo_dataset(
    "coco-2017",
    splits=["validation", "train"],
    classes=["person", "car"],
    # max_samples=50,
)
# Visualize the dataset in the FiftyOne App
session = fo.launch_app(dataset)
You can also use it to convert the dataset to YOLO format and to train models directly on the dataset.
# Export the dataset in YOLO format
export_dir = "/path/for/yolov5-dataset"
label_field = "ground_truth"
dataset.export(
    export_dir=export_dir,
    dataset_type=fo.types.YOLOv5Dataset,
    label_field=label_field,
)
To install:
pip install fiftyone
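To make clear what a conversion to YOLO format involves, here is a rough sketch of the box arithmetic. It is not FiftyOne's implementation: `coco_bbox_to_yolo` and `yolo_label_line` are hypothetical helpers. The sketch assumes COCO boxes are `[x_min, y_min, width, height]` in pixels and that a YOLO label line is `class x_center y_center width height` with values normalized by the image size.

```python
def coco_bbox_to_yolo(bbox, img_width, img_height):
    """Convert a COCO pixel box [x_min, y_min, w, h] to normalized YOLO values."""
    x_min, y_min, width, height = bbox
    x_center = (x_min + width / 2) / img_width
    y_center = (y_min + height / 2) / img_height
    return x_center, y_center, width / img_width, height / img_height


def yolo_label_line(class_index, bbox, img_width, img_height):
    """Format one line of a YOLO label .txt file."""
    values = coco_bbox_to_yolo(bbox, img_width, img_height)
    return " ".join([str(class_index)] + ["{:.6f}".format(v) for v in values])


# A 200x100 box whose top-left corner is at (100, 50) in a 400x200 image
print(yolo_label_line(0, [100, 50, 200, 100], 400, 200))
```

The class index here is the position in your reduced class list (for example 0 for person and 1 for car), not the original COCO category ID.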

The only way to filter classes without retraining the model on the COCO dataset is to check the detection output and skip drawing boxes for the unwanted classes; the model will still detect all classes in the background.
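Such an output check could look roughly like the sketch below. The detection format is an assumption made for illustration (a list of dicts with `label`, `confidence` and `box` keys), and `keep_wanted` is a hypothetical helper; adapt it to whatever your detector actually returns.

```python
WANTED_CLASSES = {"person", "car"}


def keep_wanted(detections, wanted=WANTED_CLASSES, min_confidence=0.0):
    """Drop detections whose label is not wanted or whose score is too low."""
    return [d for d in detections
            if d["label"] in wanted and d["confidence"] >= min_confidence]


# Made-up detector output for one image
detections = [
    {"label": "person", "confidence": 0.92, "box": (10, 20, 50, 120)},
    {"label": "dog", "confidence": 0.88, "box": (60, 80, 40, 30)},
    {"label": "car", "confidence": 0.30, "box": (100, 40, 90, 60)},
]

for det in keep_wanted(detections, min_confidence=0.5):
    print(det["label"], det["confidence"], det["box"])
```

Only the boxes that pass the filter would then be drawn; the model itself still computes scores for all 80 classes.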