21

I have been using the code below to try to generate the raw masks of images from the COCO dataset.

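from pycocotools.coco import COCO
import numpy as np
import skimage.io as io
import matplotlib.pyplot as plt

dataDir='G:'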
dataType='train2014'
annFile='{}/annotations/instances_{}.json'.format(dataDir,dataType)


coco=COCO(annFile)
annFile = '{}/annotations/person_keypoints_{}.json'.format(dataDir,dataType)
coco_kps=COCO(annFile)


catIds = coco.getCatIds(catNms=['person'])
imgIds = coco.getImgIds(catIds=catIds );
imgIds = coco.getImgIds(imgIds = imgIds[0])
img = coco.loadImgs(imgIds[np.random.randint(0,len(imgIds))])[0]
I = io.imread('G:/train2014/'+img['file_name'])

plt.imshow(I); plt.axis('off')
annIds = coco.getAnnIds(imgIds=img['id'], catIds=catIds, iscrowd=None)
anns = coco.loadAnns(annIds)
coco.showAnns(anns)

But what I get is something like this:

enter image description here

But what I want is something like this

enter image description here

How can I get the raw mask for each image?

Farshid Rayhan

5 Answers

17

The complete code wasn't in the answer, so I'm posting it below.

Please install pycocotools first.

pip install pycocotools

Import the required modules. I'm assuming you're using a Jupyter notebook.

from pycocotools.coco import COCO
import os
from PIL import Image
import numpy as np
from matplotlib import pyplot as plt
%matplotlib inline

Load the annotations for the COCO dataset. Here, I pick the image with ID 74.

coco = COCO('../datasets/coco/annotations/instances_train2017.json')
# loading annotations into memory...
# Done (t=12.70s)
# creating index...
# index created!

img_dir = '../datasets/coco/train2017'
image_id = 74
img = coco.imgs[image_id]

The information of the loaded img is as follows.

img
# {'license': 2,
#  'file_name': '000000000074.jpg',
#  'coco_url': 'http://images.cocodataset.org/train2017/000000000074.jpg',
#  'height': 426,
#  'width': 640,
#  'date_captured': '2013-11-15 03:08:44',
#  'flickr_url': 'http://farm5.staticflickr.com/4087/5078192399_aaefdb5074_z.jpg',
#  'id': 74}

Display the image as follows.

image = np.array(Image.open(os.path.join(img_dir, img['file_name'])))
plt.imshow(image, interpolation='nearest')
plt.show()

enter image description here

If you want to see the overlay result:

plt.imshow(image)
cat_ids = coco.getCatIds()
anns_ids = coco.getAnnIds(imgIds=img['id'], catIds=cat_ids, iscrowd=None)
anns = coco.loadAnns(anns_ids)
coco.showAnns(anns)

enter image description here

If you just want to see the mask, as Farshid Rayhan replied, do the following:

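# Start from an empty mask: seeding it with anns[0] and then looping from
# index 0 would add the first annotation twice.
mask = np.zeros((img['height'], img['width']), dtype=np.uint8)
for ann in anns:
    mask += coco.annToMask(ann)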

plt.imshow(mask)

enter image description here
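
One thing to watch when accumulating masks like this: if the accumulator is seeded with the first annotation's mask and the loop then starts again at index 0, that annotation is counted twice. Here is a minimal sketch of the effect, using small synthetic arrays in place of the output of coco.annToMask (the helper name sum_masks is hypothetical, not part of pycocotools):

```python
import numpy as np

def sum_masks(masks):
    """Add binary masks together, starting from an all-zero accumulator."""
    total = np.zeros_like(masks[0], dtype=np.uint8)
    for m in masks:
        total += m
    return total

# Two synthetic 2x3 binary masks standing in for coco.annToMask(ann) output.
a = np.array([[1, 1, 0], [0, 0, 0]], dtype=np.uint8)
b = np.array([[0, 1, 1], [0, 0, 0]], dtype=np.uint8)

# Seeding with the first mask and then looping over every mask counts `a` twice.
seeded = a.copy()
for m in (a, b):
    seeded += m

correct = sum_masks([a, b])
print(seeded[0])   # [2 3 1]
print(correct[0])  # [1 2 1]
```

With the zero-initialised accumulator, each pixel holds the number of annotations covering it, so a value above 1 only ever means a genuine overlap.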

Keiku
  • 1
    Defining the mask variable `mask = coco.annToMask(anns[0])` and then looping over anns starting from index zero would add the first annotation twice. Doggo has a value of 2 while the rest are 1. You shouldn't initialise `mask` from the first annotation; adding it in the loop is enough. – colt.exe Dec 23 '22 at 09:16
13

Following filippo's suggestion, I was able to write working code, which looks something like this.

# Start from an empty mask so that anns[0] is not added twice.
mask = np.zeros_like(coco.annToMask(anns[0]))
for ann in anns:
    mask += coco.annToMask(ann)

plt.imshow(mask)
Farshid Rayhan
  • 2
    cool, glad it helped! note that this way you're generating a binary mask. Using binary `OR` would be safer in this case instead of simple addition. The idea behind multiplying the masks by the index `i` was that this way each label has a different value and you can use a colormap like the one in your image (I'm guessing it's `nipy_spectral`) to separate them in your imshow plot – filippo Jun 13 '18 at 17:12
  • Not that it matters much, but is there any reason why you keep switching the accepted answer once a day? – filippo Jun 16 '18 at 08:07
  • 1
    No! I keep accepting it, but it gets switched off again. It's probably due to a bug in my phone's browser: whenever I open this page it gets switched, even for my own answer! – Farshid Rayhan Jun 16 '18 at 09:41
  • haha :-) I'm sorry! try resetting the cache or something like that – filippo Jun 16 '18 at 09:44
  • 4
    assuming mask is a numpy array, aren't you adding ann[0] twice? https://stackoverflow.com/questions/15579260/how-to-combine-multiple-numpy-masks – Javi Nov 27 '20 at 09:55
9

I'm late to the party, but this may help someone. I don't know if your code worked for your application. However, if you want each pixel of the mask to hold the category ID of its annotation, you can't just add the masks, because some of them will overlap. I used numpy's maximum for that:

cat_ids = coco.getCatIds()
anns_ids = coco.getAnnIds(imgIds=img['id'], catIds=cat_ids, iscrowd=None)
anns = coco.loadAnns(anns_ids)
anns_img = np.zeros((img['height'],img['width']))
for ann in anns:
    anns_img = np.maximum(anns_img,coco.annToMask(ann)*ann['category_id'])

EDIT: Here is the result of my code on image 47112 of the 2017 dataset (images: "Coco2017-47112" and "With the code above"). The shade of grey is the category ID as described in the dataset description.
Note that here the pizza overlaps with the table at the edges of its polygon. If we added the masks, the overlap would get an ID equal to the sum of the pizza and table class IDs. Using max, only one of the classes is kept. In this case, since the table class has a greater ID than the pizza class, the overlap is assigned to the table class even though the pizza is visually on top. I am not sure this can be fixed easily, though.
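
One possible way to soften the overlap problem, offered as a heuristic sketch rather than a proper fix: paint the annotations in order of decreasing area, so that smaller objects are written last and end up on top. This assumes small objects usually sit in front of large ones, which will not always hold. The function name paint_by_area is hypothetical, the arrays are synthetic stand-ins for coco.annToMask output, and the IDs 59 (pizza) and 67 (table) are only illustrative values.

```python
import numpy as np

def paint_by_area(masks, category_ids):
    """Build a category map, painting large objects first so small ones stay on top."""
    canvas = np.zeros(masks[0].shape, dtype=np.int32)
    # Indices sorted by mask area (pixel count), largest first.
    order = sorted(range(len(masks)), key=lambda k: int(masks[k].sum()), reverse=True)
    for k in order:
        # Assignment overwrites earlier labels instead of summing or taking the max.
        canvas[masks[k].astype(bool)] = category_ids[k]
    return canvas

# Synthetic example: a "table" covering the whole 3x4 image and a "pizza" on it.
table = np.ones((3, 4), dtype=np.uint8)
pizza = np.zeros((3, 4), dtype=np.uint8)
pizza[1, 1:3] = 1

result = paint_by_area([pizza, table], [59, 67])
print(result[1])  # [67 59 59 67]
```

With np.maximum the two pizza pixels would come out as 67 (table); here they keep the pizza ID because the smaller mask is painted last.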

  • You are saying that your code is going to provide output like the last picture of the original question ? – Farshid Rayhan Nov 05 '19 at 17:38
  • 1
    With shades of gray yes. I've just edited my post to add an example. – Duret-Robert Louis Nov 06 '19 at 18:21
  • 1
    Isn't the problem that you mentioned significant? For example, in an image of a person standing next to a train, if the train's id is higher, it would cover the person. So your image would only have one class (train) which is the incorrect ground truth. – ashnair1 Mar 20 '21 at 09:47
3

I'm not familiar with COCO, but I see there's an annToMask function that should generate a binary mask for each annotation.

So in untested, pseudo-ish code, assuming non-overlapping masks, you should have something like:

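annIds = coco.getAnnIds(imgIds=img['id'], catIds=catIds, iscrowd=None)
anns = coco.loadAnns(annIds)  # annToMask expects annotation dicts, not IDs

# img is the COCO image record (a dict), so size the mask from its height/width.
mask = np.zeros((img['height'], img['width']), dtype=np.int32)
for i, ann in enumerate(anns, start=1):  # start at 1 so 0 stays background
    mask += coco.annToMask(ann).astype(np.int32) * i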
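
If the masks do overlap, adding index-weighted masks can produce misleading labels: instances 1 and 2 overlapping yields 3, which collides with instance 3. A small sketch of an alternative that assigns labels instead of summing them, so later instances simply overwrite earlier ones (synthetic arrays stand in for coco.annToMask output, and instance_label_map is a hypothetical helper name):

```python
import numpy as np

def instance_label_map(masks):
    """Give each mask its own integer label (1, 2, ...); 0 is background."""
    labels = np.zeros(masks[0].shape, dtype=np.int32)
    for i, m in enumerate(masks, start=1):
        labels[m.astype(bool)] = i  # overwrite rather than add
    return labels

# Three synthetic 1x4 masks; the first two overlap at column 1.
a = np.array([[1, 1, 0, 0]], dtype=np.uint8)
b = np.array([[0, 1, 1, 0]], dtype=np.uint8)
c = np.array([[0, 0, 0, 1]], dtype=np.uint8)

summed = sum(m.astype(np.int32) * i for i, m in enumerate([a, b, c], start=1))
assigned = instance_label_map([a, b, c])
print(summed[0])    # [1 3 2 3]
print(assigned[0])  # [1 2 2 3]
```

The resulting array can be passed to plt.imshow with a colormap to give each instance its own colour.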
filippo
2

Just adding a variation on the answers above: if you want a single binary mask covering all the annotations, it can be created as follows:

# Construct the binary mask as the union (logical OR) of all annotation masks
mask = coco.annToMask(anns[0]) > 0
for ann in anns[1:]:
    mask |= coco.annToMask(ann) > 0

plt.imshow(mask,cmap='gray')

enter image description here
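
The same union can also be written without an explicit loop. This is a minimal sketch, assuming every mask has the same shape; the arrays are synthetic stand-ins for coco.annToMask output and union_mask is a hypothetical helper name. The last line converts the boolean result to a 0/255 uint8 array, which is a common form for saving a raw mask as an image.

```python
import numpy as np

def union_mask(masks):
    """Return True wherever at least one of the binary masks is set."""
    stacked = np.stack([m.astype(bool) for m in masks])
    return np.any(stacked, axis=0)

# Two synthetic 2x3 binary masks that overlap in one pixel.
a = np.array([[1, 1, 0], [0, 0, 0]], dtype=np.uint8)
b = np.array([[0, 1, 1], [0, 0, 0]], dtype=np.uint8)

binary = union_mask([a, b])
as_image = binary.astype(np.uint8) * 255  # 0 = background, 255 = any object
print(as_image[0])  # [255 255 255]
```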

sandeepsign