I want to train on my own dataset and need to create the XML annotation files myself, since there are too many images to label by hand in LabelImg.
I already have the coordinates of the bounding boxes and have written a script that creates the XML files.

Here is an example of the created XML files:
I have around 400 training images, each with 20-30 objects in it. After running "xml_to_csv.py" on my files, I get the following error (also attached). The error does not occur when I run it on an XML file created by LabelImg.

This is the error:

[image: error message]

This is my generated XML file:

And this is the continuation:

[image: generated XML file, continued]

1 Answer

It's safer to generate LabelImg-style XML files with an XML library rather than by hand, which reduces the chance of data format errors. I found the following approach useful for avoiding errors caused by XML that does not follow the standard.

import numpy as np
from pathlib import Path
import xml.etree.ElementTree as ET  # cElementTree was removed in Python 3.9
from PIL import Image


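def create_labimg_xml(image_path, annotation_list):
    """Write a LabelImg-style (Pascal VOC) XML file next to the image.

    Each entry of annotation_list is a string 'x1,y1,x2,y2,x3,y3,x4,y4,label'.
    """
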
    image_path = Path(image_path)
    img = np.array(Image.open(image_path).convert('RGB'))

    annotation = ET.Element('annotation')
    ET.SubElement(annotation, 'folder').text = str(image_path.parent.name)
    ET.SubElement(annotation, 'filename').text = str(image_path.name)
    ET.SubElement(annotation, 'path').text = str(image_path)

    source = ET.SubElement(annotation, 'source')
    ET.SubElement(source, 'database').text = 'Unknown'

    size = ET.SubElement(annotation, 'size')
    ET.SubElement(size, 'width').text = str(img.shape[1])
    ET.SubElement(size, 'height').text = str(img.shape[0])
    ET.SubElement(size, 'depth').text = str(img.shape[2])

    ET.SubElement(annotation, 'segmented').text = '0'

    for annot in annotation_list:
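        # Split 'x1,y1,...,x4,y4,label' into the 8 coordinates and the label.
        tmp_annot = annot.split(',')
        cords, label = tmp_annot[:-1], tmp_annot[-1]
        # Use the first and third points as the box corners. This assumes the
        # points run clockwise from the top-left; for a skewed quadrilateral
        # the result only approximates the enclosing box.
        xmin, ymin, xmax, ymax = cords[0], cords[1], cords[4], cords[5]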

        # Named obj rather than object to avoid shadowing the built-in.
        obj = ET.SubElement(annotation, 'object')
        ET.SubElement(obj, 'name').text = label
        ET.SubElement(obj, 'pose').text = 'Unspecified'
        ET.SubElement(obj, 'truncated').text = '0'
        ET.SubElement(obj, 'difficult').text = '0'

        bndbox = ET.SubElement(obj, 'bndbox')
        ET.SubElement(bndbox, 'xmin').text = str(xmin)
        ET.SubElement(bndbox, 'ymin').text = str(ymin)
        ET.SubElement(bndbox, 'xmax').text = str(xmax)
        ET.SubElement(bndbox, 'ymax').text = str(ymax)

    tree = ET.ElementTree(annotation)
    # with_suffix replaces only the final extension, so 'my.image.jpg' becomes 'my.image.xml'.
    xml_file_name = image_path.with_suffix('.xml')
    tree.write(xml_file_name)


# --------------------------------------------------------------------------------
# Example: quadrilateral bounding boxes (4 points, 8 coordinates), each followed by a label
annotation_list = ['291,473,385,481,383,504,289,496,Hello',
                   '270,507,330,507,330,516,270,516,SUPERLATIVE']

create_labimg_xml('data/demo.jpg', annotation_list)
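
Before passing generated files to a converter, it can help to read each one back and check it. The following is only a minimal sketch: the function name find_annotation_problems and the list of checks are my own assumptions about what a converter typically needs (well-formed XML, an annotation root, a filename, an image size, and integer box coordinates), not something taken from xml_to_csv.py itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical helper, not part of LabelImg or xml_to_csv.py.
BOX_TAGS = ('xmin', 'ymin', 'xmax', 'ymax')


def find_annotation_problems(xml_source):
    """Return a list of problems found in a VOC-style XML string."""
    try:
        root = ET.fromstring(xml_source)
    except ET.ParseError as exc:
        return [f'not well-formed XML: {exc}']

    problems = []
    if root.tag != 'annotation':
        problems.append(f'root element is <{root.tag}>, expected <annotation>')

    # Elements that are assumed to be required by a typical converter.
    for path in ('filename', 'size/width', 'size/height'):
        if not root.findtext(path):
            problems.append(f'missing or empty <{path}>')

    for i, obj in enumerate(root.findall('object')):
        if not obj.findtext('name'):
            problems.append(f'object {i}: missing or empty <name>')
        box = obj.find('bndbox')
        if box is None:
            problems.append(f'object {i}: missing <bndbox>')
            continue
        values = {}
        for tag in BOX_TAGS:
            text = box.findtext(tag)
            try:
                values[tag] = int(text)
            except (TypeError, ValueError):
                problems.append(f'object {i}: <{tag}> is not an integer: {text!r}')
        # Only compare the corners when all four values parsed.
        if len(values) == 4 and not (values['xmin'] < values['xmax']
                                     and values['ymin'] < values['ymax']):
            problems.append(f'object {i}: box has non-positive width or height')
    return problems


if __name__ == '__main__':
    sample = ('<annotation><filename>demo.jpg</filename>'
              '<size><width>640</width><height>480</height></size>'
              '<object><name>Hello</name><bndbox><xmin>291.0</xmin>'
              '<ymin>473</ymin><xmax>383</xmax><ymax>504</ymax></bndbox>'
              '</object></annotation>')
    for problem in find_annotation_problems(sample):
        print(problem)
```

To check a file on disk, read its text first, for example find_annotation_problems(Path('data/demo.xml').read_text()). An empty list means none of these particular checks failed; it does not guarantee that a given converter will accept the file.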
Uzzal Podder