I have a requirement to extract image data from an XML file and output the images as separate files.

I've handled the parsing, but I'm at a loss on how to convert to an image.

The XML looks something like this:

<Bitmap>
    <BitmapInfo BitWidth="40" BitHeight="40" ByteWidth="8" BitCount="1" ColorCount="2" Compression="true">
        <ColorTable>
            <Color>0</Color>
            <Color>16777215</Color>
        </ColorTable>
        <BitData>Af5/+/8B/h/7/wH+B/v/Af4B+/8C/gB//P8C/gAf/P8C/gAH/P8C/gAB/P8D/gAAf/3/A/4AAB/9/wP+AAAH/f8D/gAAAf3/AP7+AAB//v8A/v4AAB/+/wD+/gAAB/7/AP7+AAAD/v8A/v4AAAf+/wD+/gAAH/7/AP7+AAB//v8D/gAAAf3/A/4AAAf9/wP+AAAf/f8D/gAAf/3/Av4AAfz/Av4AB/z/Av4AH/z/Av4Af/z/Af4B+/8B/gf7/wH+H/v/Af5/s/8=</BitData>
    </BitmapInfo>
    <Area Left="4430000" Top="12690000" Right="4563333" Bottom="12823333" />
</Bitmap>

Another example:

<Bitmap>
    <BitmapInfo BitWidth="24" BitHeight="14" ByteWidth="4" BitCount="1" ColorCount="2" Compression="true">
        <ColorTable>
            <Color>0</Color>
            <Color>16777215</Color>
        </ColorTable>
        <BitData>/f8u8+c5//PnOf/z5hn/8+bZ//Pm2f/z5Mn/8+Xp//Pl6f/z4eH/8+Px/4Bj8f8AM/n8/w==</BitData>
    </BitmapInfo>
    <Area Left="1043333" Top="13360000" Right="1123333" Bottom="13406667" />
</Bitmap>

Any pointers on how to go about doing this would help.

Asish M.
  • check out Pillow (it's the replacement for PIL). – Josep Valls Dec 21 '18 at 18:59
  • @JosepValls Thanks, but I can't find anything in pillow/PIL that helps with the compression and bytewidth – Asish M. Dec 21 '18 at 19:56
  • What is the compression algorithm of the image data and the encoding of the bit data string? – hlg Dec 28 '18 at 07:54
  • I don't know the compression algorithm of the image data. The bit data string seems to be base64 encoded. – Asish M. Dec 28 '18 at 13:14
  • This may not be helpful (but then again, it may be), but do you know about [prexview on github](https://github.com/prexview/prexview-python)? It seems to me they are doing what you are looking for. You could look at the Python or JavaScript implementation for pointers to help you with writing your own "XML to Image" code. – Duck Dodgers Dec 28 '18 at 16:03
  • Another option could be the 2 step approach of first an XML-to-SVG conversion and then SVG-to-image conversion, as suggested in the discussion on these two questions: [xml to svg](https://stackoverflow.com/questions/33243010/how-to-convert-xml-into-jpg-or-png-using-javascript) and [svg to image](https://stackoverflow.com/questions/3975499/convert-svg-to-image-jpeg-png-etc-in-the-browser?noredirect=1&lq=1) – Duck Dodgers Dec 28 '18 at 16:08
  • What do fields like *ByteWidth* represent? (I assume *BitWidth*, *BitHeight* and *BitCount* are the *2D* sizes (in pixels) and the color depth.) Same question about the *Area* coordinates. Also, why both *Java* and *Python* tags? – CristiFati Dec 29 '18 at 13:37
  • this is not really something that SO can help with, it's way too broad without a clearer specification. That said, this is probably the BMP format translated to XML, see https://en.wikipedia.org/wiki/BMP_file_format. The attributes in the `BitmapInfo` tag map directly to aspects of the BMP format. – Martijn Pieters Dec 31 '18 at 11:20

1 Answer

In Java, using javax.imageio, you could do something like this:

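import java.awt.image.BufferedImage;
import java.awt.image.IndexColorModel;
import java.awt.image.WritableRaster;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public void writeImageFile(String data, int imgWidth, int imgHeight, int byteWidth, int bitCount, int colorCount, int[] colors, String fileName) throws IOException {
    byte[] interleavedRGB = getColorTable(colors);
    // Build the palette and hand it to the image; otherwise the default palette is used.
    IndexColorModel colorModel = new IndexColorModel(bitCount, colorCount, interleavedRGB, 0, false);
    BufferedImage buffImg = new BufferedImage(imgWidth, imgHeight, BufferedImage.TYPE_BYTE_INDEXED, colorModel);
    WritableRaster raster = buffImg.getRaster();
    byte[] uncompressed = uncompress(decodeBase64(data), byteWidth);
    for (int y = 0; y < imgHeight; y++) {
        for (int x = 0; x < imgWidth; x++) {
            // Bit offset of pixel (x, y), assuming uncompress() returns rows without padding.
            int pos = (y * imgWidth + x) * bitCount;
            raster.setSample(x, y, 0, getValue(uncompressed, pos, bitCount));
        }
    }
    ImageIO.write(buffImg, "png", new File(fileName));
}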

You would need to convert the color table to extract the RGB values. These look like RGB values with 8 bits per channel and no alpha. The method could look like this:

private byte[] getColorTable(int[] colors) {
    int colorCount = colors.length;
    byte[] interleavedRGB = new byte[colorCount * 3];
    // Visit every entry; stopping at colorCount - 1 would leave the last color black.
    for (int i = 0; i < colorCount; i++) {
        interleavedRGB[i * 3] = (byte) ((colors[i] & 0xFF0000) >> 16);    // red
        interleavedRGB[i * 3 + 1] = (byte) ((colors[i] & 0x00FF00) >> 8); // green
        interleavedRGB[i * 3 + 2] = (byte) (colors[i] & 0x0000FF);        // blue
    }
    return interleavedRGB;
}

Also, you would need to implement methods to decode and uncompress the data as well as one to get a value (color index) at a specific position from the uncompressed data.

abstract byte[] decodeBase64(String encoded);
abstract byte[] uncompress(byte[] compressed, int byteWidth);
abstract int getValue(byte[] uncompressed, int pos, int bitCount);
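
Two of these three methods do not depend on the unknown compression, so they can be sketched right away. The following is only a sketch under assumptions: the class name BitDataHelpers is made up for illustration, and getValue assumes that pos is a bit offset, that pixels are packed most-significant-bit first, and that bitCount is 1, 2, 4 or 8 so that a value never straddles a byte boundary. None of this is confirmed by the XML itself.

```java
import java.util.Base64;

// Hypothetical helper class; the name is illustrative only.
final class BitDataHelpers {

    /** Decodes the Base64 text of a BitData element into raw bytes. */
    static byte[] decodeBase64(String encoded) {
        return Base64.getDecoder().decode(encoded.trim());
    }

    /**
     * Reads one color index of bitCount bits starting at bit offset pos.
     * Assumption: MSB-first packing and bitCount in {1, 2, 4, 8}.
     */
    static int getValue(byte[] uncompressed, int pos, int bitCount) {
        int current = uncompressed[pos / 8] & 0xFF;   // byte holding the pixel
        int shift = 8 - bitCount - (pos % 8);         // distance from the low end
        return (current >> shift) & ((1 << bitCount) - 1);
    }
}
```

If the bit order turns out to be least-significant-bit first, only the shift computation would need to change.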

If you do not have any documentation, then the uncompress method in particular requires some reverse engineering. Judging by the decoded data from these two examples, the data is probably run-length encoded in a way that involves the byteWidth. There are several RLE variants. I was not able to derive image data that appears meaningful to me, but I don't know what the images are supposed to show, so I may not recognize properly uncompressed data.
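
One RLE variant that may be worth trying is PackBits, where a signed header byte n either announces n + 1 literal bytes (n >= 0) or says that the next byte is repeated 1 - n times (n between -127 and -1). The sketch below is not taken from any documentation of this XML format, and the class name PackBitsSketch is made up. Its only support is that, applied to the second sample, it turns the 52 Base64-decoded bytes into 56 bytes, which equals BitHeight × ByteWidth (14 × 4).

```java
import java.io.ByteArrayOutputStream;

// Hypothetical PackBits-style decoder; whether the format really uses
// PackBits is an assumption, not something the XML states.
final class PackBitsSketch {

    static byte[] unpack(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < in.length) {
            int n = in[i++];                 // signed header byte
            if (n >= 0) {
                // Literal run: copy the next n + 1 bytes unchanged.
                int count = Math.min(n + 1, in.length - i);
                out.write(in, i, count);
                i += count;
            } else if (n != -128 && i < in.length) {
                // Repeat run: the next byte occurs 1 - n times.
                int value = in[i++];
                for (int k = 0; k < 1 - n; k++) {
                    out.write(value);
                }
            }
            // n == -128 is a no-op in PackBits and is skipped.
        }
        return out.toByteArray();
    }
}
```

If this guess is right, each output row is ByteWidth bytes long, so a 24-pixel-wide, 1-bit image would carry one padding byte per row that has to be skipped when reading pixels.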

One thing I noticed about the first example is that the decoded data is 1600 bits long. Given an image width and height of 40 each and a color depth of 1 bit, this could just as well be uncompressed data, even though the XML says it is compressed.
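
The size comparison can be written out as a small calculation. This is a sketch only: the class and method names are made up, and treating ByteWidth as the number of bytes per stored row is an assumption. For the first example, tightly packed pixels need 40 × 40 × 1 = 1600 bits (200 bytes), while rows of ByteWidth = 8 bytes would need 8 × 40 = 320 bytes, so the two readings predict different uncompressed sizes.

```java
// Hypothetical helper for comparing expected data sizes.
final class BitmapSizes {

    /** Bits needed when pixels are packed with no row padding. */
    static long packedBits(int bitWidth, int bitHeight, int bitCount) {
        return (long) bitWidth * bitHeight * bitCount;
    }

    /** Bytes needed when every row occupies byteWidth bytes (assumed meaning of ByteWidth). */
    static long paddedBytes(int byteWidth, int bitHeight) {
        return (long) byteWidth * bitHeight;
    }
}
```

Comparing the length of the decoded BitData with both numbers is a cheap first check on whether a given sample is compressed.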

hlg