0

I have prepared a code which fetches bytes from the database field storing image as BLOB and inserting the same into an excel file using apache poi version 3.9

This code works fine and the images are being pulled and anchored to the column and row specified in most cases.

Here is the code :

Blob img = ads.getPhoto();
      byte[] imageByte = ads.getPhoto().getBytes(1, (int) img.length());
      if (imageByte.length > 10) {
      try {
       int picId = workbook.addPicture(imageByte, workbook.PICTURE_TYPE_JPEG);
       CreationHelper helper = workbook.getCreationHelper();
       Drawing drawing = sheet.createDrawingPatriarch();
       ClientAnchor anchor = helper.createClientAnchor();
       anchor.setCol1(2);
       anchor.setCol2(3);
       anchor.setRow1(rowId);
       anchor.setRow2(rowId + 1);
       Picture pict = drawing.createPicture(anchor, picId);
       System.out.println("TEST PRINT");
     } catch (Exception ex) {
        ex.printStackTrace();
     }

Now I have narrowed down to one image that is causing an issue. The blob looks ok to me. But when all the images along with this particular image is inserted all the images are removed and the excel opens with the error message: "File Error: Data May have been lost"

None of the images in the Drawing Patriarch are displayed in the excel. If I skip this particular row the images are displayed fine in the excel. Its just with this particular image.

Can any one help me out with a way to check if there is an error in one of the drawings and skip it if found so that rest of the images stay and only this image is removed.

Any suggestions with this would help a lot. Thanks in advance.

DhanushD
  • 36
  • 5
  • Try upgrading to the most recent version of Apache POI? – Gagravarr Jul 26 '16 at 08:41
  • Upgraded to poi-3.14. The result is the same. I believe the issue is not with the code, but with the image being inserted. Is there away to know such an issue in advance and avoid such images. – DhanushD Jul 26 '16 at 09:37
  • `Workbook.addPicture` will not check whether the byte array contains valid picture data. It simply writes the given bytes into the Excel file. Seems as if Excel then has problems while rendering your bytes as an PICTURE_TYPE_JPEG. Maybe you can check first: http://stackoverflow.com/questions/4550296/how-to-identify-contents-of-a-byte-is-a-jpeg – Axel Richter Jul 26 '16 at 10:59
  • I have verified the same before finally giving up and asking here. The Bytes begin with FFD8 and ends with FFD9. Very similar to the other images in the database. Moreover that image is being rendered by HTML but in the excel something goes wrong. – DhanushD Jul 26 '16 at 12:15
  • Further if I download a jpeg from the bytes and save it locally, The image can be viewed But on manually trying to insert it into excel using Insert>Picture from file excel gives an error. So I assume there should be some basic problem with the bytes of that particular file. – DhanushD Jul 26 '16 at 12:19
  • Maybe some kind of inappropriate steganography within the bytes which a browser or image viewer ignores but Excel does not? – Axel Richter Jul 26 '16 at 13:06
  • probably, but how can that be figured out? – DhanushD Jul 27 '16 at 04:27
  • Will uploading the bytes here help in any manner? – DhanushD Jul 28 '16 at 04:34
  • Has any one else faced this issue and has found a solution to this? Please advise... – DhanushD Oct 21 '16 at 10:12
  • Turns out Switching to XSSF instead of HSSF works fine. But it is freakishly slow... The same line of code that worked in 10 seconds, took me 20mins to run via XSSF... – DhanushD Oct 21 '16 at 12:31

1 Answers1

0

Switching to XSSF worked for this scenario. Corrupt image is displayed with an invalid image icon. And all others still populate properly.

But the performance of XSSF is very poor using poi.3.15..

DhanushD
  • 36
  • 5
  • Using SXSSFWorkbook will speed things up further due to the fact that it keeps streams to the file instead of keeping in memory. SXSSF (package: org.apache.poi.xssf.streaming) is an API-compatible streaming extension of XSSF to be used when very large spreadsheets have to be produced, and heap space is limited. SXSSF achieves its low memory footprint by limiting access to the rows that are within a sliding window, while XSSF gives access to all rows in the document. Older rows that are no longer in the window become inaccessible, as they are written to the disk. – DhanushD Aug 01 '19 at 10:24