0

We have a postgres-DB-dump where all kinds of files (png, jpg, pdf etc..) are stored as binary data in bytea format. We'd have to read them out and store them as a file (e.g. .jpg) on the server using grooy or java.

The problem is at the moment that the files, whichare are created, are corrupted (altough the same code worked in an earlier db-dump on another server). No valid files are created.

data in db-column with sql-query looks like this: \x1f8b0800000000000000ed5adb6edb38107ddfaf10f45a14be245d20852320698316089214a9fb2c50d2d8264c912e49b5f57ecffec9fed892badf68c9 ... 2048 bytes per row. some files spread over multiple rows..

//Get data from DB: binary data, mime_type, file_ID
    GroovyRowResult[] data = sql.rows("""
      SELECT lo.loid loid, lo.pageno pageno, lo.data d, nb.hash  hash, nb.size size, nb.mime_type mime_type,
      nb.data nb_loid, nb.file_extension ext FROM <tablename containing binary data> lo
      LEFT JOIN <tablename helptable metainfo> nb ON lo.loid = nb.data
      WHERE lo.loid = $loid
      """)

    def fileId = data[0].loid.toString()
    def mimeType = data[0].mime_type.toString()
    def temp = []

    //read out binary data into a byte array
    data.each { GroovyRowResult row ->
      temp << row.d
    }
    //read the byte array into an input stream
    InputStream inputStream = new ByteArrayInputStream((byte[]) temp.flatten(), 0, temp.flatten().size())

    def systemSettings = SystemSettings.getInstance(application)

    def newFileId = FileInBinaryFormat.detectTypeAndSave(<folderID on Server>, application.getRealPath(systemSettings.getDocumentPath()), fileId, mimeType, inputStream)

which calls the following method, after having identified the mime-type from metainformation in db: FileEntry represents an uploaded document on our server.

  static String saveAsImage(InputStream inputStream, String fileName, Folder folder, String path, Connection conn, String fileExtension) {

    BufferedImage bufferedImage = null
    try {
      bufferedImage = ImageIO.read(inputStream)
   } catch (IOException e){
      ...
    }

    try{ 
      def newFile = new File(path,"${UniqueID.createHexId()}"+fileExtension)
      ImageIO.write(bufferedImage, fileExtension, newFile)
        def fileEntry = FileEntry.createNewEntry(newFile, conn, folder, Customer.getCustomer(<CustomerNumber>),
        "", fileName + fileExtension)
      return fileEntry.id   
    } catch (IOException e) {
      ...
    } 
  }

Does anybody see some obvious mistakes causing the Problem or can help me in any way? Many thanks in Advance!

EDIT: Some more information:

When one logs "newFile", it shows a path to a file. However there is no file found on the server in the directory. so no file is created.

The log of the buffered image looks like this:

[Date:Time] BufferedImage@57c0c3cb: type = 5 ColorModel: #pixelBits = 24 numComponents = 3 color space = java.awt.color.ICC_ColorSpace@6bcd3bb4 transparency = 1 has alpha = false isAlphaPre = false ByteInterleavedRaster: width = 277 height = 368 #numDataElements 3 dataOff[0] = 2

The byte array is an array in the form: [-1, 125, 5, 6, ..., ] (signed 2 complement of hex values in DB column). So this seems to work.

I added "Order BY pageno" in the query in order to make sure the ordering of the rows is correct.

  • What does "corrupted" or "no valid files are created" mean? If you usually have to stitch together multiple rows, how is the proper order established? – cfrick May 03 '21 at 15:06
  • https://stackoverflow.com/questions/33107318/how-to-download-bytea-column-as-file-using-java ? – Matias Bjarland May 03 '21 at 15:16
  • @cfrick thanks for asking. there are pagenumbers for multiple rows. so the order is the ascending order of these pagenumbers. so i think this is not the problem. "corrupted" means there is kind of an "empty" file entry appearing in the file system on the server, it has a name and ending jpg., but an error message is standing there saying: "file is not on the server" and when trying to open it (its a remote server i acces over a gui over a browser) the browser finds no file under the specified URI. – FritzKuenkel May 03 '21 at 15:33
  • Is it "kind of empty" or is it empty? Have you checked the size of the file on the server? Have you checked the content with a tool, that can handle binary content? Do those catch blocks actually contain sane handlers, that would show you, that something went wrong? Also if you'd use `ORDER BY` in your query, then you no longer have to rely on the fact, that your DB order is just fine. – cfrick May 03 '21 at 15:47
  • i'd say its empty. theres no size of the file specified in the list (as opposed to files, where everything had worked fine). so either the server is not able to detect the file size or its in fact empty. the catch blocks only wrote an error message to the console, nothing more. the content are mainly profile pictures. the temp.flatten() array is the hex-numbers from the db correctly converted to decimal format in signed 2-complement nb's, so ff is -1 e.g. looks like [-1, 34, 19, -156, ...] . so the reading out of the data seems to work correctly. thanks for the hint concerning ORDER BY. – FritzKuenkel May 03 '21 at 16:00
  • If they wrote an error message, what was it? – cfrick May 03 '21 at 16:11
  • @cfrick there is no error thrown. the catch blocks are not reached. what i meant was they would log a message and do nothing more, IF reached, so nothing substantial happens there. – FritzKuenkel May 03 '21 at 16:22
  • Well that leaves the last point then: if it worked last time, what changed? – cfrick May 03 '21 at 16:32
  • what changed was that there is another instance of the database now. a more recent copy of all the data of the client was made in order to get the latest state of the data before going live. so therese a new data dump, but with an identical schema and database-software as before. and a new server instance was set up, where the files are to be stored. – FritzKuenkel May 03 '21 at 16:42

1 Answers1

0

I found the solution. The Problem was in the ImageIO.write(bufferedImage, fileExtension, newFile) secion. the variable 'fileExtension' included a dot : ".jpeg", ".png" etc. It need to be without a dot. So after changeing that and changeing the row above to: def newFile = new File(path,UniqueID.createHexId()+"."+fileExtension) did the job.