
I'm working on a Word file manipulator (DOCX format, to be specific), and it is working fine. At this phase, however, I have to take a file from SAP software. I receive the file as bytes that look something like 504B030414000600080000002100DFA4D26C5A0100002005000013000.

I am trying to use this code to read the received bytes, put them in an input stream, and open them with Apache POI:

byte[] byteArr = "504B030414000600080000002100DFA4D26C5A01000020050000130008025B436F6E74656E745F54797065735D2E786D6C20A2040228A0000200000000000000".getBytes();
InputStream fis = new ByteArrayInputStream(byteArr);
return new XWPFDocument(OPCPackage.open(fis));

The last line throws an error saying that the file isn't OOXML.

How do I transform the bytes I receive into something usable in Java?

Sandra Rossi

1 Answer


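getBytes encodes the characters of the String, so it returns the character codes of '5', '0', '4', 'B', and so on, not the bytes 0x50, 0x4B that those hex digits spell out. Because your string is hexadecimal, you have to decode it instead, for example with DatatypeConverter.parseHexBinary.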

This question has more information, and even more options to choose from:
Convert a string representation of a hex dump to a byte array using Java?
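In case DatatypeConverter is not available (it lives in javax.xml.bind, which is no longer bundled with the JDK from Java 11 onward), here is a minimal sketch of the same decoding using only java.lang. The class name HexDecode and the helper hexToBytes are made up for illustration. They are not part of POI or any library.

```java
import java.nio.charset.StandardCharsets;

public class HexDecode {

    /** Hypothetical helper: decodes a string of hex digits (two per byte) into raw bytes. */
    static byte[] hexToBytes(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("Hex string must have an even number of digits");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);     // high nibble, -1 if not a hex digit
            int lo = Character.digit(hex.charAt(2 * i + 1), 16); // low nibble, -1 if not a hex digit
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Not a hex digit near index " + (2 * i));
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        String hex = "504B0304";

        // getBytes: one byte per character of the text, so 8 bytes ('5', '0', '4', 'B', ...)
        byte[] asText = hex.getBytes(StandardCharsets.US_ASCII);

        // hexToBytes: one byte per pair of hex digits, so 4 bytes (0x50 0x4B 0x03 0x04)
        byte[] decoded = hexToBytes(hex);

        System.out.println(asText.length + " bytes via getBytes, " + decoded.length + " bytes decoded");
        System.out.println("First two decoded bytes as text: " + (char) decoded[0] + (char) decoded[1]);
    }
}
```

The decoded array is what should be wrapped in the ByteArrayInputStream from the question. The first two decoded bytes spell "PK", the start of the ZIP signature that every DOCX file begins with.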


Having said that, I have not been able to convert the hex string provided in your question into a valid document.

Running this code:

    try (final FileOutputStream fos = new FileOutputStream(new File("C:/", "Test Document.docx")))
    {
        final byte[] b = DatatypeConverter.parseHexBinary(
                "504B030414000600080000002100DFA4D26C5A01000020050000130008025B436F6E74656E745F54797065735D2E786D6C20A2040228A0000200000000000000");
        fos.write(b);
    }

... results in a file whose raw bytes include the entry name [Content_Types].xml.

The [Content_Types].xml in there is promising (if you open other valid documents with 7-Zip, you will see an entry with that name in the archive). However, I cannot open this file with MS Office, LibreOffice, or 7-Zip.

If I had to guess, I would say this particular file has become corrupted, or parts of it have gone missing.
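One way to test that guess is to read the ZIP local file header at the start of the data and compare the sizes it declares with the number of bytes actually received. The sketch below is only a diagnostic aid. The class name ZipHeaderCheck and its helpers are hypothetical, and the field offsets are those of the standard ZIP local file header (name length at offset 26, compressed size at offset 18, name starting at offset 30).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class ZipHeaderCheck {

    /** Hypothetical helper: decodes pairs of hex digits into raw bytes. */
    static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    /** True if the data starts with the ZIP local file header signature "PK\3\4". */
    static boolean startsWithZipSignature(byte[] data) {
        return data.length >= 4
                && data[0] == 0x50 && data[1] == 0x4B && data[2] == 0x03 && data[3] == 0x04;
    }

    /** Name of the first archive entry, or null if the fixed 30-byte header or the name is incomplete. */
    static String firstEntryName(byte[] data) {
        if (!startsWithZipSignature(data) || data.length < 30) {
            return null;
        }
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        int nameLength = buf.getShort(26) & 0xFFFF; // unsigned 16-bit file name length
        if (data.length < 30 + nameLength) {
            return null;
        }
        return new String(data, 30, nameLength, StandardCharsets.UTF_8);
    }

    /** Compressed size declared for the first entry, or -1 if the header is incomplete. */
    static long firstEntryCompressedSize(byte[] data) {
        if (!startsWithZipSignature(data) || data.length < 30) {
            return -1;
        }
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        return buf.getInt(18) & 0xFFFFFFFFL; // unsigned 32-bit compressed size
    }

    public static void main(String[] args) {
        // The hex string from the question.
        byte[] data = hexToBytes(
                "504B030414000600080000002100DFA4D26C5A01000020050000130008025B436F6E74656E745F54797065735D2E786D6C20A2040228A0000200000000000000");

        System.out.println("Bytes received:           " + data.length);
        System.out.println("First entry name:         " + firstEntryName(data));
        System.out.println("Declared compressed size: " + firstEntryCompressedSize(data));
    }
}
```

For the string in the question, the header names [Content_Types].xml and declares 346 bytes of compressed data for that entry alone, which is more than the whole string contains. That is consistent with the data having been cut off somewhere between SAP and the Java code, so it may be worth checking whether the full byte string is actually being transferred.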

JonathanDavidArndt
  • Thanks for your reply. Yes, I do suspect something is off with the file now, although I told the client to check and he says it is the same, and he even sent me the file itself. So I will keep looking into this data converter, since I did not know about it before. Thank you – Sherif El Nayad Aug 31 '21 at 17:44