
I am trying to solve a problem with reading and writing files in Java.

I receive the InputStream of the file I want to read (I don't have the actual file, just the InputStream object), and from this object I want to recreate the original file.

I tried reading the data byte by byte and writing it to another file.

But this only works for .txt files, not for .doc, .docx, or PDF files. I haven't checked other non-.txt formats, but I want it to work for all formats.

For non-.txt formats the files are generated, but their content is not visible.

So my question is: how can I recreate a file of any format using just an InputStream object?

Below is the sample code that I have been using to write the file.

    File file = new File("/home/sumit/Documents/newAbc.docx");
    File fileAbc = new File("/home/sumit/Documents/abc.docx");
    InputStream inputStream = new FileInputStream(fileAbc); // only for this sample snippet; originally I already have the InputStream object
    // if the file doesn't exist, then create it
    if (!file.exists()) {
        file.createNewFile();
    }

    FileWriter fw = new FileWriter(file.getAbsoluteFile());
    BufferedWriter bw = new BufferedWriter(fw);

    int data = inputStream.read();
    while(data != -1) {
      //do something with data...
      bw.write(data);       
      data = inputStream.read();

    }
    inputStream.close();
    bw.close();

The above code compiles and runs without errors, but the created file does not display its content properly.

When I searched, I found that I cannot directly read .doc, PDF, etc. files, from here: Reading .docx file in java

But in the POI library (http://poi.apache.org/) I did not find anything that tells me how to write to a file using an InputStream.

Does anyone know how the above problem can be solved?

Sumit
  • possible duplicate of [How read Doc or Docx file in java?](http://stackoverflow.com/questions/7102511/how-read-doc-or-docx-file-in-java) – Thirumalai Parthasarathi Oct 09 '13 at 08:09
  • How is it a duplicate of http://stackoverflow.com/questions/7102511/how-read-doc-or-docx-file-in-java? That one just reads the file and prints it to the console, but I want to write it to a file in the same format as the original, and not only for doc and docx but for other formats too. – Sumit Oct 09 '13 at 08:28
  • `*Writer` classes are for text contexts and do conversions according to some encoding. Thus, you want to work with a `FileOutputStream` instead of a `FileWriter`. Furthermore you will eventually want to use a `byte[]` buffer to read into and write from. – mkl Oct 09 '13 at 08:35
  • @mkl, yes... just now tried it both ways.. worked fine with File i/o streams & byte array – Balaji Krishnan Oct 09 '13 at 08:37
  • [**this**](http://poi.apache.org/apidocs/overview-summary.html) may be of help to you – Thirumalai Parthasarathi Oct 09 '13 at 08:39
  • @mkl: Make that an answer, please. – Martin Schröder Oct 09 '13 at 20:47
  • This question appears to be off-topic because the fundamental problem was misunderstood, and the PDF/document context means that few will ever find this when looking for info about byte-for-byte file copies. The OP lacked minimal understanding of the problem, and the question quality has suffered beyond repair. – Jason C Mar 05 '14 at 02:16

1 Answer


`*Writer` classes are for text contexts and do conversions according to some encoding (the standard platform encoding by default). Thus, to write binary file formats, you want to work with a `FileOutputStream` instead of a `FileWriter`.
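
To illustrate why a `Writer` damages binary data, here is a minimal sketch that pushes single byte values through a `Writer` the same way `bw.write(data)` does in the question. The class and method names are hypothetical, and UTF-8 is picked explicitly as an assumption so the result does not depend on the platform default that `FileWriter` would use.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original question or answer.
class WriterCorruptionDemo {

    // Sends one byte value through a Writer and returns the bytes
    // that actually arrive at the underlying output.
    static byte[] throughWriter(int byteValue) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Assumption: UTF-8; FileWriter would use the platform default encoding.
        try (Writer w = new OutputStreamWriter(sink, StandardCharsets.UTF_8)) {
            w.write(byteValue); // treated as a character, then encoded
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // 0x41 is ASCII 'A' and survives as a single byte.
        System.out.println(throughWriter(0x41).length); // prints 1
        // 0xE9 is treated as U+00E9 and is encoded as two bytes (0xC3 0xA9).
        System.out.println(throughWriter(0xE9).length); // prints 2
    }
}
```

Plain ASCII text passes through unchanged, which is consistent with the .txt files appearing to work, while binary formats contain byte values that get re-encoded.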

Furthermore, you will eventually want to use a `byte[]` buffer to read into and write from, rather than copying one byte at a time, for performance.
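
A minimal sketch of that approach could look like the following. The class name `StreamToFile`, the method `copy`, and the 8192-byte buffer size are illustrative assumptions, not something prescribed above.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper class for illustration.
class StreamToFile {

    // Copies every byte from the given stream into the target file
    // without any character conversion; returns the number of bytes written.
    // The caller remains responsible for closing the input stream.
    static long copy(InputStream in, File target) throws IOException {
        byte[] buffer = new byte[8192]; // assumed size; any reasonable size works
        long total = 0;
        // FileOutputStream creates the file if it does not exist yet.
        try (OutputStream out = new FileOutputStream(target)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n); // write only the bytes actually read
                total += n;
            }
        }
        return total;
    }
}
```

With the names from the question, this would be called as `StreamToFile.copy(inputStream, file)`. On Java 7 and later, `java.nio.file.Files.copy(inputStream, file.toPath(), StandardCopyOption.REPLACE_EXISTING)` should give the same byte-for-byte result without a hand-written loop.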

mkl