I want to read out a file in Android and get the content as a string. Then I want to send it to a server. But for testing I just create a file on the device and put the content into it:

InputStream stream = getContentResolver().openInputStream(fileUri);
BufferedReader reader = new BufferedReader(new InputStreamReader(stream));

File dir = new File (Environment.getExternalStorageDirectory() + "/Android/data/" + getPackageName());
if(!dir.exists())
    dir.mkdirs();
File file = new File(dir, "output."+format); // format is "txt", "png" or sth like that

if(!file.exists())
    file.createNewFile();

BufferedWriter writer = null;
writer = new BufferedWriter(new FileWriter(file));

String line = reader.readLine();

while (line != null)
{
    writer.write(line);
    line = reader.readLine();
    if(line != null)
        writer.write("\n");
}
writer.flush();
writer.close();
stream.close();

This works for txt files but when I for example try to copy a pdf file it is openable but just white.

Can anyone help me?

Thanks

ByteHamster
Damian Jäger
  • 1
    That's because a PDF doesn't simply store text like a .txt file does. The PDF format also stores a whole bunch of other info in the file, such as font type, size and spacing, among other things. – D. Visser May 12 '15 at 16:19
  • It still consists of symbols. Isn't a string able to hold every symbol? – Damian Jäger May 12 '15 at 16:23
  • 1
    Imagine reading the contents of an internet page. You wouldn't just read the text a regular user sees. Instead, you would read all kinds of tags specified in the HTML standard, like so: `<p>Hello, world</p>`. PDF is much like that, except that it also allows many other data types, such as Bézier curves, and it has official documentation that tells you how to construct and read a PDF file. – D. Visser May 12 '15 at 16:28
  • The code would read the source code if it were an internet page – Damian Jäger May 12 '15 at 16:30
  • Try changing the charset to UTF-8 – D. Visser May 12 '15 at 16:31
  • I tried it in the reader but couldn't figure out how to change it in the writer – Damian Jäger May 12 '15 at 16:34
  • Try using the [OutputStreamWriter](http://stackoverflow.com/a/6998929/2929693) – D. Visser May 12 '15 at 16:38
  • In case you need the PDF standard specifications, here is the [700+ page PDF](http://www.adobe.com/devnet/acrobat/pdfs/PDF32000_2008.pdf). – D. Visser May 12 '15 at 16:40

1 Answer

> I want to read out a file in Android and get the content as a string.

PDF files are not text files. They are binary files.
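
To illustrate the difference, here is a minimal sketch (the class and method names are hypothetical, and UTF-8 is assumed as the charset) that decodes bytes into a `String` and encodes them back, which is roughly what a `Reader`/`Writer` pair does. Plain text survives the round trip, while arbitrary binary bytes, such as the high-bit bytes often found near the start of a PDF, do not:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class TextRoundTripDemo {
    // Decode the bytes as UTF-8 text, then encode the text back to bytes.
    // Byte sequences that are not valid UTF-8 are replaced during decoding,
    // so the original bytes cannot be recovered afterwards.
    static byte[] roundTripThroughString(byte[] original) {
        String asText = new String(original, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] text = "Hello".getBytes(StandardCharsets.UTF_8);
        // Illustrative binary data: "%P" followed by four high-bit bytes.
        byte[] binary = {(byte) 0x25, (byte) 0x50, (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3};

        System.out.println(Arrays.equals(text, roundTripThroughString(text)));     // prints true
        System.out.println(Arrays.equals(binary, roundTripThroughString(binary))); // prints false
    }
}
```

The code in the question loses data in a second way as well: `readLine()` strips the line terminators, so any `\r` or `\r\n` bytes in the file are rewritten as `\n`.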

> Then I want to send it to a server

Your Android application has very limited heap space. It would be better not to read the whole file into memory, but rather to stream it in and send it to the server a chunk at a time.
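
One possible way to do this, sketched here with `HttpURLConnection` under some assumptions: the server accepts a raw `POST` body, the `application/octet-stream` content type is acceptable, and the class name, method name, and endpoint are hypothetical. With chunked streaming mode enabled, the request body is sent as it is written instead of being buffered in memory first:

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

class ChunkedUpload {
    // Streams everything from 'in' to 'endpoint' and returns the HTTP status code.
    // Only one buffer's worth of the file is held in memory at a time.
    static int upload(InputStream in, URL endpoint) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) endpoint.openConnection();
        try {
            conn.setDoOutput(true);
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Content-Type", "application/octet-stream");
            conn.setChunkedStreamingMode(0); // 0 selects the default chunk length

            byte[] buffer = new byte[8192];
            try (OutputStream out = conn.getOutputStream()) {
                int read;
                while ((read = in.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
            }
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    // Usage: java ChunkedUpload.java <file> <url>
    public static void main(String[] args) throws IOException {
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.println("HTTP status: " + upload(in, new URL(args[1])));
        }
    }
}
```

On Android, the input stream could come from `getContentResolver().openInputStream(fileUri)` as in the question, and the call has to run off the main thread.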

> This works for txt files but when I for example try to copy a pdf file it is openable but just white.

That is because you are trying to treat a PDF file as a text file. Do not do this. Copy it as a binary file.
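
A binary copy works on raw bytes and never goes through a `Reader`, `Writer`, or `String`. The following is a minimal sketch of that idea, not a definitive implementation; the class and method names are hypothetical, and the 8192-byte buffer is just one reasonable choice:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

class BinaryCopy {
    // Copies every byte from 'in' to 'out' unchanged and returns the byte count.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192]; // how many bytes to read at a time
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read); // write only the bytes actually read
            total += read;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // In-memory streams stand in for the real file streams here.
        byte[] data = {(byte) 0x25, (byte) 0x50, (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3};
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), out);
        System.out.println(copied + " bytes copied, identical: "
                + Arrays.equals(data, out.toByteArray()));
    }
}
```

In the question's code, `in` would be the stream from `getContentResolver().openInputStream(fileUri)` and `out` a `FileOutputStream` for the target `file`, replacing the `BufferedReader`/`BufferedWriter` pair; both streams should be closed afterwards.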

Community
CommonsWare
  • 1
    @DJDaJa: That just indicates how many bytes to read in at a time. Personally, I usually go bigger than the `1024` used in that example (e.g., `8192` for 8KB). – CommonsWare May 12 '15 at 18:00