Download a PDF using Java

Question

I'm writing a program to download a PDF file from a server. I'm using the code from "Download file by passing URL using java code". That solution works fine for the sample URL in the first answer, but not for my PDF; the only thing I changed is the URL. My code is below.

import java.io.*;
import java.net.*;

public class FileDownloadTest {
    final static int size = 1024;

    public static void fileUrl(String fAddress, String localFileName, String destinationDir) {

        // localFileName = "Hello World";
        OutputStream outStream = null;
        URLConnection uCon = null;

        InputStream is = null;
        try {
            URL url;
            byte[] buf;
            int byteRead, byteWritten = 0;
            url = new URL(fAddress);
            outStream = new BufferedOutputStream(new FileOutputStream(destinationDir + "\\" + localFileName));

            uCon = url.openConnection();
            is = uCon.getInputStream();
            buf = new byte[size];
            while ((byteRead = is.read(buf)) != -1) {
                outStream.write(buf, 0, byteRead);
                byteWritten += byteRead;
            }
            System.out.println("Downloaded Successfully.");
            System.out.println("File name:\"" + localFileName + "\"\nNo. of bytes: " + byteWritten);
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                // Either stream may still be null if opening it failed
                if (is != null) is.close();
                if (outStream != null) outStream.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    public static void fileDownload(String fAddress, String destinationDir) {
        int slashIndex = fAddress.lastIndexOf('/');
        int periodIndex = fAddress.lastIndexOf('.');

        String fileName = fAddress.substring(slashIndex + 1);

        if (periodIndex >= 1 && slashIndex >= 0 && slashIndex < fAddress.length() - 1) {
            fileUrl(fAddress, fileName, destinationDir);
        } else {
            System.err.println("Invalid path or file name.");
        }
    }

    public static void main(String[] args) {
        String fAddress = "http://singztechmusings.files.wordpress.com/2011/09/maven_eclipse_and_osgi_working_together.pdf";
        String destinationDir = "D:\\FileDownload";
        fileDownload(fAddress, destinationDir);

    }
}

The PDF has 73 pages, but the file saved to my folder is only 1 KB, and when I open it in Acrobat Reader, it says that the file might be corrupted.

I've also tried the code from https://dzone.com/articles/java-how-save-download-file, but the result is the same.

Please let me know how I can fix this.

Thanks

Madis Pärn · Accepted Answer · 2016-03-23T10:50:57.343

1

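If you check the content of the downloaded file, you can see that it is HTML. The server is redirecting the original request to the https URL, and Java's `HttpURLConnection` does not automatically follow a redirect from http to https, so the code saves the redirect response instead of the PDF. Use the URL https://singztechmusings.files.wordpress.com/2011/09/maven_eclipse_and_osgi_working_together.pdf instead.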
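One way to check the downloaded content programmatically is to look at the first bytes of the file: a PDF starts with the ASCII signature `%PDF-`, whereas an HTML redirect page does not. The following is only a minimal sketch; the class name `PdfSignatureCheck` and its methods are made up for illustration and are not part of the original code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not from the question's code.
public class PdfSignatureCheck {
    // A PDF file begins with the ASCII bytes "%PDF-".
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the given bytes start with the PDF signature. */
    public static boolean looksLikePdf(byte[] head) {
        if (head.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (head[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Reads the first bytes of a file and checks them for the PDF signature. */
    public static boolean looksLikePdf(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] head = new byte[PDF_MAGIC.length];
            int filled = 0;
            int read;
            // read() may return fewer bytes than requested, so loop until full or EOF
            while (filled < head.length
                    && (read = in.read(head, filled, head.length - filled)) != -1) {
                filled += read;
            }
            return filled == head.length && looksLikePdf(head);
        }
    }
}
```

Calling `PdfSignatureCheck.looksLikePdf(path)` on the 1 KB file from the question would be expected to return `false` if it really contains HTML, as described above.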

Alternatively, use an HTTP client with automatic redirect handling, such as Apache HttpComponents (HttpClient).
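If adding a library is not an option, the redirect can also be followed by hand with the JDK alone. The sketch below is one possible approach, not a definitive implementation: it turns off the built-in redirect handling, reads the `Location` header itself, and repeats the request until it gets a 200 response. The class name `RedirectingDownloader` and the limit `MAX_REDIRECTS` are assumptions made for this example, and it assumes the address is an http or https URL.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not from the question's code.
public class RedirectingDownloader {
    // Arbitrary limit chosen for this sketch to avoid redirect loops.
    private static final int MAX_REDIRECTS = 5;

    /** Returns true for the HTTP status codes that indicate a redirect. */
    public static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
                || status == 307 || status == 308;
    }

    /** Downloads the resource at address to target, following redirects manually. */
    public static void download(String address, Path target) throws IOException {
        URL url = new URL(address);
        for (int hop = 0; hop <= MAX_REDIRECTS; hop++) {
            // Assumes an http/https URL; other schemes would fail this cast.
            HttpURLConnection con = (HttpURLConnection) url.openConnection();
            con.setInstanceFollowRedirects(false); // handle redirects ourselves
            int status = con.getResponseCode();
            if (isRedirect(status)) {
                String location = con.getHeaderField("Location");
                con.disconnect();
                if (location == null) {
                    throw new IOException("Redirect without Location header from " + url);
                }
                url = new URL(url, location); // also resolves relative locations
                continue;
            }
            if (status != HttpURLConnection.HTTP_OK) {
                con.disconnect();
                throw new IOException("Unexpected HTTP status " + status + " for " + url);
            }
            try (InputStream in = con.getInputStream()) {
                Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            }
            return;
        }
        throw new IOException("Too many redirects for " + address);
    }
}
```

With the values from the question, a call might look like `RedirectingDownloader.download(fAddress, Paths.get(destinationDir, "maven_eclipse_and_osgi_working_together.pdf"))`.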

edited Mar 23 '16 at 10:50

answered Mar 23 '16 at 08:51

Madis Pärn


Hi Madis. Wohooooooooo, this worked great. A quick question: if I append `s` to `http` for other sites, will that work too? – user3872094 Mar 23 '16 at 08:59
No. The site must support the https protocol, and the content can also differ between http and https. A better solution would be to use an HTTP client that handles the redirects automatically, for example [http-commons](https://hc.apache.org/httpcomponents-client-4.5.x/tutorial/html/fundamentals.html#d5e334). – Madis Pärn Mar 23 '16 at 10:48

score 0 · Answer 2 · answered Mar 23 '16 at 09:01

0

You define a variable size = 1024 and use it to define your buffer, so logically you can only write 1 KB into it. If the input stream reads more at once, the rest will be lost. So change your buffer size to a value that can hold most documents, or try to determine the necessary size.

answered Mar 23 '16 at 09:01

ikarus


Thanks for the suggestion, http://stackoverflow.com/questions/36173363/download-a-pdf-using-java/#36173522 Worked for me – user3872094 Mar 23 '16 at 09:04
@ikarus, if you look at the code more closely, you'll see that the buffer is used in a copying loop. Thus, nothing is lost. A buffer size of 1024 may be considered small, though, and I'd recommend a higher value for better performance. – mkl Mar 23 '16 at 09:15
@mkl Oh, of course, thanks for the info. It's been a while since I wrote my last Java program, and I just forgot that arrays "know their size", so `read(buf)` knows when to stop. – ikarus Mar 23 '16 at 09:32
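
To illustrate mkl's point, here is a small in-memory sketch of the same copying loop. The class `CopyLoopDemo` and its method names are invented for this example; the loop itself mirrors the one in the question. Because `read(buf)` returns how many bytes it actually stored and the loop runs until it returns -1, the buffer size only affects how many iterations are needed, not how much data is copied.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class, not from the question's code.
public class CopyLoopDemo {
    /** Copies everything from in to out using a buffer of the given size. */
    public static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buf = new byte[bufferSize];
        long total = 0;
        int n;
        // read() fills at most buf.length bytes per call and returns -1 at end of stream
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n); // write only the bytes that were actually read
            total += n;
        }
        return total;
    }

    /** Copies data through the loop in memory and returns what arrived. */
    public static byte[] roundTrip(byte[] data, int bufferSize) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            copy(new ByteArrayInputStream(data), out, bufferSize);
        } catch (IOException e) {
            // Not expected for in-memory streams
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

Passing an array much larger than 1024 bytes to `roundTrip(data, 1024)` should return an identical copy, which is why the buffer size was not the cause of the 1 KB file.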
