
I have committed a 10 MB jar file to a Git repo and I would like to download it using Java. The file can be downloaded from GitHub directly in the browser, but when I try to curl it from the command line (as an experiment; I ultimately need to do this from a program), two things go wrong: 1. The download finishes too quickly, so I don't think I am getting the actual jar. 2. I think I am getting HTML. I know about "raw" for source files, but GitHub apparently offers no raw option for binaries.

In the program, what I was able to do with source files was read the download one line at a time and append those lines, eventually recreating the file. I am not sure how this will work with a large jar file.

EDIT: There is a "view raw" option which appends ?raw=true to the URL, but that is not working from Java, and when I try it from the command line it does not seem to work either: I still get HTML and the download finishes too fast.

EDIT: Here is the command-line curl of a jar file that is present on GitHub:

curl -u testuser https://github.com/test/test-api/blob/master/testjarfile.jar?raw=true

The above produces results, but I don't think they are what we need. Here is the Java code:

    // imports used: java.io.*, java.net.URL, java.net.URLConnection, java.util.Base64
    String username = "testuser";
    String password = "testpass";
    StringBuilder file = new StringBuilder(); // deliberately not thread-safe
    try {
        URL url = new URL("https://github.com/test/test-api/blob/master/testjarfile.jar?raw=true");
        URLConnection uc = url.openConnection();

        uc.setRequestProperty("X-Requested-With", "Curl");
        String userpass = username + ":" + password;
        // Base64 here is java.util.Base64 (Java 8+); no external codec library is needed
        String basicAuth = "Basic " + Base64.getEncoder().encodeToString(userpass.getBytes());
        uc.setRequestProperty("Authorization", basicAuth);

        // Reads the response as text, one line at a time
        BufferedReader reader = new BufferedReader(new InputStreamReader(uc.getInputStream()));
        String line = null;
        while ((line = reader.readLine()) != null)
            file.append(line + "\n");
        System.out.println(file);
    } catch (IOException e) {
        e.printStackTrace();
    }

And this Java code gives "file not found". Note that from the GitHub web page, the jar file does appear to download. Note also the "raw=true", which I would have guessed would cause the raw file to be downloaded rather than HTML.

releseabe
  • Please post your current code for context. For your purposes, you definitely do not want to read it line-by-line. I am assuming you are using `BufferedReader` to accomplish this. See https://www.baeldung.com/java-download-file. I doubt GitHub requires headers to download a release file, but you may want to do some research on opening a connection with headers. – Oliver May 26 '20 at 04:16
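To illustrate the point about not reading a jar line by line: a reader decodes bytes into characters and drops line terminators, so the bytes written back are generally not the bytes received. A minimal sketch of a byte-for-byte copy follows, assuming Java 7 or later. The class name `BinaryCopy` and the method `saveTo` are hypothetical, and the in-memory stream in `main` stands in for a real download.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class BinaryCopy {

    /**
     * Copies every byte of the stream to the target file without any
     * character decoding, and returns the number of bytes written.
     * (Hypothetical helper, not part of any library.)
     */
    public static long saveTo(InputStream in, Path target) throws IOException {
        try (InputStream body = in) {
            return Files.copy(body, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a download: a few bytes, including values that are
        // not valid text and a 0x0A byte that readLine() would swallow.
        byte[] fake = {0x50, 0x4B, 0x03, 0x04, (byte) 0xFF, 0x0A, 0x00};
        Path out = Files.createTempFile("sample", ".jar");
        long written = saveTo(new ByteArrayInputStream(fake), out);
        System.out.println(written + " bytes written to " + out);
    }
}
```

In a real program the `InputStream` would come from the connection (for example `uc.getInputStream()`) instead of a byte array.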

1 Answer


First try the curl command described in "How can I download a single raw file from a private github repo using the command line?", to check that the curl call itself works from the command line.

curl -H 'Authorization: token YOUR_TOKEN' \
  -H 'Accept: application/vnd.github.v4.raw' \
  -O \
  -L https://api.github.com/repos/INSERT_OWNER_HERE/INSERT_REPO_HERE/contents/PATH/TO/FILE

Then use a library such as rockswang/java-curl to translate that curl call into Java.
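As an alternative to a curl wrapper, the same request can probably be expressed with the JDK's own `HttpURLConnection`. The following is only a sketch under assumptions: the class and method names (`GitHubRawDownload`, `contentsUrl`, `open`, `download`) are hypothetical, the `GITHUB_TOKEN` environment variable is an assumed place to keep the token, and the owner, repo and file name are taken from the question's example URL. The `Accept` value is copied from the curl command above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class GitHubRawDownload {

    /** Builds the contents-API URL used by the curl command (hypothetical helper). */
    public static String contentsUrl(String owner, String repo, String path) {
        return "https://api.github.com/repos/" + owner + "/" + repo + "/contents/" + path;
    }

    /** Prepares the request with the same headers as the curl call; does not connect yet. */
    public static HttpURLConnection open(String url, String token) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        // Same Accept header as in the curl command: ask for the raw file body.
        conn.setRequestProperty("Accept", "application/vnd.github.v4.raw");
        // Public repos need no Authorization header, so the token is optional here.
        if (token != null && !token.isEmpty()) {
            conn.setRequestProperty("Authorization", "token " + token);
        }
        conn.setInstanceFollowRedirects(true); // rough counterpart of curl -L
        return conn;
    }

    /** Sends the request and streams the response bytes to the target file. */
    public static long download(String url, String token, Path target) throws IOException {
        HttpURLConnection conn = open(url, token);
        try {
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status " + status + " for " + url);
            }
            try (InputStream in = conn.getInputStream()) {
                return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            }
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Owner, repo and path come from the question's example; adjust as needed.
        String url = contentsUrl("test", "test-api", "testjarfile.jar");
        long bytes = download(url, System.getenv("GITHUB_TOKEN"), Paths.get("testjarfile.jar"));
        System.out.println("Downloaded " + bytes + " bytes");
    }
}
```

Checking the status code before reading the body makes an authentication problem show up as an explicit HTTP status rather than as a bare "file not found" exception.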

VonC
  • what if no token, only username and password? – releseabe May 26 '20 at 06:06
  • @releseabe First, if the repo is public, you don't need an Authorization header at all. Second, if not, you can generate an OAuth token yourself: https://gist.github.com/joyrexus/85bf6b02979d8a7b0308#oauth – VonC May 26 '20 at 06:08
  • So the password, even if you have one, is not used? – releseabe May 26 '20 at 06:15
  • @releseabe no, not for authorization header – VonC May 26 '20 at 06:38
  • Is api.github.com always correct? I am getting a 404 when I try to download the jar. However, the URL under which the file is found on GitHub does not mention api.github.com, just github.com. – releseabe May 27 '20 at 12:41
  • @releseabe From https://developer.github.com/v3/, yes `api.github.com` is the one to use. Is it a private repo? – VonC May 27 '20 at 18:09
  • @releseabe OK: 404 is the standard response when authentication fails for a private resource. – VonC May 28 '20 at 05:05
  • @releseabe It is a security best practice, to avoid confirming the existence of a user when bad credentials are provided. – VonC May 28 '20 at 05:10