46

I am trying to read some words from an online text file.

I tried doing something like this:

File file = new File("http://www.puzzlers.org/pub/wordlists/pocket.txt");
Scanner scan = new Scanner(file);

but it didn't work. I am getting

http://www.puzzlers.org/pub/wordlists/pocket.txt 

as the output, and I just want to get all the words.

I know they taught me this back in the day, but I don't remember exactly how to do it now. Any help is greatly appreciated.

Matthias Braun
randomizertech
  • Possible duplicate of [How do you Programmatically Download a Webpage in Java](https://stackoverflow.com/questions/238547/how-do-you-programmatically-download-a-webpage-in-java) – Robin Green Nov 17 '18 at 07:35

9 Answers

69

Use a URL instead of a File for any access that is not on your local computer.

URL url = new URL("http://www.puzzlers.org/pub/wordlists/pocket.txt");
Scanner s = new Scanner(url.openStream());

Actually, URL is even more generally useful: it also works for local access (use a file: URL), for jar files, and for just about everything else that can be retrieved somehow.
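
To illustrate the point about file: URLs, here is a minimal sketch that reads words through a URL pointing at a local file. The class name UrlWords, the helper readWords, the temporary file, and the choice of UTF-8 are assumptions made for this example, not part of the answer. The same helper should work unchanged with an http: URL.

```java
import java.io.IOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class UrlWords {
    // Reads whitespace-separated words from any URL the JVM can open
    // (http:, file:, jar:, ...). UTF-8 is an assumption of this sketch.
    static List<String> readWords(URL url) throws IOException {
        List<String> words = new ArrayList<>();
        // Closing the Scanner also closes the underlying stream.
        try (Scanner s = new Scanner(url.openStream(), "UTF-8")) {
            while (s.hasNext()) {
                words.add(s.next());
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical local file, created only so the example runs offline.
        Path tmp = Files.createTempFile("words", ".txt");
        Files.write(tmp, "alpha beta\ngamma\n".getBytes(StandardCharsets.UTF_8));

        URL fileUrl = tmp.toUri().toURL(); // a file: URL
        System.out.println(readWords(fileUrl));

        Files.delete(tmp);
    }
}
```

Passing new URL("http://www.puzzlers.org/pub/wordlists/pocket.txt") to readWords instead would read the remote word list, provided the server is reachable.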

The code above interprets the file in your platform's default encoding. If you want to use the encoding indicated by the server instead, you have to use a URLConnection and parse its content type, as indicated in the answers to this question.
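
One possible way to honor the server's declared encoding is sketched below. The names CharsetAwareReader, charsetOf and read are hypothetical, and falling back to UTF-8 when the Content-Type header carries no usable charset parameter is a choice made here, not something the answer prescribes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class CharsetAwareReader {
    // Extracts the charset parameter from a Content-Type value such as
    // "text/plain; charset=ISO-8859-1". Falls back to UTF-8 (an assumption).
    static Charset charsetOf(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String name = p.substring("charset=".length()).trim().replace("\"", "");
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        break; // unknown or malformed charset name: use the fallback
                    }
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    // Reads the whole resource as text, decoding with the declared charset.
    static String read(URL url) throws IOException {
        URLConnection conn = url.openConnection();
        Charset cs = charsetOf(conn.getContentType());
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), cs))) {
            String line;
            while ((line = in.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Header value made up for illustration.
        System.out.println(charsetOf("text/plain; charset=ISO-8859-1"));
    }
}
```

Splitting on ";" is a simplification; a full parser for the header would also have to handle quoted parameters that themselves contain semicolons.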


About your error: make sure your file compiles without any errors; you need to handle the exceptions. Click the red messages shown by your IDE; it should recommend how to fix them. Do not start a program that does not compile (even if the IDE allows this).

Here it is with some sample exception handling:

try {
   URL url = new URL("http://www.puzzlers.org/pub/wordlists/pocket.txt");
   // try-with-resources closes the scanner (and the underlying stream) when done
   try (Scanner s = new Scanner(url.openStream())) {
      // read from your scanner
   }
}
catch(IOException ex) {
   // there was some connection problem, or the file did not exist on the server,
   // or your URL was not in the right format.
   // think about what to do now, and put it here.
   ex.printStackTrace(); // for now, simply output it.
}
Community
Paŭlo Ebermann
  • I'm getting an error though: Exception in thread "main" java.lang.Error: Unresolved compilation problems: Unhandled exception type MalformedURLException, Unhandled exception type IOException – randomizertech Jun 07 '11 at 00:05
  • Wrap it in a try/catch block and catch those 2 exceptions. – Sean Jun 07 '11 at 00:10
  • I'm sorry but I got lost, shouldn't this be easy and able to be done in 2 or 3 lines of code? – randomizertech Jun 07 '11 at 00:39
  • I tried this method but got: _java.io.IOException: Server returned HTTP response code: 403 for URL:..._ Any ideas? – theexplorer Nov 30 '14 at 07:35
  • @theexplorer See https://en.wikipedia.org/wiki/HTTP_403, for example. It looks like your server is configured not to allow this file to be downloaded. – Paŭlo Ebermann Nov 30 '14 at 11:04
  • I understand, thank you. Is it smart to ask the hosting provider to turn this security switch off? – theexplorer Nov 30 '14 at 13:50
13

Try something like this:

 URL u = new URL("http://www.puzzlers.org/pub/wordlists/pocket.txt");
 InputStream in = u.openStream();

Then use it like any plain old input stream.
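
As a rough sketch of what using it as a plain input stream can look like, the helper below drains any InputStream into a String with a buffered read loop. The class name StreamToString, the helper readAll, and the explicit charset parameter are assumptions for the example. A ByteArrayInputStream stands in for u.openStream() so the snippet runs without a network connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class StreamToString {
    // Reads the stream to its end in 8 KiB chunks and decodes the bytes.
    // The caller remains responsible for closing the stream.
    static String readAll(InputStream in, Charset cs) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        // read() returns the number of bytes read, or -1 at end of stream
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), cs);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for: InputStream in = u.openStream();
        try (InputStream in = new ByteArrayInputStream(
                "alpha\nbeta\n".getBytes(StandardCharsets.UTF_8))) {
            System.out.print(readAll(in, StandardCharsets.UTF_8));
        }
    }
}
```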

hhafez
9

What really worked for me (source: Oracle documentation, "reading url"):

import java.net.*;
import java.io.*;

public class UrlTextfile {
    public static void main(String[] args) throws Exception {
        URL oracle = new URL("http://yoursite.com/yourfile.txt");

        // try-with-resources closes the reader even if reading fails
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(oracle.openStream()))) {
            String inputLine;
            while ((inputLine = in.readLine()) != null) {
                System.out.println(inputLine);
            }
        }
    }
}
chris
6

Using Apache Commons IO:

import org.apache.commons.io.IOUtils;

import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public static String readURLToString(String url) throws IOException
{
    try (InputStream inputStream = new URL(url).openStream())
    {
        return IOUtils.toString(inputStream, StandardCharsets.UTF_8);
    }
}
BullyWiiPlaza
Jawad
3

Use this code to read an Internet resource into a String:

public static String readToString(String targetURL) throws IOException
{
    URL url = new URL(targetURL);
    StringBuilder stringBuilder = new StringBuilder();

    // try-with-resources closes the reader even if readLine() throws
    try (BufferedReader bufferedReader = new BufferedReader(
            new InputStreamReader(url.openStream())))
    {
        String inputLine;
        while ((inputLine = bufferedReader.readLine()) != null)
        {
            stringBuilder.append(inputLine);
            stringBuilder.append(System.lineSeparator());
        }
    }

    return stringBuilder.toString().trim();
}

This is based on here.

BullyWiiPlaza
2

I did it the following way for an image; you should be able to do it for text using similar steps.

// folder & name of image on PC
File fileObj = new File("C:\\Displayable\\imgcopy.jpg");

// createNewFile() returns false if the file already exists
boolean created = fileObj.createNewFile();
System.out.println("Created new file: " + created);

// image on server
URL url = new URL("http://localhost:8181/POPTEST2/imgone.jpg");

// try-with-resources closes both streams even if the copy fails
try (InputStream webIS = url.openStream();
     FileOutputStream fo = new FileOutputStream(fileObj)) {
    int c;
    // read() returns the next byte, or -1 at end of stream
    while ((c = webIS.read()) != -1) {
        fo.write(c);
    }
}
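
The same copy can probably be written more compactly with java.nio.file.Files.copy, which is part of the standard library since Java 7. This is a sketch rather than the answer's own method: the class name UrlDownload and the helper download are hypothetical, and a temporary local file stands in for the image on the server so the example runs offline.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class UrlDownload {
    // Copies everything readable from the URL into the target file,
    // overwriting it if it exists. Returns the number of bytes copied.
    static long download(URL source, Path target) throws IOException {
        try (InputStream in = source.openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical source file standing in for a remote resource.
        Path source = Files.createTempFile("source", ".bin");
        Files.write(source, new byte[] {1, 2, 3, 4});
        Path target = Files.createTempFile("copy", ".bin");

        long copied = download(source.toUri().toURL(), target);
        System.out.println("Copied " + copied + " bytes to " + target);

        Files.delete(source);
        Files.delete(target);
    }
}
```

Because the bytes are copied without decoding, this works equally for images and text files.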
cristid9
Alok D
2

For an old-school input stream, use this code:

  InputStream in = new URL("http://google.com/").openConnection().getInputStream();
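
One reason to go through openConnection() rather than openStream() is that the URLConnection can be configured before the stream is opened. The sketch below sets connect and read timeouts; the class name TimedOpen, the helper open, and the timeout values are assumptions chosen for illustration. A temporary local file is used so the example runs offline, where the timeouts have no practical effect.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class TimedOpen {
    // Opens a stream with explicit timeouts (in milliseconds) so that a slow
    // or unreachable server does not block indefinitely.
    static InputStream open(URL url, int connectMillis, int readMillis) throws IOException {
        URLConnection conn = url.openConnection();
        conn.setConnectTimeout(connectMillis);
        conn.setReadTimeout(readMillis);
        return conn.getInputStream();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical local file standing in for a remote resource.
        Path tmp = Files.createTempFile("timed", ".txt");
        Files.write(tmp, "hello".getBytes(StandardCharsets.UTF_8));

        // 5 s to connect, 10 s per read: illustrative values only.
        try (InputStream in = open(tmp.toUri().toURL(), 5000, 10000)) {
            System.out.println("First byte: " + in.read());
        }
        Files.delete(tmp);
    }
}
```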
hhafez
Bohemian
0

Alternatively, you can use Guava's Resources class:

URL url = new URL("http://www.puzzlers.org/pub/wordlists/pocket.txt");
List<String> lines = Resources.readLines(url, Charsets.UTF_8);
lines.forEach(System.out::println);
Matthias Braun
0

The corrected method is deprecated now. It gives the option private WeakReference<MyActivity> activityReference; instead. The solution here will be useful.