12

I'm new to Android and I want to get the whole text from a web page to a string. I found a lot of questions like this but as I said I'm new to Android and I don't know how to use them in my app. I'm getting errors. Only one method I managed to get it to work, it uses WebView, and JavaScript and it is slow as hell. Can someone please tell me some other way to do this or how to speed up the WebView since I don't use it at all for viewing content. By the way, I've added the following code to speed up WebView

webView.getSettings().setJavaScriptEnabled(true); 
    webView.getSettings().setBlockNetworkImage(true);
    webView.getSettings().setJavaScriptCanOpenWindowsAutomatically(false);
    webView.getSettings().setPluginsEnabled(false);
    webView.getSettings().setSupportMultipleWindows(false);
    webView.getSettings().setSupportZoom(false);
    webView.getSettings().setSavePassword(false);
    webView.setVerticalScrollBarEnabled(false);
    webView.setHorizontalScrollBarEnabled(false);
    webView.getSettings().setAppCacheEnabled(false);
    webView.getSettings().setCacheMode(WebSettings.LOAD_NO_CACHE);

And please if you know other better and faster solution than using WebView please give me the whole source code of main activity or explain where I'm supposed to write it so I don't get errors.

Bhargav Rao
  • 50,140
  • 28
  • 121
  • 140
null
  • 757
  • 1
  • 8
  • 23
  • If you want just the text out of a particular html element, you could take a look at [JSoup](http://jsoup.org/). – A--C Jan 19 '13 at 20:24

3 Answers3

28

Use This:

public class ReadWebpageAsyncTask extends Activity {
    private TextView textView;

    /** Called when the activity is first created. */
    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        textView = (TextView) findViewById(R.id.TextView01);
    }

    private class DownloadWebPageTask extends AsyncTask<String, Void, String> {
        @Override
        protected String doInBackground(String... urls) {
            String response = "";
            for (String url : urls) {
                DefaultHttpClient client = new DefaultHttpClient();
                HttpGet httpGet = new HttpGet(url);
                try {
                    HttpResponse execute = client.execute(httpGet);
                    InputStream content = execute.getEntity().getContent();

                    BufferedReader buffer = new BufferedReader(
                            new InputStreamReader(content));
                    String s = "";
                    while ((s = buffer.readLine()) != null) {
                        response += s;
                    }

                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
            return response;
        }

        @Override
        protected void onPostExecute(String result) {
            textView.setText(Html.fromHtml(result));
        }
    }

    public void readWebpage(View view) {
        DownloadWebPageTask task = new DownloadWebPageTask();
        task.execute(new String[] { "http://www.google.com" });

    }
}

main.xml

<?xml version="1.0" encoding="utf-8"?>
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:orientation="vertical"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    >

    <Button android:layout_height="wrap_content" android:layout_width="match_parent" android:id="@+id/readWebpage" android:onClick="readWebpage" android:text="Load Webpage"></Button>
    <TextView android:id="@+id/TextView01" android:layout_width="match_parent" android:layout_height="match_parent" android:text="Example Text"></TextView>
</LinearLayout>
K_Anas
  • 31,226
  • 9
  • 68
  • 81
8

This is the code I generally use to download a string from the internet

class RequestTask extends AsyncTask<String, String, String>{

@Override
// username, password, message, mobile
protected String doInBackground(String... url) {
    // constants
    int timeoutSocket = 5000;
    int timeoutConnection = 5000;

    HttpParams httpParameters = new BasicHttpParams();
    HttpConnectionParams.setConnectionTimeout(httpParameters, timeoutConnection);
    HttpConnectionParams.setSoTimeout(httpParameters, timeoutSocket);
    HttpClient client = new DefaultHttpClient(httpParameters);

    HttpGet httpget = new HttpGet(url[0]);

    try {
        HttpResponse getResponse = client.execute(httpget);
        final int statusCode = getResponse.getStatusLine().getStatusCode();

        if(statusCode != HttpStatus.SC_OK) {
            Log.w("MyApp", "Download Error: " + statusCode + "| for URL: " + url);
            return null;
        }

        String line = "";
        StringBuilder total = new StringBuilder();

        HttpEntity getResponseEntity = getResponse.getEntity();

        BufferedReader reader = new BufferedReader(new InputStreamReader(getResponseEntity.getContent()));  

        while((line = reader.readLine()) != null) {
            total.append(line);
        }

        line = total.toString();
        return line;
    } catch (Exception e) {
        Log.w("MyApp", "Download Exception : " + e.toString());
    }
    return null;
}

@Override
protected void onPostExecute(String result) {
    // do something with result
}
}

And you can run the task with

new RequestTask().execute("http://www.your-get-url.com/");

Ahmed Aeon Axan
  • 2,139
  • 1
  • 17
  • 30
  • As I understand I should put new RequestTask() in onCreate() and the other code outside of onCreate() right? if that's correct, I am getting error on TAG - TAG cannot be resolved to a variable. how to fix this? and string line is the whole text from page right? – null Jan 19 '13 at 20:02
  • yes. string line is the whole response as a string. and TAG is just a string variable i had defined earlier. replace it with any string of your chocie. I have modified my answer. And yes, you could put the new RequestTask() anywhere when you want to start the download. – Ahmed Aeon Axan Jan 20 '13 at 18:19
3

Seeing as you aren't interested in Viewing the content at all, try using the following:

In order to get your source code from an URL you can use this :

HttpClient httpclient = new DefaultHttpClient(); // Create HTTP Client
HttpGet httpget = new HttpGet("http://yoururl.com"); // Set the action you want to do
HttpResponse response = httpclient.execute(httpget); // Executeit
HttpEntity entity = response.getEntity(); 
InputStream is = entity.getContent(); // Create an InputStream with the response
BufferedReader reader = new BufferedReader(new InputStreamReader(is, "iso-8859-1"), 8);
StringBuilder sb = new StringBuilder();
String line = null;
while ((line = reader.readLine()) != null) // Read line by line
    sb.append(line + "\n");

String resString = sb.toString(); // Result is here

is.close(); // Close the stream

Make sure you run this off the main UI thread in an AsyncTask or in a Thread.

Raghav Sood
  • 81,899
  • 22
  • 187
  • 195