I have been working on an app that displays news headlines and content from the site http://www.myagdikali.com

I am able to extract the data from 'myagdikali.com/category/news/national-news/', but that page contains only 10 posts. The rest are on numbered pages (1, 2, 3, ...) such as myagdikali.com/category/news/national-news/page/2.

What I need to know is: how do I extract the news from every page under /national-news? Is that even possible using Jsoup?

So far, my code to extract data from a single page is:

public View onCreateView(LayoutInflater inflater, ViewGroup container,
                         Bundle savedInstanceState) {
    View rootView = inflater.inflate(R.layout.fragment_all, container, false);
    int i = getArguments().getInt(NEWS);
    String topics = getResources().getStringArray(R.array.topics)[i];

    switch (i) {
        case 0:
            url = "http://myagdikali.com/category/news/national-news";
            new NewsExtractor().execute();

            break;
            .....


[EDIT]
private class NewsExtractor extends AsyncTask<Void, Void, Void> {
    String title;

    @Override
    protected Void doInBackground(Void... params) {
        while (status == OK) {
            currentURL = url + String.valueOf(page);
            try {
                response = Jsoup.connect(currentURL).execute();
                status = response.statusCode();
                if (status == OK) {
                    Document doc = response.parse();
                    Elements urlLists = doc.select("a[rel=bookmark]");
                    for (org.jsoup.nodes.Element urlList : urlLists) {
                        String src = urlList.text();
                        myLinks.add(src);
                    }
                    title = doc.title();
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
            page++;
        }
        return null;
    }

EDIT: When I extract data from a single page without the loop, it works. Once I add the while loop, I get an error stating "No adapter attached".

I am loading the extracted data into a RecyclerView, and my onPostExecute looks like this:

    @Override
    protected void onPostExecute(Void aVoid) {
        layoutManager = new LinearLayoutManager(getActivity());
        recyclerView.setLayoutManager(layoutManager);

        myRecyclerViewAdapter = new MyRecyclerViewAdapter(getActivity(), myLinks);
        recyclerView.setAdapter(myRecyclerViewAdapter);
    }
Roshan Gautam
  • Take a look here - http://stackoverflow.com/questions/28510578/no-adapter-attached-skipping-layout – TDG Jul 18 '15 at 13:00

1 Answer

Since you know the URL pattern of the pages you need, http://myagdikali.com/category/news/national-news/page/X (where X is the page number, between 2 and 446), you can loop through the URLs. You'll also need to check Jsoup's response to make sure each page exists, because the upper bound of 446 is not fixed (I believe it increases over time).
The code should look something like this:

import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

final String URL = "http://myagdikali.com/category/news/national-news/page/";
final int OK = 200;
String currentURL;
int page = 2;
int status = OK;
Connection.Response response = null;
Document doc = null;

while (status == OK) {
    currentURL = URL + page;  // add the page number to the url
    response = Jsoup.connect(currentURL)
            .userAgent("Mozilla/5.0")
            // by default execute() throws HttpStatusException on a non-200
            // response; this makes it return the response so the status
            // check below can end the loop
            .ignoreHttpErrors(true)
            .execute();  // you may add a timeout etc. here
    status = response.statusCode();
    if (status == OK) {
        doc = response.parse();
        // extract the info you need
    }
    page++;
}

This is of course not fully working code: you'll have to add try-catch blocks, but the compiler will help you with that. Hope this helps.
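One way to make the loop above easier to reason about is to separate "which pages to request and when to stop" from "how a page is fetched". The following is only a minimal sketch under assumptions: `PageCrawler`, `Page`, `collectHeadlines` and the `maxPages` cap are hypothetical names that do not come from Jsoup or Android, and the fetcher is passed in as a function so the stopping logic can be tried without a network connection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

// Hypothetical helper; none of these names come from Jsoup or Android.
class PageCrawler {
    static final int OK = 200;

    // Minimal stand-in for one fetched page: HTTP status plus extracted headlines.
    static final class Page {
        final int status;
        final List<String> headlines;

        Page(int status, List<String> headlines) {
            this.status = status;
            this.headlines = headlines;
        }
    }

    // Requests pages firstPage, firstPage + 1, ... until a fetch throws,
    // a page comes back with a non-200 status, or maxPages pages were requested.
    static List<String> collectHeadlines(IntFunction<Page> fetcher, int firstPage, int maxPages) {
        List<String> all = new ArrayList<>();
        for (int page = firstPage; page < firstPage + maxPages; page++) {
            Page result;
            try {
                result = fetcher.apply(page);
            } catch (RuntimeException e) {
                break; // stop on a failed fetch instead of looping forever
            }
            if (result == null || result.status != OK) {
                break; // the first missing page is treated as the end of the archive
            }
            all.addAll(result.headlines);
        }
        return all;
    }
}
```

With Jsoup, the fetcher could wrap the `Jsoup.connect(...).execute()` call and turn the elements matched by `doc.select("a[rel=bookmark]")` into the headline list. The `maxPages` cap is a design choice of this sketch: it guarantees the loop terminates even if the status never changes.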

EDIT:
1. I've edited the code: I had to send a userAgent string in order to get a response from the server.
2. The code runs on my machine. It prints lots of ????, because I don't have the proper fonts installed.
3. The error you're getting comes from the Android part, something to do with your views. You haven't posted that piece of code...
4. Try adding the userAgent; it might solve it.
5. Please add the error and the code you're running to the original question by editing it; it's much more readable that way.

TDG
  • It's still not working. Now I am getting this error. 07-18 17:03:10.967 4708-4708/np.info.roshan.benionline_webportal E/RecyclerView﹕ No adapter attached; skipping layout – Roshan Gautam Jul 18 '15 at 11:22
  • Let's try to pinpoint the problem: after the line `String src = urlList.text()` add the line `Log.d("NEWS", src);`. This will (hopefully) print the lines you're extracting to the `LogCat`. If you can see those lines in your log, then my answer solved your original question and your new problem is the `RecyclerView`. Try searching for this error on Google; I'm sure you'll find a solution. If the lines are not printed in the log, we'll have to keep going and fix it... – TDG Jul 18 '15 at 12:20
  • Okay wait!! I am getting more errors :D I m gonna try this. – Roshan Gautam Jul 18 '15 at 12:30
  • Yes Sir!! @TDG .. I can see the data in LogCat, so the problem is in the RecyclerView. But with the same code I am able to load data from a single page (without the loop) into the RecyclerView right now. – Roshan Gautam Jul 18 '15 at 12:39
  • Then I think that I've solved your **original** question, and you should open a new question with the new problem. – TDG Jul 18 '15 at 12:44