
I know that this question has been asked for C# and possibly other languages, but I haven't found one for Objective-C (Xcode).

The C# question can be found here: C# Version.

I'm looking to take any URL (NSURL or NSString) and extract the following from that webpage's contents:

1) The Title of that webpage (New article title)

2) The Image for that article (First major image)

3) The Article text itself (Pure text, no ads)

Those are the 3 major things. I'd also like to have the:

1) Author

2) Date Updated

3) Website that posted the article

but those are not as important.

The way I have my code set up to parse the actual article (which doesn't do the job I want exactly) is:

- (void)viewDidLoad {
    [super viewDidLoad];
    NSURLRequest *request = [NSURLRequest requestWithURL:finalUrl];
    [self.webview loadRequest:request];
}

- (void)webViewDidFinishLoad:(UIWebView *)webView {
    NSString *fullArticle = [self.webview stringByEvaluatingJavaScriptFromString:@"document.body.innerText"];
    self.story.text = fullArticle;
    NSLog(@"Article:  %@",fullArticle);
}

finalUrl is an NSURL variable.

The NSLog shows all the text from the body of the webpage, but it includes a lot of extra "garbage" that I don't want. It also doesn't give back the title, images, or anything else that I wanted.

So how can this be done in Objective-C? I know that Pocket does it very well in their app.

    You'll have to parse the content, and look for the items/stuff you need. Maybe you should start looking for "how to parse html", or look for ready to use libs. – d4Rk Mar 27 '15 at 18:05
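As a rough illustration of the "parse the content" suggestion above, here is a minimal Foundation-only sketch that pulls the page title and a lead image URL out of raw HTML. The helper names (FirstMatch, ArticleTitle, ArticleImageURL) are hypothetical, and the image lookup assumes the page carries an Open Graph og:image meta tag with property written before content. Regular expressions are brittle against real-world HTML, so a proper HTML parsing library is likely the more robust choice.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper for illustration; not part of any library.
// Returns the first capture group of `pattern` in `html`, trimmed, or nil if nothing matches.
static NSString *FirstMatch(NSString *pattern, NSString *html) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive |
                                                          NSRegularExpressionDotMatchesLineSeparators
                                                    error:NULL];
    NSTextCheckingResult *match =
        [regex firstMatchInString:html options:0 range:NSMakeRange(0, html.length)];
    if (match == nil || [match rangeAtIndex:1].location == NSNotFound) {
        return nil;
    }
    NSString *value = [html substringWithRange:[match rangeAtIndex:1]];
    return [value stringByTrimmingCharactersInSet:
                      [NSCharacterSet whitespaceAndNewlineCharacterSet]];
}

// Text inside the <title> element.
static NSString *ArticleTitle(NSString *html) {
    return FirstMatch(@"<title[^>]*>(.*?)</title>", html);
}

// Value of the Open Graph image tag (assumes `property` appears before `content`).
static NSString *ArticleImageURL(NSString *html) {
    return FirstMatch(@"<meta[^>]+property=[\"']og:image[\"'][^>]*content=[\"']([^\"']+)[\"']",
                      html);
}
```

The HTML string could come from the web view (for example by evaluating document.documentElement.outerHTML) or from a direct download of the URL. Stripping ads from the article text itself is a harder problem that this sketch does not attempt.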

1 Answer


Readability.com has an incredible API that allows for this. It's called the Parser API.

Steps:

  1. Create an account on Readability.

  2. Go to the Developer section.

  3. Generate tokens.

  4. Call the Readability API URL with your token and the URL you would like to parse.

It will return an HTML page filled with everything you want.
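Step 4 might look something like the sketch below, which builds the request URL with NSURLComponents so that the article URL is percent-encoded correctly. The endpoint string is a placeholder and the query parameter names (url, token) are assumptions; check both against the Readability developer documentation. ParserRequestURL is a hypothetical helper name.

```objc
#import <Foundation/Foundation.h>

// Placeholder endpoint (assumption): substitute the Parser API URL from the developer docs.
static NSString *const kParserEndpoint = @"https://example.com/api/content/v1/parser";

// Hypothetical helper: builds the request URL, passing the article URL and the
// API token as query items. The parameter names "url" and "token" are assumed.
static NSURL *ParserRequestURL(NSString *articleURL, NSString *token) {
    NSURLComponents *components = [NSURLComponents componentsWithString:kParserEndpoint];
    components.queryItems = @[
        [NSURLQueryItem queryItemWithName:@"url" value:articleURL],
        [NSURLQueryItem queryItemWithName:@"token" value:token],
    ];
    return components.URL;
}
```

The resulting NSURL can then be handed to an NSURLSession data task, or to the same NSURLRequest and web view setup used in the question.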