How to read/parse the PDF contents loaded in a UIWebView

Question

Suppose I have a UIWebView in which I am signed in to my Gmail account, and I have received an e-mail with a PDF attachment. When I tap the attachment link, the web view opens the PDF file in place.

At that point, if I try to read the HTML contents, I get NULL.

Please help me with an idea: how can I read/parse the PDF content already loaded in the web view and store the data in an NSString?

Using PDFKitten, you can extract text from a PDF, but that approach does not use UIWebView: http://stackoverflow.com/questions/4097044/pdf-search-on-the-iphone/6506722#6506722 – Naveen Thunga, Jun 08 '12 at 11:12
Thanks, Naveen. Hmm... do you have any idea how the Safari browser opens the PDF file? The address shown in the address bar doesn't end with .pdf, so I really can't tell in which format the PDF file is loaded in the UIWebView. – Veera Raj, Jun 11 '12 at 05:42

score 2 · Answer 1 · answered Jun 08 '12 at 10:27

If you have the URL for the PDF file, load that URL directly in the web view. Otherwise, use an NSURLRequest to download the file and then load it from the application's local directory. I would be glad to provide the code for loading a PDF file from the local directory, though I suggest searching for it first.
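The download-then-load-locally route described above might look roughly like the following. This is only a sketch, not the answerer's code: it swaps in NSURLSession (available from iOS 7) for the download, and the helper name `LoadPDFFromLocalCopy` and the file name `attachment.pdf` are made up for illustration.

```objc
#import <UIKit/UIKit.h>

// Hypothetical helper (not from the thread): downloads the PDF at remoteURL,
// saves it in the app's Documents directory, then loads the saved copy
// into webView.
static void LoadPDFFromLocalCopy(NSURL *remoteURL, UIWebView *webView) {
    NSURLRequest *request = [NSURLRequest requestWithURL:remoteURL];
    NSURLSessionDataTask *task =
        [[NSURLSession sharedSession] dataTaskWithRequest:request
                                        completionHandler:^(NSData *data,
                                                            NSURLResponse *response,
                                                            NSError *error) {
            if (data == nil) {
                NSLog(@"Download failed: %@", error);
                return;
            }

            // "attachment.pdf" is an assumed file name for this sketch.
            NSURL *documents =
                [[[NSFileManager defaultManager] URLsForDirectory:NSDocumentDirectory
                                                        inDomains:NSUserDomainMask] firstObject];
            NSURL *localURL = [documents URLByAppendingPathComponent:@"attachment.pdf"];

            NSError *writeError = nil;
            if (![data writeToURL:localURL options:NSDataWritingAtomic error:&writeError]) {
                NSLog(@"Could not save the PDF: %@", writeError);
                return;
            }

            // UIKit objects must be touched on the main thread.
            dispatch_async(dispatch_get_main_queue(), ^{
                [webView loadRequest:[NSURLRequest requestWithURL:localURL]];
            });
        }];
    [task resume];
}
```

Once the file is on disk, its bytes are available to whatever PDF library is used for text extraction, which the web view itself does not provide.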

answered Jun 08 '12 at 10:27

Farrukh Javeid

Thanks for your kind response, Farrukh. As you said, I just load the PDF file directly from the URL. But what I need now is to extract the PDF text as an NSString. – Veera Raj Jun 08 '12 at 10:41
You cannot download it as a string. It will be downloaded as NSData, and since the PDF file format does not support being read as plain text, saving the file as a PDF will be your only option. – Farrukh Javeid Jun 08 '12 at 10:48
@Farrukh: if the PDF content is downloaded as NSData, is it possible to convert it like the following? NSString* currentURL = webView.request.URL.absoluteString; NSURL* url = [NSURL URLWithString:currentURL]; NSURLRequest* request1 = [NSURLRequest requestWithURL:url]; NSURLResponse* response; NSError* error; NSData* result = [NSURLConnection sendSynchronousRequest:request1 returningResponse:&response error:&error]; NSString *myString = [[NSString alloc] initWithData:result encoding:NSASCIIStringEncoding]; – Veera Raj Jun 08 '12 at 12:43
You can, but the result will definitely not be what you are expecting. – Farrukh Javeid Jun 08 '12 at 12:49
Yes, but the method I mentioned above works for Word, PowerPoint and XLS files. Unfortunately, it returns nothing for a PDF file. – Veera Raj Jun 11 '12 at 09:35
Is that the .doc format or the .docx format? I ask because .docx is an XML-based file type as well. To my knowledge, and I could be wrong, MS Office files from before the 2007 version can be read as plain files, but that is not possible with the later versions. The same goes for PDF files. – Farrukh Javeid Jun 11 '12 at 10:33
Problem solved. For PDF text extraction we have to use a third-party API. For my problem I used the FastPDFKit API, which has a large set of functions for almost everything related to PDF files. NSString* currentURL = webView.request.URL.absoluteString; NSLog(@"NSDATA FOUND// current url--%@",currentURL); NSURL* url = [NSURL URLWithString:currentURL]; MFDocumentManager *documentManager = [[MFDocumentManager alloc]initWithFileUrl:url]; chumma1 = [NSMutableString stringWithString:[documentManager wholeTextForPage:pq]]; – Veera Raj Jun 13 '12 at 09:50
Now chumma1 holds the text of the PDF file for the desired page, where pq is the page number. http://fastpdfkit.com/ – Veera Raj Jun 13 '12 at 09:54
I haven't checked with a .docx file; for a .doc file it works. Even fullArticle = [webView stringByEvaluatingJavaScriptFromString:@"document.body.innerText"]; can extract the text of .doc and .ppt files. – Veera Raj Jun 13 '12 at 09:55
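For readers who cannot use FastPDFKit, the same idea (hand the PDF bytes to a PDF library and ask it for a page's text) can be sketched with Apple's PDFKit, which is available on iOS 11 and later. This is a substitute technique, not the code used in the thread, and the helper name `TextOfPDFPage` is made up for illustration. It assumes the PDF bytes have already been fetched as NSData, and that pages are addressed by zero-based index.

```objc
#import <PDFKit/PDFKit.h>

// Hypothetical helper (not from the thread): returns the text of one page of
// the PDF held in pdfData, or nil if the data is not a readable PDF or the
// page index is out of range. pageIndex is zero-based.
static NSString *TextOfPDFPage(NSData *pdfData, NSUInteger pageIndex) {
    PDFDocument *document = [[PDFDocument alloc] initWithData:pdfData];
    if (document == nil || pageIndex >= document.pageCount) {
        return nil;
    }
    PDFPage *page = [document pageAtIndex:pageIndex];
    // PDFKit decodes the page's content streams and returns the text it finds.
    return page.string;
}
```

A PDF made only of scanned images has no text layer, so a library like this has nothing to return for it without a separate OCR step.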
