
I'm trying to search for a text string, say "can be", in a document located at 'https://developer.apple.com/library/ios/documentation/ides/conceptual/AppDistributionGuide/AppDistributionGuide.pdf'

For this purpose I'm using PDFQuery. I first downloaded the PDF to my machine and wrote my code against the local copy, and that works perfectly. But when I pass the server URL as the file location, I get an error. I know the PDFQuery library is designed to work with files on the local machine.

Is there any way to work around this? This is part of my course project, and the PDF search module I am supposed to develop has to be deployed to IBM Bluemix and run from there. This is the only part of my project still pending. Any help is appreciated.

Thank you in advance.

1 Answer


Break the problem into two bits.

i) Download the file. ii) Process the file.

Here's some help with step i): "How do I download a file over HTTP using Python?"
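The two steps can be sketched roughly as follows. This is only a minimal sketch, not a tested Bluemix deployment: the helper names `download_to_temp`, `search_remote_pdf`, and `count_matches` are made up for illustration, and `count_matches` assumes the third-party `pdfquery` package is installed. The point is that the download happens on whatever machine runs the code, so once deployed the temporary file is local to the server.

```python
import os
import shutil
import tempfile
import urllib.request


def download_to_temp(url, suffix=".pdf"):
    """Step i: download `url` into a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as out, urllib.request.urlopen(url) as resp:
            shutil.copyfileobj(resp, out)
    except Exception:
        # Do not leave a half-written file behind if the download fails.
        os.remove(path)
        raise
    return path


def search_remote_pdf(url, search):
    """Download the PDF, run `search(local_path)` on it, then clean up."""
    path = download_to_temp(url)
    try:
        return search(path)
    finally:
        os.remove(path)


def count_matches(path, phrase="can be"):
    """Step ii (hypothetical helper): count text lines containing `phrase`."""
    import pdfquery  # third-party; assumed to be installed

    pdf = pdfquery.PDFQuery(path)
    pdf.load()
    return len(pdf.pq('LTTextLineHorizontal:contains("%s")' % phrase))
```

Calling `search_remote_pdf(pdf_url, count_matches)` with the URL from the question would then run PDFQuery against the downloaded copy. Passing the search step in as a function keeps the download logic independent of PDFQuery, so the same wrapper works for whatever URL is supplied later.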

demented hedgehog
  • Thank you, Demented. But if I download the file, it will be on my local machine, and then I won't be able to access it from Bluemix. – Milind Mahajan Nov 29 '14 at 11:55
  • Isn't your code going to run on Bluemix? (I mean download the PDF using code.) – demented hedgehog Nov 29 '14 at 11:59
  • I'll be deploying this application to Bluemix. So even if I download the file, it will be stored in my local file system, and I cannot access my local file system from Bluemix. – Milind Mahajan Nov 29 '14 at 12:15
  • You misunderstand me. Write a little wrapper app that downloads the file to the local machine and then runs PDFQuery on it. When that code runs on Bluemix, the PDF will be local to it. Use bash and wget if you don't want to use Python. – demented hedgehog Nov 29 '14 at 12:23
  • I have another question. How often do you think AppDistributionGuide.pdf will change? Do you really need to get it from the web all the time? – demented hedgehog Nov 29 '14 at 12:26
  • Demented, it's not only the AppDistributionGuide.pdf file. That's just a sample. During the final presentation the instructor might ask for a different file on the server. And if I download the file to my local FS, how can I run PDFQuery there and get the result on Bluemix? I may be misunderstanding your point. Can you please guide me? – Milind Mahajan Nov 29 '14 at 12:31
  • Well, in that case you may have two problems. When the instructor asks for a PDF, how is your program going to know the URL of the PDF to get? Somehow you have to communicate that information to the program (or you can send it the PDF manually). Alternatively, maybe you're expected to parse all the PDFs on the site? Maybe you should talk with someone there and clarify the requirements for the project. – demented hedgehog Nov 29 '14 at 22:16
  • My pleasure. Good luck. Sorry I couldn't have been more help. – demented hedgehog Nov 30 '14 at 09:04