I want to create an Azure Function that is triggered whenever a file is uploaded to Blob Storage and that extracts the text from the uploaded PDF file. I also don't know which library would be best for this.

I found this post that shows how to use PdfSharp to extract the text of a PDF file, but I can't get it working, since it's my first time using Azure Functions.

Xavi Andreu
  • Welcome to StackOverflow! Please describe your question instead of writing a blog-style account of what you did. It's good to include your research, but please put more focus on the question itself :) – Hille Oct 08 '19 at 14:57
  • Edited the post, hope it is clearer now. Thank you for your feedback :) – Xavi Andreu Oct 08 '19 at 15:55
  • You should start by understanding how to use an event grid trigger to trigger execution of an Azure Function. Once you have that you will have a function that is called whenever a blob is added with the address of the blob. – Aran Mulholland Oct 10 '19 at 00:41

1 Answer

This question is overly broad and will probably be closed as such. But here are some pointers.

  1. Start by installing the Azure Storage Emulator so that you can create Blobs locally for testing. Get it here.
  2. Create an Azure Function v2. Set up a Blob Storage Trigger so that whenever something is written to your local storage, the trigger will be called. Blob trigger described here.
  3. Once you can hit a breakpoint in your Azure Function when a Blob is added to your local emulator, you'll need to get the bytes and extract the text using a PDF ripper of your choice. There are many, some are free, and some are paid. Suggesting one and giving code examples could run several thousand words, so it's up to you which one you pick and use.
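
Step 3 can be sketched roughly as follows. This is only an illustration under several assumptions, not a recommended implementation: it uses Python rather than .NET, and it swaps in the `pypdf` package instead of the PdfSharp library mentioned in the question. The names `looks_like_pdf`, `extract_with_pypdf`, and `handle_blob` are hypothetical helpers. `main` assumes the triggering blob is passed in as an object exposing `.name` and `.read()`, as `azure.functions.InputStream` does.

```python
import io
import logging
from typing import Callable, Optional

# A well-formed PDF file starts with this marker.
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(data: bytes) -> bool:
    """Cheap sanity check before handing the bytes to a PDF library."""
    return data.startswith(PDF_MAGIC)


def extract_with_pypdf(data: bytes) -> str:
    """Extract text with pypdf (assumption: the pypdf package is installed)."""
    # Imported lazily so the rest of the module works without pypdf.
    from pypdf import PdfReader

    reader = PdfReader(io.BytesIO(data))
    # extract_text() may return an empty result for image-only pages.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def handle_blob(
    name: str,
    data: bytes,
    extract_text: Callable[[bytes], str] = extract_with_pypdf,
) -> Optional[str]:
    """Return the extracted text, or None if the blob is not a PDF."""
    if not looks_like_pdf(data):
        logging.warning("Skipping %s: not a PDF", name)
        return None
    text = extract_text(data)
    logging.info("Extracted %d characters from %s", len(text), name)
    return text


def main(myblob) -> None:
    """Blob-trigger entry point (hypothetical wiring).

    `myblob` is assumed to behave like azure.functions.InputStream,
    i.e. to provide `.name` and `.read()`.
    """
    handle_blob(myblob.name, myblob.read())
```

Keeping the extractor as a plain function argument means any other PDF library can be swapped in later without touching the trigger code, and the logic can be exercised locally with fake bytes before a real blob is uploaded to the emulator.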
Rob Reagan
  • The question is overly broad, the answer is perfect for the question. The blob triggers work well though via an event grid binding. – Aran Mulholland Oct 10 '19 at 00:38