I'm designing a C# console application that searches a PDF's text for sentences (i.e. multiple words separated by spaces, possibly split across lines), including a title sentence that may span more than one line.

I extract the PDF text into a string and then match the given sentences against it. For example, I'm matching the PDF string against a title like this: "Lower leg compartment syndrome post \nappendicectomy"

I searched around here and found some very useful links, such as this one:

how to highlight a text or word in a pdf file using iTextsharp?

The point is that I need the search to match the whole sentence even if it is split across lines, and then highlight it when it matches. Can anyone help?

  • Watch [this video](https://www.youtube.com/watch?v=lZnbhnU4m3Y) to understand what PDF is about. Then read the questions in the section "Extracting text from PDFs" in [The Best iText Questions on StackOverflow](http://pages.itextpdf.com/ebook-stackoverflow-questions.html) (it's a free ebook, download it). – Bruno Lowagie Feb 09 '15 at 07:47
  • In addition to reading information on how to extract text from PDFs, you should also consider normalizing the extracted text, e.g. by merging multiple white spaces into a single one. – mkl Feb 09 '15 at 08:12
  • Actually, extracting the text and matching it against a given pattern to make sure it exists is not the problem. My problem is how to highlight the right text even when it is split across lines. – Sarah Mohammad Feb 09 '15 at 08:35
  • I'm sorry, I can't edit the post's title to add "highlight found text". My mistake. – Sarah Mohammad Feb 09 '15 at 08:37
  • I edited the title accordingly. Still, your question does not make clear what you have: according to a comment, *extracting the text and matching it against a given pattern to make sure it exists is not the problem*, while the question seems to indicate that this is also unsolved. So please indicate what you have. In particular, since you have already made the match, what kind of information do you have for it? Considering the *useful links* you referred to, you should have some `Rectangle` collection at hand. – mkl Feb 09 '15 at 09:16
  • Yes, I was in a hurry this morning; my apologies again for the confusion! I used the implementation of **myLocationTextExtractionStrategy** from the link I referred to as a reference in my C# project, which was very useful. Unfortunately, as the code author said: **No multiple line searches (phrases), just char/s or word's or a one line sentence are allowed.** What I have now is a sentence like **Lower leg compartment syndrome post \nappendicectomy**, which is split across two lines and needs to be matched and highlighted. Any help? – Sarah Mohammad Feb 09 '15 at 09:31
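
The whitespace normalization suggested in the comments can be sketched roughly as below. This is only a minimal illustration of the matching step, not the linked answer's code: `PhraseMatcher`, `BuildPattern`, and `Find` are hypothetical names, and the sample page text is made up. It uses only `System.Text.RegularExpressions` and assumes the page text has already been extracted into a string. Instead of collapsing whitespace in the extracted text, it treats any run of whitespace between the words of the phrase (spaces or line breaks) as equivalent, so the match keeps its offset and length in the original text.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PhraseMatcher
{
    // Hypothetical helper: turns a phrase into a regex in which the words
    // may be separated by any run of whitespace, including line breaks.
    public static Regex BuildPattern(string phrase)
    {
        // A null separator array makes Split break on any whitespace.
        string[] words = phrase.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        string pattern = string.Join(@"\s+", words.Select(w => Regex.Escape(w)));
        return new Regex(pattern, RegexOptions.IgnoreCase);
    }

    // Returns the first match of the phrase in the extracted text.
    // Match.Index and Match.Length refer to the original, unnormalized text.
    public static Match Find(string extractedText, string phrase)
    {
        return BuildPattern(phrase).Match(extractedText);
    }
}

public static class Program
{
    public static void Main()
    {
        // Made-up sample: the title is split across two lines.
        string pageText = "Case report\nLower leg compartment syndrome post \nappendicectomy\nAbstract";
        Match m = PhraseMatcher.Find(pageText, "Lower leg compartment syndrome post appendicectomy");
        Console.WriteLine(m.Success
            ? "Found at offset " + m.Index + ", length " + m.Length
            : "Not found");
    }
}
```

Mapping the matched character range back to rectangles on the page is deliberately left out here. It depends on how the extraction strategy records chunk positions, and a phrase that spans two lines would presumably need one rectangle per line rather than a single one.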

0 Answers