I'm not familiar with PDF rendering or PostScript, and I'd like to know whether it is possible, in principle, to extract the location of a string in a PDF. That is:

  1. given a PDF with regular text paragraphs (plain text, not form fields, text boxes, or other objects)
  2. search for a specific string in the file
  3. get the x,y coordinates of the first letter of that string.
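
To make the three steps concrete, here is a minimal sketch in Python of what they amount to at the file level. It assumes the page's content stream has already been decompressed into plain text, that strings are shown with the `Tj` operator, and that positioning uses only `BT`, `Td` and `Tm`. The function name `find_text_origin` and the sample stream are hypothetical. It ignores the page transformation (`cm`), `TJ` arrays and font encodings, so it is an illustration rather than a general solution.

```python
import re

# Tokens: literal strings, names such as /F1, numbers, operators.
TOKEN = re.compile(r"\((?:\\.|[^\\()])*\)|/\S+|[-+]?\d*\.?\d+|[A-Za-z'\"*]+")

def find_text_origin(content, needle):
    """Return (x, y) where the first Tj string containing `needle` starts, else None."""
    a, b, c, d, e, f = 1.0, 0.0, 0.0, 1.0, 0.0, 0.0  # text line matrix
    operands = []
    for tok in TOKEN.findall(content):
        if tok.startswith("("):
            operands.append(tok[1:-1])          # string operand, parentheses stripped
        elif tok.startswith("/"):
            operands.append(tok)                # name operand, e.g. a font name
        elif tok[0].isdigit() or tok[0] in "+-.":
            operands.append(float(tok))         # numeric operand
        else:                                   # an operator: consume the operands
            if tok == "BT":                     # begin text: reset to identity
                a, b, c, d, e, f = 1.0, 0.0, 0.0, 1.0, 0.0, 0.0
            elif tok == "Tm" and len(operands) >= 6:
                a, b, c, d, e, f = operands[-6:]
            elif tok == "Td" and len(operands) >= 2:
                tx, ty = operands[-2:]          # offset from the current line start
                e, f = e + tx * a + ty * c, f + tx * b + ty * d
            elif tok == "Tj" and operands and isinstance(operands[-1], str):
                if needle in operands[-1]:
                    return (e, f)
            operands = []
    return None

# Hypothetical content stream with two lines of text.
sample = "BT /F1 12 Tf 72 720 Td (Hello world) Tj 0 -14 Td (Second line) Tj ET"
print(find_text_origin(sample, "Second"))  # (72.0, 706.0)
```

The result is the origin of the string that contains the match, in text space. If the match starts in the middle of a string, or a `Tj` follows another `Tj` without repositioning, getting the exact position of the first letter would also require the glyph widths of the font.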

I've searched PDF libraries in many languages, but none of them seem to allow such an operation.

Does the PDF standard support this?

KenS
itay zohar

1 Answer

The closest thing I could find involves finding the location of a text box (see here)

Depending on your use case, this could help. For instance, in my case, I wanted to replace a specified string with another string. A possible solution for me:

  1. Include a text box in the original PDF (the author of the PDF can do that using Adobe Acrobat Pro or an equivalent tool)
  2. Find the text box using code and extract its location
  3. Remove the text box from the document and insert your text at the extracted position.
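
Step 2 could look roughly like the sketch below. It assumes the text box is stored as a form field whose dictionary carries both a `/T` name and a `/Rect` entry, and that the PDF objects are not compressed into object streams. The function name `find_field_rect`, the field name `placeholder` and the sample bytes are hypothetical. A real PDF library would be more robust than a regular expression over the raw file.

```python
import re

RECT = re.compile(
    r"/Rect\s*\[\s*([-\d.]+)\s+([-\d.]+)\s+([-\d.]+)\s+([-\d.]+)\s*\]"
)

def find_field_rect(pdf_bytes, field_name):
    """Return the /Rect of the first object named `field_name` as 4 floats, else None."""
    text = pdf_bytes.decode("latin-1")  # latin-1 maps every byte, so this never fails
    for obj in re.findall(r"\d+\s+\d+\s+obj(.*?)endobj", text, re.S):
        # Assumption: the name is a literal string, written /T (name) or /T(name).
        if f"/T ({field_name})" not in obj and f"/T({field_name})" not in obj:
            continue
        match = RECT.search(obj)
        if match:
            return tuple(float(v) for v in match.groups())
    return None

# Hypothetical, heavily simplified PDF object for a text field.
sample = (
    b"1 0 obj\n"
    b"<< /Type /Annot /Subtype /Widget /FT /Tx /T (placeholder) "
    b"/Rect [72 700 272 720] >>\n"
    b"endobj\n"
)
print(find_field_rect(sample, "placeholder"))  # (72.0, 700.0, 272.0, 720.0)
```

The four numbers are two diagonally opposite corners of the box in page units, so the position for the replacement text can be taken from them.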
itay zohar