1

I have found many sites and posts asking the same question as mine, but what they all have in common is that the answers show how to insert new text at specific locations. I have a PDF document generated by another program that I have no control over. It has a line for a client to sign on, but that line is not at an absolute position, so a service we use called AssureSign will not work properly, because you have to know the position of the signature line. So I need to create a new program that finds the position of the signature line and sends that information to the AssureSign system.

This really should be simple, but for some reason I am not getting it.

scripter78

3 Answers

1

You can make use of the parser package of iText (Sharp) to find the position of a given text. You do have to implement your own RenderListener, though, as the main use case of that package is text extraction, not text position finding.

It is not as easy as you might think, though: for example, the individual characters of a word might arrive separately and in any order.
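To illustrate why arrival order matters, here is a minimal sketch in plain Python (not iText code). It assumes each text fragment has already been reduced to a hypothetical `(text, x, y)` tuple, with `x` and `y` taken from the fragment's baseline start point. The function name `rebuild_lines` and the `y_tolerance` parameter are made up for this illustration.

```python
from collections import defaultdict


def rebuild_lines(chunks, y_tolerance=2.0):
    """Group (text, x, y) fragments into lines, ordered top to bottom.

    Assumption: fragments whose baseline y values fall into the same
    tolerance bucket belong to the same line. This is a simplification;
    two nearly equal y values can still land in neighbouring buckets.
    """
    buckets = defaultdict(list)
    for text, x, y in chunks:
        buckets[round(y / y_tolerance)].append((x, text, y))

    lines = []
    # PDF user space has y growing upward, so a larger y is higher on the page.
    for key in sorted(buckets, reverse=True):
        parts = sorted(buckets[key])  # left to right by x
        line_text = "".join(text for _, text, _ in parts)
        lines.append((parts[0][2], line_text))
    return lines


# Fragments deliberately listed out of reading order.
chunks = [
    ("ture:", 112.0, 140.0),
    ("Signa", 72.0, 140.0),
    ("Total due", 72.0, 300.0),
    (" ________", 150.0, 140.0),
]

for y, text in rebuild_lines(chunks):
    print(y, repr(text))
```

Only after the fragments are sorted back into reading order does a search for a string such as `Signature:` become reliable.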

PS:

First you will have to find out, though, whether the line for the signature consists of characters (as your question seems to imply) or whether it is a drawn path. Additionally you will have to find out whether that line is unique in the document.

In the former case, the RenderListener implementation you need has to inspect the TextRenderInfo objects forwarded to its RenderText method. If the text content contains the unique characters that make up the signature line, you have to store the position data of that TextRenderInfo. If the line characters are not unique, you will have to find some additional criterion that makes them unique, e.g. some preceding string, or possibly the fact that it is the last occurrence of those characters in the document.
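The selection logic described here can be sketched independently of iText. The class below is a hypothetical stand-in for such a listener, written in plain Python: `render_text` plays the role of RenderText, and the `page`, `text`, `x`, `y` arguments stand for values you would read from each TextRenderInfo. The marker string, the label and all names are assumptions for illustration only.

```python
class SignatureLineFinder:
    """Collects positions of text fragments that look like a signature line."""

    def __init__(self, marker="_____", label=None):
        self.marker = marker      # characters that make up the line (assumed)
        self.label = label        # optional preceding string, e.g. "Client signature"
        self.matches = []         # list of (page, x, y)
        self._previous_text = ""

    def render_text(self, page, text, x, y):
        if self.marker in text:
            # With a label set, accept the fragment only if the label appears
            # in the previously received fragment or in this one.
            context = self._previous_text + text
            if self.label is None or self.label in context:
                self.matches.append((page, x, y))
        self._previous_text = text

    def last_match(self):
        """Return the match on the latest page, lowest on that page, or None."""
        if not self.matches:
            return None
        # y grows upward in PDF user space, so the smallest y is lowest on the page.
        return max(self.matches, key=lambda m: (m[0], -m[2]))


events = [
    (1, "Initials: _____", 72.0, 500.0),
    (2, "Client signature:", 72.0, 180.0),
    (2, "__________", 200.0, 180.0),
    (2, "Date: _____", 400.0, 650.0),
]

by_position = SignatureLineFinder()
by_label = SignatureLineFinder(label="Client signature")
for event in events:
    by_position.render_text(*event)
    by_label.render_text(*event)

print(by_position.last_match())
print(by_label.matches)
```

Note that the label check relies on the label fragment arriving directly before the line fragment, which is not guaranteed; in a real listener it is safer to compare positions than arrival order.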

In the latter case the parser package functionality has to be somewhat extended as it currently does not report paths. According to the iText mailing list, an extension like that is on the ToDo list.

mkl
  • I think I found an example of what you are referring to but I am not sure how to use it. http://pastebin.com/LqDRDRd9 – scripter78 Oct 13 '12 at 20:34
  • You found a sample someone derived from the original iText LocationTextExtractionStrategy which is a RenderListener. You need to build a different RenderListener which looks for the string you want to find and eventually returns its position. – mkl Oct 14 '12 at 20:08
0

This question isn't directly related to what you want to accomplish, but it is indirectly related.

JCIS posted a great application that shows the very arduous task of locating specific text, albeit in VB. It wouldn't be as simple as plugging it into a VB-to-C# converter, but it should be translatable. This may seem like an easy task, but PDF is technically not a document format; it is a display format. The difference between those two is what makes this such a long process.

Community
Mike Varosky
-1

First, if the words are all in English, you can parse the text easily. But when your documents are not in English, you need to understand exactly how the font for your language is encoded (Unicode).

Nikhil
  • The document is always in English, and for the area I need to find I really only need the Y coordinate, because the X coordinate always stays the same. The area is just higher or lower on the page depending on the individual items. – scripter78 Oct 16 '12 at 12:45