
So, I have a program which is a kind of text editor. I need its output format to be PDF, but I also need to be able to edit that PDF again later. Since my program's output is never very complicated, and since my program is the one that creates the PDF, I could read directly from the created PDF, but I thought it would be easier to just attach another file to the PDF, one that is easier to read.

However, I don't want the user to see that a file is attached to the PDF.

I read somewhere that you can trick PDF readers by changing `/EmbeddedFiles` to `/Embeddedfiles`. That way they will not detect that there are files attached to the PDF they are processing.

The question is, how can I read the PDF in order to make that change, and then read it again before editing to revert the change?

I don't think PDF libraries would help me here, since I'm trying to "corrupt" the PDF. I guess I should parse it as some kind of string and then look for the substring I want to change. But I'm too unfamiliar with the PDF format to know whether it's really that simple or whether there is a specific way to do it...

Karlovsky120
  • :( too localized.... can you show us some code too? – Sap Sep 06 '12 at 11:16
  • I haven't written anything... I'm just asking how to read a PDF as a string that can be edited, saved as a PDF file, and be editable again. But since I'm messing with the core structure of the PDF, I don't expect PDF libraries to have support for that. If it's any help, the file should always be attached somewhere near the top of the document; it would be the same for every document created by my program... – Karlovsky120 Sep 06 '12 at 11:20
  • Just being curious: Why do you want to hide the attachment if it's so essential for the functionality you are trying to offer? – TheBlastOne Sep 06 '12 at 11:41
  • have you met http://stackoverflow.com/questions/4131031/editing-pdf-text-using-java – Sreenath S Sep 06 '12 at 12:02
  • I want it to be hidden from the end user; I want the attachment to be something that only my program will be able to see. Also, I'm not trying to edit the CONTENT of the PDF but the PDF ITSELF! But I came up with an alternative: hiding a file inside an image, and then adding the image to the PDF... The only question is how to hide it as well as it can be hidden. I'll post a question in 4 hours... – Karlovsky120 Sep 06 '12 at 12:03
  • OK, so you are not interested in editing the contents of the PDF. But you edit the file itself, and embed it as an attached file? I'm confused. Why are you doing this in such a complicated way? The question is very unclear, my friend. Could you describe what exactly you edit in this PDF file and how other users access it? – Sreenath S Sep 06 '12 at 12:17
  • I attach a file to the PDF using a PDF library. Then I hide it using some Java method to edit the whole file as a byte stream, or however it should be done. And then I end up with an output file that the user can open with any PDF reader. – Karlovsky120 Sep 06 '12 at 12:22
  • How are you serving these PDF files to users? Is it direct access or is it through a server? – Sreenath S Sep 06 '12 at 12:43
  • Direct access. It's a program that the user can use to generate a certain PDF, and I want them to be able to share these PDFs without being able to edit them (or at least without it being easy), and to be able to open them without having my program. – Karlovsky120 Sep 06 '12 at 14:14

2 Answers


PDF isn't a format meant for editing, and tacking on an attachment (hidden or not, and I'm not even sure the hiding will work) is kind of iffy. Assuming your trick works:

  • Is this a valid PDF? You may want to trick readers, but you'd be creating invalid PDFs, which worries me more than the method you're trying to use.

  • What if a PDF reader updates its functionality to support invalid syntax? That would mean all of a sudden your file is visible, defeating your intentions.

The best way would be:

Let the user create their document. Store the text in a program folder. Create a PDF. When editing, just load the text document (or whatever) based on the PDF's title. Once again, PDF is not an editing format.

Or use Jonathan's solution, which works around storing the text locally.

Either way, corrupting a PDF file is not desirable.
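
To illustrate the "store the text in a program folder" suggestion, here is a minimal sketch of how the editor could locate the stored text for a given PDF. It assumes the PDF's file name stands in for its title (reading the real title would need a PDF library), and the class name `SidecarLocator`, the method `sidecarFor`, and the `.txt` extension are all hypothetical choices.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class SidecarLocator {
    /**
     * Maps a PDF path to the companion text file inside storeDir.
     * Assumption: the file name (without extension) identifies the document.
     */
    static Path sidecarFor(Path pdf, Path storeDir) {
        String name = pdf.getFileName().toString();
        int dot = name.lastIndexOf('.');
        // Strip the extension if there is one; keep names like ".hidden" intact.
        String base = dot > 0 ? name.substring(0, dot) : name;
        return storeDir.resolve(base + ".txt");
    }

    public static void main(String[] args) {
        // usage: java SidecarLocator <pdf-file> <store-directory>
        System.out.println(sidecarFor(Paths.get(args[0]), Paths.get(args[1])));
    }
}
```

Note that with this layout the editable text stays on the machine that created the PDF, so a PDF shared with someone else would not be editable there.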

Michaël Demey
  • I have just found another way. I can open the PDF in a text editor and add `%` to the beginning of every line related to the attachment, commenting it out. That would probably be a better method than the one in the question... However, I still don't know how to do it with Java. Do I just open the PDF like any other text file, or something else? – Karlovsky120 Sep 06 '12 at 13:07
  • @Karlovsky120 "I can open the PDF in text editor and add % to beggining of every line". No, you cannot do that. PDF is a binary format, if you do something like that to a PDF file it will become corrupt and unreadable for sure. – yms Sep 06 '12 at 13:47
  • EXACTLY. So how DO I open it in Java so that it does not become corrupt? P.S. Okay, not a text editor, I edited it using Eclipse... – Karlovsky120 Sep 06 '12 at 14:09
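
On the "how do I open it in Java" point, a minimal sketch of treating the PDF as raw bytes rather than text might look like the following. It reads the whole file into a `byte[]`, overwrites one byte sequence with another of the same length, and writes the bytes back, so no character decoding is involved and the byte offsets recorded in the file do not shift. The class and method names (`PdfNameToggle`, `replaceAll`, `toggle`) are hypothetical. It also assumes the name appears uncompressed in the file; if the writer placed it inside a compressed object stream, the search would not find it, and a coincidental match inside binary stream data would be overwritten too. Whether readers then really ignore the attachment is the question's premise, not something this sketch verifies.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class PdfNameToggle {
    // Both names have the same length, so nothing after them moves in the file.
    static final byte[] VISIBLE = "/EmbeddedFiles".getBytes(StandardCharsets.US_ASCII);
    static final byte[] HIDDEN = "/Embeddedfiles".getBytes(StandardCharsets.US_ASCII);

    /** Returns a copy of data in which every occurrence of from is overwritten by to. */
    static byte[] replaceAll(byte[] data, byte[] from, byte[] to) {
        if (from.length != to.length) {
            throw new IllegalArgumentException("replacement must have the same length");
        }
        byte[] out = data.clone();
        for (int i = 0; i + from.length <= out.length; i++) {
            if (matchesAt(out, i, from)) {
                System.arraycopy(to, 0, out, i, to.length);
                i += from.length - 1; // skip past the bytes just written
            }
        }
        return out;
    }

    private static boolean matchesAt(byte[] data, int pos, byte[] pattern) {
        for (int j = 0; j < pattern.length; j++) {
            if (data[pos + j] != pattern[j]) {
                return false;
            }
        }
        return true;
    }

    /** Rewrites the file in place: hide = true renames the key, false restores it. */
    static void toggle(Path pdf, boolean hide) throws IOException {
        byte[] data = Files.readAllBytes(pdf);
        byte[] result = hide
                ? replaceAll(data, VISIBLE, HIDDEN)
                : replaceAll(data, HIDDEN, VISIBLE);
        Files.write(pdf, result);
    }

    public static void main(String[] args) throws IOException {
        // usage: java PdfNameToggle <file.pdf> hide|show
        toggle(Paths.get(args[0]), "hide".equals(args[1]));
    }
}
```

Working on bytes avoids the corruption that comes from decoding binary stream data as text and re-encoding it, which is what typically happens when a PDF is round-tripped through a text editor.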

If you just want to create your own version of a binary format and call it PDF, then you can try adding a "custom" entry to any dictionary object of your PDF file and associating a data stream with that entry. Since the entry will be outside the PDF spec, all (well-implemented) readers should ignore it.
You can probably do this with iText using `PdfDictionary.put`; for example, you could add your non-standard data to the Catalog dictionary.

yms