
I need to read the xref table of a PDF and replace the generation-number field of every free entry (the entries marked with 'f' at the end) with text taken from a file. This is an example of an xref table in a PDF:

xref

0 256

0000000029 65535 f

0000000017 00000 n

0000000125 00000 n

0000000216 00000 n

0000000030 65535 f

0000000031 65535 f

0000000032 65535 f

I want to fill those fields with the string [A443DD719B11118D12D99E5EA18E5EA9934], five characters per entry, so that the table becomes:

0000000029 65535 f

0000000017 00000 n

0000000125 00000 n

0000000216 00000 n

0000000030 A443D f

0000000031 D719B f

0000000032 11118 f

. . .

I'm working with iText or PDFBox in Java, but I can't find a way to read or access the xref table and replace its contents with text from a file. Please help.

Anthony
  • As your desired change breaks the PDF, you can hardly expect it to be directly supported by a PDF library, can you? – mkl Mar 14 '16 at 05:11
  • And what will you do with files that have an XRef stream instead of an xref table? – Tilman Hausherr Mar 14 '16 at 07:23
  • Actually, I want to use it as a watermark. The idea is to use the free objects in the xref table, because it can be replaced and have no change with its PDF files. So I need to read the xref table and replace those entries with the contents of a file, which is the watermark. Can you help me? – Anthony Mar 14 '16 at 07:42
  • https://stackoverflow.com/questions/8929954/watermarking-with-pdfbox – Tilman Hausherr Mar 14 '16 at 10:14
  • *it can be replaced and have no change with its PDF files.* - Other than make them *invalid*, you mean? – mkl Mar 14 '16 at 14:06

1 Answer


Every indirect object in a PDF has a unique identifier as defined in ISO 32000-1:

The object identifier shall consist of two parts:

  • A positive integer object number. Indirect objects may be numbered sequentially within a PDF file, but this is not required; object numbers may be assigned in any arbitrary order.
  • A non-negative integer generation number. In a newly created file, all indirect objects shall have generation numbers of 0. Nonzero generation numbers may be introduced when the file is later updated.

This identifier is used in the cross-reference table, and there is a maximum value for generation numbers in an ordinary cross-reference table:

The maximum generation number is 65,535

You want to change the generation number into something that isn't a number. And even if you read strings such as D719B as hexadecimal numbers (D719B is 881,051 in decimal), you would still exceed the maximum generation number.
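To make the constraint concrete, here is a minimal sketch in plain Java (no PDF library) that checks a single line of a classic cross-reference table. It assumes the fixed entry layout of a 10-digit field, a space, a 5-digit generation number, a space, and the keyword `n` or `f`; the class and method names are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of iText or PDFBox.
final class XrefEntryCheck {
    // 10 decimal digits, space, 5 decimal digits, space, 'n' (in use) or 'f' (free)
    private static final Pattern ENTRY = Pattern.compile("(\\d{10}) (\\d{5}) ([nf])");
    static final int MAX_GENERATION = 65535;

    // True if the line has the fixed entry layout and a generation number within range.
    static boolean isValidEntry(String line) {
        Matcher m = ENTRY.matcher(line.trim());
        return m.matches() && Integer.parseInt(m.group(2)) <= MAX_GENERATION;
    }

    // True if the field, read as a hexadecimal number, would still fit the maximum.
    static boolean fitsAsHexGeneration(String field) {
        return Integer.parseInt(field, 16) <= MAX_GENERATION;
    }

    public static void main(String[] args) {
        String[] lines = {
            "0000000030 65535 f",   // entry from the original table
            "0000000030 A443D f",   // entry after the proposed replacement
            "0000000031 D719B f"
        };
        for (String line : lines) {
            System.out.println(line + " -> valid: " + isValidEntry(line));
        }
        for (String field : new String[] {"A443D", "D719B", "11118"}) {
            System.out.println(field + " as hex = " + Integer.parseInt(field, 16)
                    + ", fits: " + fitsAsHexGeneration(field));
        }
    }
}
```

The replaced entries fail on both counts: the field is no longer five decimal digits, and each of the three sample values is larger than 65,535 when read as hexadecimal.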

In other words: you ask PDF specialists to create PDFs that do not comply with the ISO standard. Every PDF expert with some self-respect will refuse to answer that question and ask you to reconsider.

In the comments, you claim that you want to introduce some invisible watermark into a PDF file. Why do you want to abuse the concept of generation numbers to do this? Why don't you just add an extra (custom) entry to the catalog?
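As a rough illustration of what such an entry might look like at the syntax level, the sketch below builds the text of a catalog dictionary with one extra key. Everything specific in it is an assumption made for the example: the key name `WatermarkId`, the `/Pages 2 0 R` reference, and the helper names. It only formats a string; in a real file you would let iText or PDFBox write the dictionary so that object offsets and the cross-reference data stay consistent.

```java
// Hypothetical helper, not part of iText or PDFBox.
final class CatalogEntrySketch {
    // Wraps an ASCII string as a PDF literal string, escaping '\', '(' and ')'.
    static String toPdfLiteralString(String s) {
        StringBuilder sb = new StringBuilder("(");
        for (char c : s.toCharArray()) {
            if (c == '\\' || c == '(' || c == ')') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.append(')').toString();
    }

    // Formats a minimal catalog dictionary carrying one extra custom key.
    // "/Pages 2 0 R" is a placeholder reference for this example only.
    static String catalogWithEntry(String key, String value) {
        return "<< /Type /Catalog /Pages 2 0 R /" + key + " "
                + toPdfLiteralString(value) + " >>";
    }

    public static void main(String[] args) {
        System.out.println(
                catalogWithEntry("WatermarkId", "A443DD719B11118D12D99E5EA18E5EA9934"));
    }
}
```

An entry like this is not rendered on any page, and the whole watermark string is stored in one place instead of being split into five-character pieces.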

Bruno Lowagie
  • Thank you very much for your answer. Pardon me if I have the wrong perception of the generation number in the cross-reference table. Yes, my idea is to make an invisible watermark by taking advantage of generation numbers that are not in use (based on the description that a generation number followed by the flag 'f' is not in use). Is it possible to embed the watermark in the spaces between characters instead? And I still don't understand how to add an extra entry to the catalog; can you explain it in more detail, please? Thank you in advance. – Anthony Mar 15 '16 at 04:11
  • Do not mess with the `xref` table! The number of bytes in the xref and the structure of the xref is defined in a very strict way! If you want to add an invisible watermark, use an extra entry in the catalog. Or use an extra entry in the info dictionary. Or add the watermark string as a PDF comment. Why would you obsessively insist on screwing up the xref table when there are so many alternatives? Accept this answer and post a new question if you don't understand the alternatives I mentioned. Don't abuse the comments of a correctly answered question to extend the question. – Bruno Lowagie Mar 15 '16 at 12:21
  • Okay, I understand, sir. After reading ISO 32000-1, I wondered whether I had the wrong perception of the xref table. Thank you for your advice. But do you have an example of adding an extra entry to the info dictionary, sir? – Anthony Mar 17 '16 at 15:04
  • @Anthony Please post that as a new question after you've accepted this answer. The whole concept of SO is to reward good answers with reputation points. People won't be inclined to answer your new question if you don't follow the rules of gamification. – Bruno Lowagie Mar 17 '16 at 15:16