I want to split the COSString operand of a TJ/Tj operator using PDFBox.
The current content stream of my PDF looks like the one below.
Desired output
or
What I tried:
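    // Imports needed (PDFBox 2.0.x)
    import java.io.IOException;
    import java.io.OutputStream;
    import java.util.List;
    import org.apache.pdfbox.contentstream.operator.Operator;
    import org.apache.pdfbox.cos.COSArray;
    import org.apache.pdfbox.cos.COSFloat;
    import org.apache.pdfbox.cos.COSName;
    import org.apache.pdfbox.pdfparser.PDFStreamParser;
    import org.apache.pdfbox.pdfwriter.ContentStreamWriter;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.pdmodel.PDPage;
    import org.apache.pdfbox.pdmodel.common.PDStream;

    public static void SplitTj_TJ(int tj_ind, PDDocument document) throws IOException {
        PDPage page = document.getPage(0);
        PDFStreamParser parser = new PDFStreamParser(page);
        parser.parse();
        List<Object> tokens = parser.getTokens();

        // Check the token types before casting, otherwise a ClassCastException is possible.
        // A Tj operand is a single COSString, so only a TJ with an array operand is handled here.
        Object token = tokens.get(tj_ind);
        if (!(token instanceof Operator) || !((Operator) token).getName().equals("TJ")
                || !(tokens.get(tj_ind - 1) instanceof COSArray)) {
            return;
        }
        // The array is assumed to be [(string) number (string)], as in my test file.
        COSArray tj_array = (COSArray) tokens.get(tj_ind - 1);

        // Offsets are hard-coded for my test file.
        COSFloat dest_x = new COSFloat(90.81199646f);
        COSFloat dest_y = new COSFloat(0f);

        // Replace "[...] TJ" with "(first) Tj dest_x dest_y Td (second) Tj".
        tokens.remove(tj_ind);
        tokens.remove(tj_ind - 1);
        tokens.add(tj_ind - 1, tj_array.get(0));
        tokens.add(tj_ind, Operator.getOperator("Tj"));
        tokens.add(tj_ind + 1, dest_x);
        tokens.add(tj_ind + 2, dest_y);
        tokens.add(tj_ind + 3, Operator.getOperator("Td"));
        tokens.add(tj_ind + 4, tj_array.get(2));
        tokens.add(tj_ind + 5, Operator.getOperator("Tj"));

        // Td is relative to the start of the current line, so the x operand of the
        // following Td (assumed to sit at tj_ind + 9 in my file) has to compensate.
        tokens.set(tj_ind + 9, new COSFloat(-90.81199646f));

        // Write the modified tokens into a new, Flate-compressed content stream.
        PDStream newContents = new PDStream(document);
        try (OutputStream out = newContents.createOutputStream(COSName.FLATE_DECODE)) {
            ContentStreamWriter writer = new ContentStreamWriter(out);
            writer.writeTokens(tokens);
        }
        System.out.println("Count at end: " + tokens.size());
        page.setContents(newContents);

        // Save the modified page as a separate document.
        PDDocument pdf = new PDDocument();
        pdf.addPage(page);
        pdf.save("D:/Testfiles/brigs11.pdf");
        pdf.close();
    }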
I am not sure this will work in all cases. What would generic code for this look like?
How can I achieve this using PDFBox? Is it possible to split all TJ/Tj operators, whichever text-positioning operators surround them, without corrupting the existing content stream?
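To make the kind of generic rewrite I have in mind more concrete, here is a minimal sketch in plain Java that deliberately avoids PDFBox. `TjSplitSketch`, `Op`, `split` and `render` are hypothetical names used only for illustration: strings stand in for COSString operands, integers for the TJ adjustments, and a `List` for the TJ array. It relies on my understanding that both Tj and TJ advance the text position, so each string can get its own Tj and each numeric adjustment its own one-element TJ, without computing any Td offsets. I have not verified this against real files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TjSplitSketch {

    /** Hypothetical stand-in for a content stream operator such as Tj, TJ or Td. */
    static final class Op {
        final String name;
        Op(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    /**
     * Rewrites every "[...] TJ" so that each string is shown by its own Tj and
     * each numeric adjustment becomes a one-element TJ array.
     * All other tokens are copied unchanged.
     */
    static List<Object> split(List<Object> tokens) {
        List<Object> out = new ArrayList<>();
        for (Object token : tokens) {
            boolean isTJ = token instanceof Op && ((Op) token).name.equals("TJ");
            Object operand = out.isEmpty() ? null : out.get(out.size() - 1);
            if (!isTJ || !(operand instanceof List)) {
                out.add(token);
                continue;
            }
            out.remove(out.size() - 1); // drop the array operand that was already copied
            for (Object element : (List<?>) operand) {
                if (element instanceof Number) {
                    // Keep the adjustment as its own TJ so the spacing is preserved.
                    List<Object> single = new ArrayList<>();
                    single.add(element);
                    out.add(single);
                    out.add(new Op("TJ"));
                } else {
                    out.add(element);
                    out.add(new Op("Tj"));
                }
            }
        }
        return out;
    }

    /** Renders the model tokens in content stream notation, for inspection only. */
    static String render(List<?> tokens) {
        StringBuilder sb = new StringBuilder();
        for (Object t : tokens) {
            if (sb.length() > 0) sb.append(' ');
            if (t instanceof String) sb.append('(').append(t).append(')');
            else if (t instanceof List) sb.append('[').append(render((List<?>) t)).append(']');
            else sb.append(t);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Object> tokens = Arrays.asList(
                new Op("BT"), Arrays.asList("Hello", -120, "World"), new Op("TJ"), new Op("ET"));
        // prints: BT (Hello) Tj [-120] TJ (World) Tj ET
        System.out.println(render(split(tokens)));
    }
}
```

With PDFBox, I assume the same loop could run over the `List<Object>` returned by `PDFStreamParser`, testing for `COSArray`, `COSString` and `COSNumber` instead of `List`, `String` and `Number`. What I still do not know is how to derive the Td operands generically if the desired output has to use Td rather than a one-element TJ.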