I'm trying to use PDFBox 2.0 to blank out or delete a text pattern (in my case, I want to remove every "[QR]" occurrence from a PDF), but I can't find anything that works for me.
I tried iText as well, with the same result: nothing works.
The "[QR]" strings in my PDF were added by editing it after it was created; maybe that's why they don't appear as operands of the Tj operators?
My main:
replaceText(documentoPDF, "[QR]", "");
My method (I printed the Tj values and my pattern doesn't appear there):
    public void replaceText(PDDocument documentoPDF, String searchString, String replacement) throws IOException {
        for (PDPage page : documentoPDF.getPages()) {
            PDFStreamParser parser = new PDFStreamParser(page);
            parser.parse();
            List<?> tokens = parser.getTokens();
            for (int j = 0; j < tokens.size(); j++) {
                Object next = tokens.get(j);
                if (next instanceof Operator) {
                    Operator op = (Operator) next;
                    String pstring = "";
                    int prej = 0;
                    // Tj and TJ are the two operators that display strings in a PDF
                    if (op.getName().equals("Tj")) {
                        // Tj takes one operand, the string to display, so update that operand
                        COSString previous = (COSString) tokens.get(j - 1);
                        String string = previous.getString();
                        // Quote the search string: unquoted, "[QR]" is a regex character class
                        string = string.replaceFirst(java.util.regex.Pattern.quote(searchString),
                                java.util.regex.Matcher.quoteReplacement(replacement));
                        previous.setValue(string.getBytes());
                    } else if (op.getName().equals("TJ")) {
                        // TJ takes an array of strings and kerning numbers; join the strings
                        COSArray previous = (COSArray) tokens.get(j - 1);
                        for (int k = 0; k < previous.size(); k++) {
                            Object arrElement = previous.getObject(k);
                            if (arrElement instanceof COSString) {
                                COSString cosString = (COSString) arrElement;
                                String string = cosString.getString();
                                if (j == prej) {
                                    pstring += string;
                                } else {
                                    prej = j;
                                    pstring = string;
                                }
                            }
                        }
                        System.out.println(pstring.trim());
                        if (searchString.equals(pstring.trim())) {
                            // Put the replacement in the first element and drop the rest
                            COSString cosString2 = (COSString) previous.getObject(0);
                            cosString2.setValue(replacement.getBytes());
                            int total = previous.size() - 1;
                            for (int k = total; k > 0; k--) {
                                previous.remove(k);
                            }
                        }
                    }
                }
            }
            // now that the tokens are updated we will replace the page content stream.
            PDStream updatedStream = new PDStream(documentoPDF);
            OutputStream out = updatedStream.createOutputStream(COSName.FLATE_DECODE);
            ContentStreamWriter tokenWriter = new ContentStreamWriter(out);
            tokenWriter.writeTokens(tokens);
            out.close();
            page.setContents(updatedStream);
        }
        documentoPDF.save("resources\\resultado\\nuevo.pdf");
    }
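A side note on matching the literal "[QR]": Java's regex-based `String.replaceFirst` reads square brackets as a character class, so an unquoted "[QR]" matches a single `Q` or `R` rather than the four-character string. Below is a minimal sketch of the difference using only the JDK. The class name `LiteralReplaceDemo` and the helper `replaceFirstLiteral` are hypothetical names made up for illustration, not part of PDFBox.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LiteralReplaceDemo {

    // Hypothetical helper: replaces the first literal occurrence of `search`,
    // quoting both arguments so no character is treated as regex syntax.
    static String replaceFirstLiteral(String text, String search, String replacement) {
        return text.replaceFirst(Pattern.quote(search), Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String text = "A[QR]B";

        // Unquoted: "[QR]" is a character class, so only the first 'Q' is removed.
        System.out.println(text.replaceFirst("[QR]", ""));        // prints A[R]B

        // Quoted: the whole four-character token is removed.
        System.out.println(replaceFirstLiteral(text, "[QR]", "")); // prints AB
    }
}
```

This only affects how the string is matched once it has been found in the content stream; it does not explain why the pattern is missing from the printed Tj values.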
This is an example PDF with some [QR] patterns: http://www.mediafire.com/file/9w3kkc4yozwsfms/file
If someone can help, I would appreciate it.
I can upload my entire project if you need it.
Thanks in advance.