As already mentioned in comments, you don't use a single encoding to decode the whole byte array because each string object therein can be encoded differently.
You have to parse the byte array instruction by instruction and keep track of which font is currently selected. When you encounter a text drawing instruction, its string arguments have to be decoded according to the properties of that current font.
The properties to use may be its Encoding, its ToUnicode map, information from the underlying font file, and so on, depending on which font type it is and which optional information is given.
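To illustrate why one encoding cannot decode the whole stream, here is a minimal, self-contained sketch. It does not use the iText API: the tokenizer, the class `ContentStreamDecodeSketch`, and the per-font code-to-text maps (standing in for Encoding / ToUnicode information) are all hypothetical simplifications. It assumes single-byte codes, no escaped or nested parentheses, and only the `Tf` and `Tj` operators.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ContentStreamDecodeSketch {

    /** Splits a toy content stream into tokens: (literal), <hex>, /Name, number, operator. */
    static List<String> tokenize(String content) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < content.length()) {
            char c = content.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            int end;
            if (c == '(') {
                end = content.indexOf(')', i) + 1;   // assumption: no nested or escaped parentheses
            } else if (c == '<') {
                end = content.indexOf('>', i) + 1;
            } else {
                end = i;
                while (end < content.length() && !Character.isWhitespace(content.charAt(end))
                        && content.charAt(end) != '(' && content.charAt(end) != '<') end++;
            }
            if (end <= i) throw new IllegalArgumentException("Unterminated string at " + i);
            tokens.add(content.substring(i, end));
            i = end;
        }
        return tokens;
    }

    /** Returns the character codes of a string operand (assumption: one byte per code). */
    static int[] stringCodes(String token) {
        String body = token.substring(1, token.length() - 1);
        if (token.startsWith("(")) return body.chars().toArray();
        int[] codes = new int[body.length() / 2];
        for (int k = 0; k < codes.length; k++)
            codes[k] = Integer.parseInt(body.substring(2 * k, 2 * k + 2), 16);
        return codes;
    }

    /** Walks the instructions, tracks the current font, and decodes each Tj operand with it. */
    static List<String> extractText(String content, Map<String, Map<Integer, String>> fonts) {
        List<String> result = new ArrayList<>();
        List<String> operands = new ArrayList<>();
        Map<Integer, String> currentFont = null;
        for (String token : tokenize(content)) {
            boolean isOperand = token.startsWith("(") || token.startsWith("<")
                    || token.startsWith("/") || token.matches("-?\\d+(\\.\\d+)?");
            if (isOperand) { operands.add(token); continue; }
            if (token.equals("Tf")) {                       // font selection: /Name size Tf
                currentFont = fonts.get(operands.get(0).substring(1));
            } else if (token.equals("Tj")) {                // text drawing: string Tj
                StringBuilder text = new StringBuilder();
                for (int code : stringCodes(operands.get(0)))
                    text.append(currentFont == null ? "\uFFFD" : currentFont.getOrDefault(code, "\uFFFD"));
                result.add(text.toString());
            }
            operands.clear();
        }
        return result;
    }

    /** Two made-up fonts: F1 maps printable ASCII to itself, F2 uses a custom mapping. */
    static Map<String, Map<Integer, String>> demoFonts() {
        Map<Integer, String> f1 = new HashMap<>();
        for (int code = 32; code < 127; code++) f1.put(code, String.valueOf((char) code));
        Map<Integer, String> f2 = new HashMap<>();
        f2.put(0x48, "O");
        f2.put(0x69, "K");
        Map<String, Map<Integer, String>> fonts = new HashMap<>();
        fonts.put("F1", f1);
        fonts.put("F2", f2);
        return fonts;
    }

    public static void main(String[] args) {
        String content = "BT /F1 12 Tf <4869> Tj /F2 12 Tf <4869> Tj ET";
        System.out.println(extractText(content, demoFonts())); // prints [Hi, OK]
    }
}
```

The same two bytes `48 69` come out as "Hi" under the first font and "OK" under the second, which is why the decoding has to follow the font switches. A real implementation would additionally have to handle composite fonts with multi-byte codes, the `TJ`, `'` and `"` operators, and graphics state save/restore.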
But even after doing so, you cannot simply replace the text in the original PDF. This answer (to a similar question in the context of the PDFBox library) illustrates a number of hindrances, in particular fonts (which may be subset-embedded only) not containing the glyphs you need, and unclear layout considerations.
To get an idea of how to address the former issues, have a look at the following answers:
- This answer provides `PdfContentStreamEditor` classes for Java and C# which can serve as base classes to edit content stream instructions; these classes in particular also keep track of the graphics state including the current text state parameters.
- This answer (the OP unfortunately deleted the question, so you need some reputation to have permission to read the answer) uses that `PdfContentStreamEditor` Java class to implement a text remover for text in a specific font and another one for text with a large font size.
- This answer uses that `PdfContentStreamEditor` C# class to implement a `BigTextRemover` which recognizes text by its font size and removes it.
- This answer describes what to do to prevent `PdfContentStreamEditor` issues with rotated documents.
- This answer also describes what to do to prevent `PdfContentStreamEditor` issues with rotated documents and additionally fixes a bug in the `PdfContentStreamEditor`.
- This answer uses that `PdfContentStreamEditor` Java class to implement an editor that changes the color of black text to green.
- This answer provides a port of the `PdfContentStreamEditor` to iText 7 / Java as `PdfCanvasEditor` and shows example usages removing text by font name or font size and re-coloring black text to green.
- This answer uses that `PdfContentStreamEditor` C# class to implement a `TextRemover` removing all text drawing instructions.
- This answer uses that `PdfContentStreamEditor` Java class to implement a `SimpleTextRemover` which recognizes a search text in text drawing instructions, removes it, and returns the positions at which the text was removed (under some restrictions explained there). At those positions one can then draw new text.
Studying the `PdfContentStreamEditor` from the first answer (with the fix from the fifth answer) and the `SimpleTextRemover` gives you an idea of how to find text. The other answers might be interesting in general if you want to edit PDFs in different ways.
As far as replacing goes, keep in mind that fonts may be incomplete. In general you therefore cannot simply replace the contents of the string arguments of text drawing instructions; instead you may have to add a new font and switch fonts for the replacement text drawing instruction.
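A rough sketch of that decision, again without any iText calls: the class `ReplacementFontSketch`, its methods, and the idea of modelling a subset font as a plain set of available characters are hypothetical. The sketch also assumes that the fallback font has already been added to the page resources under the given name, and that the text can be written as an unescaped literal string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class ReplacementFontSketch {

    /** True if every character of the text has a glyph in the (possibly subset) font. */
    static boolean canShow(Set<Character> glyphs, String text) {
        return text.chars().allMatch(c -> glyphs.contains((char) c));
    }

    /**
     * Builds the instructions that draw the replacement text: reuse the current font
     * if it has all glyphs, otherwise switch to the fallback font and back again.
     */
    static List<String> replacementInstructions(String text, String currentFont,
                                                Set<Character> currentGlyphs,
                                                String fallbackFont, int fontSize) {
        List<String> out = new ArrayList<>();
        if (canShow(currentGlyphs, text)) {
            out.add("(" + text + ") Tj");
        } else {
            out.add("/" + fallbackFont + " " + fontSize + " Tf"); // switch to the complete font
            out.add("(" + text + ") Tj");
            out.add("/" + currentFont + " " + fontSize + " Tf");  // restore for following text
        }
        return out;
    }

    public static void main(String[] args) {
        Set<Character> subset = Set.of('H', 'e', 'l', 'o'); // glyphs embedded in font F1
        System.out.println(replacementInstructions("Hello", "F1", subset, "F9", 12));
        System.out.println(replacementInstructions("World", "F1", subset, "F9", 12));
    }
}
```

Real code would additionally have to encode the replacement string according to the encoding of whichever font ends up being used, and account for the different glyph widths when positioning the text that follows.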