I am trying to replace a placeholder string in a PDF programmatically (simple example: I want to change the string "SEI" to "1"). I can currently access the content of the PDF, convert it to a stream and/or buffer, and convert that buffer back to a PDF, but since I am not yet able to manipulate the stream/buffer correctly, I am basically only copying the PDF right now. When I use buffer.toString() and run a string replace of "SEI" with "1" on the result, the buffer does hold "1" where "SEI" was previously, but the PDF does not display it correctly (it only shows a ? character in a square), probably because I am not manipulating the buffer correctly.
I am using hummus.js to access the PDF data. The font of the relevant placeholder is "Frutiger Next Pro Bold" (if that matters).
Code:
const hummus = require('hummus')

// replace() and strToByteArray() are helper functions defined elsewhere (not shown).
async function replacetext(filePath) {
  const modPdfWriter = hummus.createWriterToModify(filePath, {modifiedFilePath: `${filePath}-modified.pdf`, compress: false})
  const numPages = modPdfWriter.createPDFCopyingContextForModifiedFile().getSourceDocumentParser().getPagesCount()
  for (let page = 0; page < numPages; page++) {
    const copyingContext = modPdfWriter.createPDFCopyingContextForModifiedFile()
    const objectsContext = modPdfWriter.getObjectsContext()
    const pageObject = copyingContext.getSourceDocumentParser().parsePage(page)
    const textStream = copyingContext.getSourceDocumentParser().queryDictionaryObject(pageObject.getDictionary(), 'Contents')
    const textObjectID = pageObject.getDictionary().toJSObject().Contents.getObjectID()

    // Read the raw bytes of the page's content stream
    let data = []
    const readStream = copyingContext.getSourceDocumentParser().startReadingFromStream(textStream)
    while (readStream.notEnded()) {
      const readData = readStream.read(10000)
      data = data.concat(readData)
    }

    // Convert the bytes to a string and replace the placeholder
    const redactedPdfPageAsString = Buffer.from(data).toString()
    // const replacedBuffer = redactedPdfPageAsString.replace("SEI", "1")
    const replacedBuffer = replace(redactedPdfPageAsString, "SEI", "1")

    // Overwrite the original content stream object with the modified content
    objectsContext.startModifiedIndirectObject(textObjectID)
    const stream = objectsContext.startUnfilteredPDFStream()
    stream.getWriteStream().write(strToByteArray(replacedBuffer))
    objectsContext.endPDFStream(stream)
    objectsContext.endIndirectObject()
  }
  modPdfWriter.end()
  hummus.recrypt(`${filePath}-modified.pdf`, filePath)
}
I also tried Node packages like stream-replace and buffer-replace, but they did not work.
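One thing that may be worth checking with the buffer.toString() approach: toString() decodes as UTF-8 by default, so any bytes in the stream that are not valid UTF-8 are changed by the round trip, while 'latin1' maps every byte to exactly one character and back. The following is only a minimal sketch of that difference using Node's built-in Buffer; the sample bytes are made up for illustration, and whether this plays any role for this particular PDF is an assumption.

```javascript
// Made-up sample: "(SEI) " followed by two bytes that are not valid UTF-8 on their own.
const original = Buffer.from([0x28, 0x53, 0x45, 0x49, 0x29, 0x20, 0xe9, 0xff]);

// Default toString() decodes as UTF-8; invalid bytes become U+FFFD and are
// written back as different bytes when the string is re-encoded.
const viaUtf8 = Buffer.from(original.toString().replace('SEI', '1'));

// latin1 is a one-byte-per-character mapping, so untouched bytes survive the round trip.
const viaLatin1 = Buffer.from(
  original.toString('latin1').replace('SEI', '1'),
  'latin1'
);

// What the stream should look like if only the placeholder changed.
const expected = Buffer.from([0x28, 0x31, 0x29, 0x20, 0xe9, 0xff]);

console.log('utf8 round trip preserved bytes:  ', viaUtf8.equals(expected));
console.log('latin1 round trip preserved bytes:', viaLatin1.equals(expected));
```

If the content stream is pure ASCII, both variants give the same result and the encoding is not the problem.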
This is an excerpt of the buffer that contains the string "SEI":
/Span <</Lang (de-DE)/MCID 0 >>BDC BT 0 0 0 1 k /GS0 gs /T1_0 1 Tf 10 0 0 10 25.5118 814.9606 Tm (SEI)Tj ET EMC /Span <</Lang (de-DE)/MCID 1 >>BDC BT 10 0 0 10 39.5317 814.9606 Tm (-)Tj ET EMC /Span <</Lang (de-DE)/MCID 2 >>BDC BT 0 1 1 0 k /
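In the excerpt above, the placeholder appears as the literal-string operand of a Tj (show text) operator, i.e. (SEI)Tj. A rough sketch of a replacement that only touches such operands, rather than every occurrence of "SEI" in the stream, could look like the code below. The helper name replaceTjOperand is hypothetical, and the sketch assumes the text is stored as a plain literal string exactly as in the excerpt (not hex-encoded or split across several operators).

```javascript
// Shortened copy of the content-stream excerpt quoted above.
const content =
  '/Span <</Lang (de-DE)/MCID 0 >>BDC BT 0 0 0 1 k /GS0 gs /T1_0 1 Tf ' +
  '10 0 0 10 25.5118 814.9606 Tm (SEI)Tj ET EMC ' +
  '/Span <</Lang (de-DE)/MCID 1 >>BDC BT 10 0 0 10 39.5317 814.9606 Tm (-)Tj ET EMC';

// Hypothetical helper: replace a Tj operand only when it is exactly `placeholder`.
function replaceTjOperand(stream, placeholder, value) {
  // Escape regex metacharacters in the placeholder.
  const escapedPlaceholder = placeholder.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // Backslashes and parentheses must be escaped inside a PDF literal string.
  const pdfValue = value.replace(/[\\()]/g, '\\$&');
  const pattern = new RegExp('\\(' + escapedPlaceholder + '\\)\\s*Tj', 'g');
  // A replacer function avoids special handling of "$" in the replacement text.
  return stream.replace(pattern, () => '(' + pdfValue + ')Tj');
}

const replaced = replaceTjOperand(content, 'SEI', '1');
console.log(replaced);
```

Even with a byte-exact replacement, the new text can only be drawn if the font resource /T1_0 contains a glyph for it. If the font is embedded as a subset that does not include "1", that may be a reason for a missing-glyph box, though this is an assumption that would need to be checked against the actual file.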