The issue in your PDF reminds very much of the issue discussed in the last section "Yet another issue with parent tree entries" in this answer to the question “Find Tag from Selection” is not working in tagged pdf? by fascinating coder:
In your parent tree you do not reference the actual parent structure element of the MCID but you reference a new structure tree node which claims to have the actual parent node from the structure hierarchy as its own parent (not actually being one of its kids) and also claims to have the MCID in question as kid.
Instead you should simply reference the actual parent structure element of the MCID.
As your question title asks how to heal inconsistent parent tree mappings in a PDF created by pdfBox, here an approach to fix your parent tree by rebulding the parent tree from the structure tree.
First recursively collect MCIDs and their parent structure tree elements by page, e.g. using a method like this:
void collect(PDPage page, PDStructureNode node, Map<PDPage, Map<Integer, PDStructureNode>> parentsByPage) {
COSDictionary pageDictionary = node.getCOSObject().getCOSDictionary(COSName.PG);
if (pageDictionary != null) {
page = new PDPage(pageDictionary);
}
for (Object object : node.getKids()) {
if (object instanceof COSArray) {
for (COSBase base : (COSArray) object) {
if (base instanceof COSDictionary) {
collect(page, PDStructureNode.create((COSDictionary) base), parentsByPage);
} else if (base instanceof COSNumber) {
setParent(page, node, ((COSNumber)base).intValue(), parentsByPage);
} else {
System.out.printf("?%s\n", base);
}
}
} else if (object instanceof PDStructureNode) {
collect(page, (PDStructureNode) object, parentsByPage);
} else if (object instanceof Integer) {
setParent(page, node, (Integer)object, parentsByPage);
} else {
System.out.printf("?%s\n", object);
}
}
}
(RebuildParentTreeFromStructure method)
with this helper method
void setParent(PDPage page, PDStructureNode node, int mcid, Map<PDPage, Map<Integer, PDStructureNode>> parentsByPage) {
if (node == null) {
System.err.printf("Cannot set null as parent of MCID %s.\n", mcid);
} else if (page == null) {
System.err.printf("Cannot set parent of MCID %s for null page.\n", mcid);
} else {
Map<Integer, PDStructureNode> parents = parentsByPage.get(page);
if (parents == null) {
parents = new HashMap<>();
parentsByPage.put(page, parents);
}
if (parents.containsKey(mcid)) {
System.err.printf("MCID %s already has a parent. New parent rejected.\n", mcid);
} else {
parents.put(mcid, node);
}
}
}
(RebuildParentTreeFromStructure helper method)
and then rebuild based on the collected information:
void rebuildParentTreeFromData(PDStructureTreeRoot root, Map<PDPage, Map<Integer, PDStructureNode>> parentsByPage) {
int parentTreeMaxkey = -1;
Map<Integer, COSArray> numbers = new HashMap<>();
for (Map.Entry<PDPage, Map<Integer, PDStructureNode>> entry : parentsByPage.entrySet()) {
int parentsId = entry.getKey().getCOSObject().getInt(COSName.STRUCT_PARENTS);
if (parentsId < 0) {
System.err.printf("Page without StructsParents. Ignoring %s MCIDs.\n", entry.getValue().size());
} else {
if (parentTreeMaxkey < parentsId)
parentTreeMaxkey = parentsId;
COSArray array = new COSArray();
for (Map.Entry<Integer, PDStructureNode> subEntry : entry.getValue().entrySet()) {
array.growToSize(subEntry.getKey() + 1);
array.set(subEntry.getKey(), subEntry.getValue());
}
numbers.put(parentsId, array);
}
}
PDNumberTreeNode numberTreeNode = new PDNumberTreeNode(PDParentTreeValue.class);
numberTreeNode.setNumbers(numbers);
root.setParentTree(numberTreeNode);
root.setParentTreeNextKey(parentTreeMaxkey + 1);
}
(RebuildParentTreeFromStructure method)
Applied like this
PDDocument document = PDDocument.load(SOURCE));
rebuildParentTree(document);
document.save(RESULT);
(RebuildParentTreeFromStructure test testTestdatei
)
PAC3 and Adobe Preflight (at least of my old Acrobat 9.5) go all green for the result:


Beware: This is no generic parent tree rebuilder yet. It is made to work for the test file at hand with a specific kind of structure tree nodes and content only in page content streams. For a generic tool it has to learn to cope with other kinds, too, and to also process e.g. marked content in embedded XObjects.