The script below is able to remove all images from a PDF file using CAM::PDF. The output, however, is corrupt. PDF readers are nonetheless able to open it, but they complain about errors. For instance, mupdf says:
error: no XObject subtype specified
error: cannot draw xobject/image
warning: Ignoring errors during rendering
mupdf: warning: Errors found on page
Now, the CAM::PDF page at CPAN (here) lists the deleteObject() method under "Deeper utilities", presumably meaning that it's not intended for public usage. Moreover, it warns that:
This function does NOT take care of dependencies on this object.
My question is: what is the right way to remove objects from a PDF file using CAM::PDF? If the issue has to do with dependencies, how can I remove an object while taking care of its dependencies?
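To make the dependency concern concrete, here is how I currently picture it, as a toy model in plain Perl. The hash layout and the dangling_refs helper are made up for illustration and are not CAM::PDF's data structures or API. The assumption is that deleting an object leaves any reference to it (for example from a page's resources) pointing at nothing, which would be consistent with the mupdf errors above.

```perl
use strict;
use warnings;

# Toy model only: plain hashes standing in for PDF objects.
# These are NOT CAM::PDF's internal structures.
my %objects = (
    1 => { Type => 'Page',    refs => [ 2, 3 ] },    # page refers to objects 2 and 3
    2 => { Type => 'XObject', Subtype => 'Image', refs => [] },
    3 => { Type => 'Font',    refs => [] },
);

# Delete every image object, mirroring what the script in this question does.
for my $num ( sort { $a <=> $b } keys %objects ) {
    my $obj = $objects{$num};
    delete $objects{$num}
        if $obj->{Type} eq 'XObject' && ( $obj->{Subtype} // '' ) eq 'Image';
}

# Return "holder -> target" for each reference whose target no longer exists.
sub dangling_refs {
    my ($objs) = @_;
    my @dangling;
    for my $num ( sort { $a <=> $b } keys %$objs ) {
        for my $target ( @{ $objs->{$num}{refs} } ) {
            push @dangling, "$num -> $target" unless exists $objs->{$target};
        }
    }
    return @dangling;
}

print "dangling reference: $_\n" for dangling_refs( \%objects );
```

In this model the page still refers to the deleted image, so a clean removal would presumably also have to drop or rewrite that reference.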
For how to remove images from a PDF using other tools, see a related question here.
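use strict;
use warnings;
use CAM::PDF;

# Open the PDF named on the command line.
my $pdf = CAM::PDF->new( shift ) or die $CAM::PDF::errstr;

# Walk every object number in the cross-reference table.
foreach my $objnum ( sort { $a <=> $b } keys %{ $pdf->{xref} } ) {
    my $xobj = $pdf->dereference( $objnum );
    if ( $xobj->{value}->{type} eq 'dictionary' ) {
        my $im = $xobj->{value}->{value};
        # An image is a dictionary with /Type /XObject and /Subtype /Image.
        if (
            defined $im->{Type} and defined $im->{Subtype}
            and $pdf->getValue( $im->{Type} ) eq 'XObject'
            and $pdf->getValue( $im->{Subtype} ) eq 'Image'
        )
        {
            # Deletes the object itself; references to it are left in place.
            $pdf->deleteObject( $objnum );
        }
    }
}

# Write the modified PDF to standard output.
$pdf->cleanoutput( '-' );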