Unfortunately, the R CairoPDF device inserts an "/S /Transparency" entry into its output, even if no transparency is used. This makes the resulting graph fail PDF/A validation. (In turn, KDP no longer accepts PDF files that embed these graphs.)

The good news is that I can decompress the R output via `qpdf R-output.pdf --qdf editable.pdf`, then locate this object

%% Original object ID: 2 0
4 0 obj
<<
  /Contents 5 0 R
  /Group <<
    /CS /DeviceRGB
    /S /Transparency
    /I true
    /Type /Group
  >>
  /MediaBox [
    0
    0
    432
    432
  ]
  /Parent 3 0 R
  /Resources 7 0 R
  /Type /Page
>>
endobj

and remove the /S /Transparency line. (What do /S and /Transparency mean?)

Unfortunately, although the Skim macOS PDF viewer still happily displays the edited file correctly, other viewers (and the veraPDF validator) tell me that the edited file is no longer a valid PDF and can no longer be parsed.

I tried a couple of other deletions, but never managed to figure out what it takes to remove the transparency and still retain the validity of the pdf file.

Could someone please tell me whether this can be done (and the file recompressed afterwards)?

UPDATE: Full PDF file:

%PDF-1.5
%ø˜¢˛
%QDF-1.0

%% Original object ID: 7 0
1 0 obj
<<
  /Pages 3 0 R
  /Type /Catalog
>>
endobj

%% Original object ID: 6 0
2 0 obj
<<
  /CreationDate (D:20230414233642-07'00)
  /Producer (cairo 1.16.0 \(https://cairographics.org\))
>>
endobj

%% Original object ID: 1 0
3 0 obj
<<
  /Count 1
  /Kids [
    4 0 R
  ]
  /Type /Pages
>>
endobj

%% Page 1
%% Original object ID: 2 0
4 0 obj
<<
  /Contents 5 0 R 
  /Group <<
    /CS /DeviceRGB
    /I true
    /S /Transparency
    /Type /Group
  >>
  /MediaBox [
    0
    0
    432
    432
  ]
  /Parent 3 0 R
  /Resources 7 0 R
  /Type /Page
>>
endobj

%% Contents for page 1
%% Original object ID: 4 0
5 0 obj
<<
  /Length 6 0 R
>>
stream
1 0 0 -1 0 432 cm
q
1 1 1 rg /a0 gs
0 0 432 432 re f
0 0 0 RG 1 w
1 J
1 j
[] 0.0 d
10 M 233.602 208.801 m 233.602 213.066 227.199 213.066 227.199 208.801 c 227.199
 204.535 233.602 204.535 233.602 208.801 c S
Q
endstream
endobj

6 0 obj
211
endobj

%% Original object ID: 3 0
7 0 obj
<<
  /ExtGState <<
    /a0 <<
      /CA 1
      /ca 1
    >>
  >>
>>
endobj

xref
0 8
0000000000 65535 f 
0000000052 00000 n 
0000000133 00000 n 
0000000280 00000 n 
0000000389 00000 n 
0000000660 00000 n 
0000000926 00000 n 
0000000973 00000 n 
trailer <<
  /Info 2 0 R
  /Root 1 0 R
  /Size 8
  /ID [<9f36ebb4b60ad561646b02098fc82dcf><9f36ebb4b60ad561646b02098fc82dcf>]
>>
startxref
1058
%%EOF

This file was created in R with:

library(Cairo)  # provides the CairoPDF() device
CairoPDF("editme.pdf")
plot(1, axes=FALSE, xlab="", ylab="")
dev.off()

and then `qpdf editme.pdf --qdf editmeasc.pdf`. (I have now also tried removing the entire /Group entry.)

An easy way to verify the problem is:

\verapdf --verbose --format text editmeasc.pdf
Apr 14, 2023 11:43:47 PM org.verapdf.processor.ProcessorImpl process
WARNING: /Users/ivo/Library/CloudStorage/Dropbox/Apps/Overleaf/global-climate-change/plots/editmeasc.pdf doesn't appear to be a valid PDF.
/Users/ivo/Library/CloudStorage/Dropbox/Apps/Overleaf/global-climate-change/plots/editmeasc.pdf does not appear to be a valid PDF file and could not be parsed.
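
Besides running veraPDF, a quick text search can show whether the transparency group marker is still present in a file. The following is only a sketch: `has_transparency_group` is a hypothetical helper name, and it assumes the file is in the expanded form written by `qpdf --qdf`, so that dictionaries appear as plain text. It looks only for the `/S /Transparency` marker and does not replace a real PDF/A validation.

```shell
#!/bin/sh
# Hypothetical helper (not part of qpdf or veraPDF).
# Exit status 0 if the QDF-form file "$1" still contains a transparency
# group marker, non-zero otherwise.
has_transparency_group() {
  # LC_ALL=C and -a make grep treat the file as bytes/text even though a
  # PDF contains binary data.
  LC_ALL=C grep -a -q '/S[[:space:]]*/Transparency' "$1"
}
```

For example, after `qpdf --qdf final.pdf check.pdf`, running `has_transparency_group check.pdf && echo "transparency group still present"` prints the message only if the marker survived the edit.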
ivo Welch
  • The /Group is a transparency group. So remove the whole /Group entry. If still no luck, can you provide a small example file which exhibits this problem? – johnwhitington Apr 14 '23 at 21:03
  • I took your "qdf" file, and reconstructed it using `cpdf in.pdf -decompress -o fixed.pdf`. Then I ran VeraPDF, which gave four errors, one to do with transparency. So I removed `/Group<>` and then ran it through `cpdf` again. The result still fails VeraPDF, but nothing to do with transparency. Recompressing is as simple as running it through `cpdf` without `-decompress`. You could round-trip it through JSON using `-output-json` and `-j` if you wish to do the group-removal automatically with any program which can process JSON. – johnwhitington Apr 15 '23 at 10:53
  • mille grazie. KJ. (not sure what 'cpdf' is.) The steps that did work are [1] `qpdf in.pdf --qdf editable.pdf`; [2] take out the entire `/Group << /Type /Group /S /Transparency /I true /CS /DeviceRGB >>` from editable.pdf. This file still fails verapdf as a non-valid PDF. [3] run `qpdf editable.pdf final.pdf`, ignoring the warnings. Hooray: final.pdf passes verapdf as being valid, though still not being PDF/A. It is now a non-PDF/A-compliant PDF but without transparency. This is all I hope I will need for my printer's requirements. Could you answer, so I can accept this as your solution? – ivo Welch Apr 15 '23 at 20:34
  • I should add that using the Ghostscript pdfwrite device to create PDF/A produces raster images at a particular dpi. This makes them much less desirable, for many reasons, than the hack of removing the transparency from the PDF generated by CairoPDF in R. – ivo Welch Apr 15 '23 at 21:58
  • Another answer is at https://stackoverflow.com/questions/76011224/is-there-an-r-4-2-1-pdf-a-compliant-pdf-output-device – ivo Welch Apr 17 '23 at 06:59
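
The three steps reported as working in the comments (expand with qpdf, delete the whole /Group dictionary, run the result through qpdf again) could be scripted roughly as follows. This is a sketch under stated assumptions, not a tested general tool: `strip_group` and `remove_transparency_group` are hypothetical helper names, and the `sed` range assumes the QDF layout shown in the listing above, where the entry opens with a line `  /Group <<` and closes with `  >>` at the same two-space indent. It removes every /Group dictionary found at that indent, so it is only suitable when no group is wanted at all.

```shell
#!/bin/sh
# Hypothetical helpers (not part of qpdf). Assumes QDF text from `qpdf --qdf`.

# Print the QDF file "$1" with each "  /Group << ... >>" block removed.
# LC_ALL=C keeps sed from choking on the binary bytes inside a PDF.
strip_group() {
  LC_ALL=C sed '/^  \/Group <<[[:space:]]*$/,/^  >>[[:space:]]*$/d' "$1"
}

# Round trip: expand "$1", strip the group, write the rebuilt PDF to "$2".
remove_transparency_group() {
  in_pdf=$1
  out_pdf=$2
  tmp_dir=$(mktemp -d) || return 1
  qpdf --qdf "$in_pdf" "$tmp_dir/editable.pdf" || { rm -rf "$tmp_dir"; return 1; }
  strip_group "$tmp_dir/editable.pdf" > "$tmp_dir/nogroup.pdf"
  # Deleting lines shifts byte offsets, so qpdf warns about the xref table
  # and reconstructs it. qpdf exits with status 3 for "warnings only".
  qpdf "$tmp_dir/nogroup.pdf" "$out_pdf"
  status=$?
  rm -rf "$tmp_dir"
  [ "$status" -eq 0 ] || [ "$status" -eq 3 ]
}
```

After sourcing the script, `remove_transparency_group editme.pdf final.pdf` would mirror steps [1] to [3]. The intermediate file is expected to fail validation, as noted in the comment above; only the output of the second qpdf run has a rebuilt cross-reference table.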
