
I am trying to read PDF document properties from Node.js. I couldn't find any Node module for reading document properties. I can read the file metadata with file-metadata, but it only gives basic properties. I want to read properties such as the Document Restrictions Summary (please see the attached image for reference).

JosephA
naresh

2 Answers


Inspired by @DietrichvonSeggern's suggestion, I wrote a small Node script.

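const { spawnSync } = require('child_process');

// Ask exiftool for the bare UserAccess value of test.pdf (-b prints it without the tag name).
const { stdout } = spawnSync('exiftool',
  ['-b', '-UserAccess', 'test.pdf'],
  { encoding: 'ascii' });
// If the output is not a number, fall back to a mask with every permission bit set.
const bits = (parseInt(stdout, 10) || 0b111111111110);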

const perms = {
  'Print': 1 << 2,
  'Modify': 1 << 3,
  'Copy': 1 << 4,
  'Annotate': 1 << 5,
  'Fill forms': 1 << 8,
  'Extract': 1 << 9,
  'Assemble': 1 << 10,
  'Print high-res': 1 << 11
};

Object.keys(perms).forEach((title) => {
  const bit = perms[title];
  const yesno = (bits & bit) ? 'YES' : 'NO';
  console.log(`${title} => ${yesno}`);
});

It will print something like:

Print => YES
Modify => NO
Copy => NO
Annotate => NO
Fill forms => NO
Extract => NO
Assemble => NO
Print high-res => YES

You need exiftool installed on your system, and you should add the necessary error checks to this script.
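As a rough sketch of what those error checks could look like: the helper names `decodeUserAccess` and `readUserAccess` below are hypothetical, and the fallback mask is the same one used in the script above. It assumes exiftool exits with a non-zero status on failure and prints nothing when the tag is absent.

```javascript
const { spawnSync } = require('child_process');

// Permission bit masks, same values as in the script above.
const PERMS = {
  'Print': 1 << 2,
  'Modify': 1 << 3,
  'Copy': 1 << 4,
  'Annotate': 1 << 5,
  'Fill forms': 1 << 8,
  'Extract': 1 << 9,
  'Assemble': 1 << 10,
  'Print high-res': 1 << 11
};

// Same fallback as above: every permission bit set.
const ALL_ALLOWED = 0b111111111110;

// Hypothetical helper: turn a numeric UserAccess value into { Print: true, ... }.
function decodeUserAccess(bits) {
  const access = {};
  for (const [name, mask] of Object.entries(PERMS)) {
    access[name] = (bits & mask) !== 0;
  }
  return access;
}

// Hypothetical helper: run exiftool and throw instead of failing silently.
function readUserAccess(file) {
  const result = spawnSync('exiftool', ['-b', '-UserAccess', file],
    { encoding: 'ascii' });
  if (result.error) {
    // For example ENOENT when exiftool is not installed or not on PATH.
    throw new Error(`Could not run exiftool: ${result.error.message}`);
  }
  if (result.status !== 0) {
    const stderr = (result.stderr || '').trim();
    throw new Error(`exiftool exited with status ${result.status}: ${stderr}`);
  }
  const parsed = parseInt(result.stdout, 10);
  // No numeric output: assume no restrictions, as the original script does.
  return decodeUserAccess(Number.isNaN(parsed) ? ALL_ALLOWED : parsed);
}
```

Calling `readUserAccess('test.pdf')` would then either return the permissions object or throw with a message that says what went wrong.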

ExifTool UserAccess tag reference.


Slightly modified:

const perms = {
  'Print': 1 << 2,
  'Modify': 1 << 3,
  'Copy': 1 << 4,
  'Annotate': 1 << 5,
  'FillForms': 1 << 8,
  'Extract': 1 << 9,
  'Assemble': 1 << 10,
  'PrintHighRes': 1 << 11
};

const access = {};
Object.keys(perms).forEach((perm) => {
  const bit = perms[perm];
  access[perm] = !!(bits & bit);
});

console.log(access);

This will produce:

{
  Print: true,
  Modify: false,
  Copy: false,
  Annotate: false,
  FillForms: false,
  Extract: false,
  Assemble: false,
  PrintHighRes: true
}
Styx

Have you considered using exiftool? You would have to integrate it into Node.js, but as far as I can see it provides more or less all the data you are looking for.
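One possible way to do that integration, as a minimal sketch: call the exiftool command-line program from Node and parse its JSON output. The helper names `firstRecord` and `readTags` are hypothetical, and this assumes exiftool is installed and on the PATH. `-json` selects JSON output and `-n` asks for numeric values instead of human-readable descriptions.

```javascript
const { execFile } = require('child_process');
const { promisify } = require('util');

const execFileAsync = promisify(execFile);

// exiftool -json prints an array with one object per input file.
// Hypothetical helper: parse that output and return the first object.
function firstRecord(jsonText) {
  const records = JSON.parse(jsonText);
  if (!Array.isArray(records) || records.length === 0) {
    throw new Error('exiftool returned no records');
  }
  return records[0];
}

// Hypothetical helper: read all tags exiftool reports for one file.
// The promise rejects if exiftool is missing or exits with an error.
async function readTags(file) {
  const { stdout } = await execFileAsync('exiftool', ['-json', '-n', file]);
  return firstRecord(stdout);
}
```

Usage would look like `readTags('test.pdf').then((tags) => console.log(tags))`. Which keys appear in the result depends on the file and on what exiftool finds in it.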

  • That's a really good suggestion. `exiftool` can extract the [`UserAccess` tag](https://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/PDF.html), which holds this information. – Styx Jan 18 '19 at 14:58