Unfortunately, Google APIs do not yet provide a method for directly retrieving the total number of pages from a PDF file. So how about the following workarounds? Please choose the one that fits your situation.
Workaround 1:
This workaround retrieves the number of content streams in the PDF file. A content stream appears as the /Contents attribute.
When this is applied to your script, it becomes as follows.
Modified script:
function getNumberofPages() {
  var myFolder = DriveApp.getFoldersByName("Test").next();
  var files = myFolder.searchFiles('title contains ".PDF"');
  while (files.hasNext()) {
    var file = files.next();
    // Count the occurrences of "/Contents" in the raw PDF data.
    var n = file.getBlob().getDataAsString().split("/Contents").length - 1;
    Logger.log("fileName: %s, totalPages: %s", file.getName(), n);
  }
}
- Although this workaround is simple, it might not work for all PDF files, as @mkl says. If it cannot be used for your PDF files, how about Workaround 2 below?
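To make the counting step in Workaround 1 concrete, here is a minimal sketch that runs on a plain string instead of a Drive file. The function name `countContentsMarkers` and the sample text are hypothetical, and the sample is only PDF-like, not a valid PDF. It also hints at why the count can be off: the marker is only found when the page objects are stored as readable text, and any other occurrence of the same substring is counted too.

```javascript
// Hypothetical helper: counts how many times "/Contents" appears in the text.
function countContentsMarkers(pdfText) {
  return pdfText.split("/Contents").length - 1;
}

// PDF-like sample text (not a valid PDF), with two page objects.
const sampleText = [
  "1 0 obj << /Type /Page /Contents 4 0 R >> endobj",
  "2 0 obj << /Type /Page /Contents 5 0 R >> endobj",
  "3 0 obj << /Type /Pages /Kids [1 0 R 2 0 R] /Count 2 >> endobj",
].join("\n");

const markerCount = countContentsMarkers(sampleText);
console.log(markerCount);
```

In Apps Script, the argument would be the string returned by `getDataAsString()` on the file's blob.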
Workaround 2:
This workaround uses an external API to retrieve the total pages of a PDF file. I used the Split PDF API of ConvertAPI. The total number of pages is taken from the number of split files. To use this API, please check ConvertAPI and retrieve your secret key.
Modified script:
function getNumberofPages() {
  var url = "https://v2.convertapi.com/convert/pdf/to/split?Secret=#####"; // Please set your secret key.
  var myFolder = DriveApp.getFoldersByName("Test").next();
  var files = myFolder.searchFiles('title contains ".PDF"');
  while (files.hasNext()) {
    var file = files.next();
    var options = {
      method: "post",
      payload: {File: file.getBlob()},
    };
    var res = UrlFetchApp.fetch(url, options);
    res = JSON.parse(res.getContentText());
    // The number of split files is used as the number of pages.
    Logger.log("fileName: %s, totalPages: %s", file.getName(), res.Files.length);
  }
}
- I'm not sure about the number and size of your PDF files, so I didn't use the fetchAll method here. This is a sample script, so please modify it for your situation.
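Workaround 2 reads the page count from the length of the `Files` array in the parsed response. As a rough sketch of that step with a basic sanity check added, the following runs on a plain JSON string. The helper name `countSplitFiles` and the sample response are hypothetical; the sample only mimics the `Files` array that the script above relies on, and the `FileName` values are made up.

```javascript
// Hypothetical helper: returns the number of entries in the "Files" array
// of a JSON response body, or throws if the array is missing.
function countSplitFiles(responseText) {
  const res = JSON.parse(responseText);
  if (!res || !Array.isArray(res.Files)) {
    throw new Error("Unexpected response: no Files array found");
  }
  return res.Files.length;
}

// Made-up response body for illustration; only the "Files" array matters here.
const sampleResponse = JSON.stringify({
  Files: [
    { FileName: "sample-1.pdf" },
    { FileName: "sample-2.pdf" },
    { FileName: "sample-3.pdf" },
  ],
});

console.log(countSplitFiles(sampleResponse));
```

In Apps Script, the argument would be the text returned by `getContentText()` on the fetch response. Throwing on a missing array avoids logging a misleading page count when the request fails.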
Note:
- These workarounds work in my applications, but I have not been able to confirm them for all PDF files. I apologize if they do not work for your PDF files.
Workaround 3:
As another approach, pdf-lib can be used. In that case, a sample script for retrieving the number of pages of PDF data is as follows.
async function myFunction() {
  const cdnjs = "https://cdn.jsdelivr.net/npm/pdf-lib/dist/pdf-lib.min.js";
  eval(UrlFetchApp.fetch(cdnjs).getContentText()); // Load pdf-lib
  const setTimeout = function (f, t) {
    // Overwrite setTimeout with Google Apps Script.
    Utilities.sleep(t);
    return f();
  };
  const myFolder = DriveApp.getFoldersByName("Test").next();
  const files = myFolder.searchFiles('title contains ".PDF"');
  const ar = [];
  while (files.hasNext()) {
    ar.push(files.next());
  }
  for (let i = 0; i < ar.length; i++) {
    const file = ar[i];
    const pdfData = await PDFLib.PDFDocument.load(new Uint8Array(file.getBlob().getBytes()));
    const n = pdfData.getPageCount();
    console.log("fileName: %s, totalPages: %s", file.getName(), n);
  }
}
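The script above replaces `setTimeout` with a blocking version built on `Utilities.sleep`. As a minimal sketch of that idea in isolation, the following builds the same kind of stand-in from an injected sleep function so it can run outside Apps Script. The names `makeBlockingTimeout` and `sleepFn` are hypothetical. Note that, unlike a real `setTimeout`, this version runs the callback before returning and returns the callback's result instead of a timer ID.

```javascript
// Hypothetical factory: builds a blocking stand-in for setTimeout.
// sleepFn is whatever blocks for the given number of milliseconds
// (in Apps Script that would be Utilities.sleep).
function makeBlockingTimeout(sleepFn) {
  return function (callback, delayMs) {
    sleepFn(delayMs || 0);
    return callback();
  };
}

// Stand-in sleep that only records the requested delays,
// so this sketch does not actually wait.
const requestedDelays = [];
const blockingTimeout = makeBlockingTimeout(function (ms) {
  requestedDelays.push(ms);
});

const result = blockingTimeout(function () {
  return "done";
}, 10);
console.log(result, requestedDelays);
```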
Note: