How to find the title for a PDF from the metadata?

Question

How can I get the title for a PDF file after having renamed the file itself?

PSPath              : Microsoft.PowerShell.Core\FileSystem::/home/nicholas/to/99.pdf
PSParentPath        : Microsoft.PowerShell.Core\FileSystem::/home/nicholas/to
PSChildName         : 99.pdf
PSDrive             : /
PSProvider          : Microsoft.PowerShell.Core\FileSystem
PSIsContainer       : False
Mode                : -----
ModeWithoutHardLink : -----
VersionInfo         : File:             /home/nicholas/to/99.pdf
                      InternalName:     
                      OriginalFilename: 
                      FileVersion:      
                      FileDescription:  
                      Product:          
                      ProductVersion:   
                      Debug:            False
                      Patched:          False
                      PreRelease:       False
                      PrivateBuild:     False
                      SpecialBuild:     False
                      Language:         
                      
BaseName            : 99
Target              : 
LinkType            : 
Length              : 592483
DirectoryName       : /home/nicholas/to
Directory           : /home/nicholas/to
IsReadOnly          : False
FullName            : /home/nicholas/to/99.pdf
Extension           : .pdf
Name                : 99.pdf
Exists              : True
CreationTime        : 2/19/2021 11:45:18 PM
CreationTimeUtc     : 2/20/2021 7:45:18 AM
LastAccessTime      : 2/20/2021 2:02:36 AM
LastAccessTimeUtc   : 2/20/2021 10:02:36 AM
LastWriteTime       : 2/19/2021 11:45:18 PM
LastWriteTimeUtc    : 2/20/2021 7:45:18 AM
Attributes          : Normal


PS /home/nicholas/to> 
PS /home/nicholas/to> Get-ChildItem -Path ./ -File | Select-Object -Property *

The goal is to bulk-import PDF files into calibre, which notably seems to recognize duplicates and even displays some titles. Is calibre parsing the PDF content itself, or reading the titles from the metadata?

Are you trying to get the filename? For example: book.pdf => PDF Title = book? – Feb 20 '21 at 11:26

score 1 · Answer 1 · answered Feb 21 '21 at 15:17

For this, you can use pdfinfo.exe, which is part of the free Xpdf command-line tools.

After you have downloaded and extracted the zip file, copy pdfinfo.exe to a directory of your choice and make sure you unblock it, either by right-clicking it or by using PowerShell:

Unblock-File -Path 'Where\Ever\You\Have\Copied\It\To\pdfinfo.exe'

With that in place, you can read the original title as stored in the PDF like this:

$title = ((& 'D:\Test\pdfinfo.exe' 'D:\Test\test.pdf' | 
    Where-Object { $_ -match '^Title:' }) -split ':', 2)[-1].Trim()
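
Since the question is about bulk processing, the one-liner above could be wrapped into a pair of small functions. This is only a sketch under a few assumptions: the function names Get-TitleFromPdfInfo and Get-PdfTitle and the -PdfInfoPath parameter are made up for illustration, and a pdfinfo executable is assumed to be reachable (on Linux it usually comes with the poppler-utils or xpdf packages, so the default below is just pdfinfo on the PATH).

```powershell
# Hypothetical helper: extract the Title field from pdfinfo's text output.
# Returns $null when the output contains no 'Title:' line.
function Get-TitleFromPdfInfo {
    param([string[]]$InfoLines)

    $line = $InfoLines | Where-Object { $_ -match '^Title:' } | Select-Object -First 1
    if ($null -eq $line) { return $null }

    # Split on the first colon only, so titles that contain ':' stay intact.
    return ($line -split ':', 2)[-1].Trim()
}

# Hypothetical wrapper: list file name and stored title for every PDF in a folder.
# Assumption: $PdfInfoPath points at a working pdfinfo binary.
function Get-PdfTitle {
    param(
        [string]$Directory   = '.',
        [string]$PdfInfoPath = 'pdfinfo'
    )

    Get-ChildItem -Path $Directory -Filter *.pdf -File | ForEach-Object {
        [pscustomobject]@{
            File  = $_.Name
            Title = Get-TitleFromPdfInfo (& $PdfInfoPath $_.FullName 2>$null)
        }
    }
}
```

Something like Get-PdfTitle -Directory /home/nicholas/to should then give one row per file. Keeping the parsing separate from the call to pdfinfo makes it possible to check the parsing against a few sample lines without having any PDF at hand.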

I'm on Linux, but this seems like the right approach – Nicholas Saunders, Feb 21 '21 at 22:53
