
I am trying to combine each PDF (e.g. IMG_2.pdf) with all the PDFs located in a subfolder that has the same name as the PDF plus a padding character (IMG_2.pdfX).

A subfolder can itself contain further subfolders (like 22.pdfX), each named after a PDF in that subfolder (22.pdf) and holding additional PDFs (like IMGk.pdf).

C:.
│   IMG_2.pdf
├───IMG_2.pdfX
│   │   22.pdf
│   └───22.pdfX
│           IMGk.pdf

This shows the tree for one PDF and its subfolder; there are hundreds of PDFs with this structure.

The output should be one PDF for IMG_2.pdf that combines all the other PDFs, either into it or into a new file, and likewise for every other PDF in the source directory.

I tried pdftk but got stuck. Any solution on Linux or Windows is welcome; Windows is preferable.

  • Please provide a [mcve] of the code you have written to perform the task you've laid out in your question, but which is failing to do so. Alongside it, please provide any error codes and other debugging information we can use to assist you. – Compo Mar 16 '21 at 22:20
  • You can't do this with a batch file alone. You will have to use a script of some kind. What do you know? Python? Perl? Awk? – Tim Roberts Mar 16 '21 at 22:35

1 Answer


To answer my own question after many trials and tribulations:

@echo off
rem start cmd with /U if UTF-16 output is needed
setlocal EnableDelayedExpansion EnableExtensions
set "SourceParentDir=C:\Export"
set "back=%cd%"

set "files="
set "originalPDF="

rem loop through the subdirectories of the source folder
for /d %%i in ("%SourceParentDir%\*") do (

    rem full path of the master pdf: folder path minus its last character, plus .pdf
    rem (if the folder name already contains .pdf before the padding, drop the appended .pdf)
    set "originalPDFpath=%%~fi"
    set "originalPDFpath=!originalPDFpath:~0,-1!.pdf"

    rem file name of the master pdf, built the same way
    set "originalPDF=%%~nxi"
    set "originalPDF=!originalPDF:~0,-1!.pdf"

    rem enter the subdirectory so that for /R starts from it
    cd /d "%%~fi"

    rem collect every pdf below it, including nested subfolders
    for /R %%a in (*.pdf) do (
        set "files=!files! "%%~fa""
    )

    rem combine the master pdf with the collected files, then reset the list
    pdftk "!originalPDFpath!" !files! cat output "c:\test\!originalPDF!"
    set "files="
)
cd /d "%back%"
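
Since a Linux solution was also welcome, here is a rough cross-platform sketch of the same idea in Python. It is not what I actually ran. It assumes each folder is named exactly like its master PDF plus one padding character (IMG_2.pdfX for IMG_2.pdf), that pdftk is on the PATH, and that the output folder already exists and is different from the source folder. The names `build_merge_jobs`, `run_jobs` and `pad_len` are made up for this example.

```python
import subprocess
from pathlib import Path


def build_merge_jobs(source_dir, output_dir, pad_len=1):
    """Return one pdftk argument list per master PDF found in source_dir.

    Assumption: the folder belonging to IMG_2.pdf is named IMG_2.pdf plus
    pad_len padding characters (pad_len must be at least 1).
    """
    source_dir, output_dir = Path(source_dir), Path(output_dir)
    jobs = []
    for folder in sorted(p for p in source_dir.iterdir() if p.is_dir()):
        # Strip the padding to get the master PDF that sits next to the folder.
        master = source_dir / folder.name[:-pad_len]
        if master.suffix.lower() != ".pdf" or not master.is_file():
            continue  # this folder does not belong to a PDF
        # rglob also descends into nested folders such as 22.pdfX.
        extras = sorted(p for p in folder.rglob("*.pdf") if p.is_file())
        jobs.append(
            ["pdftk", str(master), *[str(p) for p in extras],
             "cat", "output", str(output_dir / master.name)]
        )
    return jobs


def run_jobs(jobs):
    """Run each pdftk command, stopping at the first failure."""
    for args in jobs:
        subprocess.run(args, check=True)
```

Calling `run_jobs(build_merge_jobs(r"C:\Export", r"C:\test"))` would mirror the paths used in the batch file. Passing the arguments as a list avoids the quoting problems and the command-line string building that the batch version needs.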

