
I have about 200 PDF documents that are saved to a folder every day, and I have to sort them based on their content. (Every PDF document contains either the string "X_P1" or "X_P2".)

My first step is to convert the .pdf files to .txt files using Xpdf:

for /r %%i in (*.pdf) do "C:\Users\xxx\pdftotext.exe" -layout "%%i"

So I end up with 200 PDF files and 200 text files in a folder.

Looks like this:

p100.pdf
p100.txt
p101.pdf
p101.txt
...

For the next step, I thought of searching each .txt file for the string "X_P1" with FINDSTR and saving the file name (e.g. p100) in a variable. Then I would move all files whose name matches the variable to a folder.

I'm not very familiar with batch/PowerShell, so how can I work with the result from FINDSTR? I thought of maybe using the error level: if I get ERRORLEVEL 0, move the files to folder 1.
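The idea described above could be sketched roughly as follows. This is only an illustration under assumptions: the folder names folder1 and folder2 are hypothetical, the script is assumed to run in the folder that holds the .pdf/.txt pairs, and it relies on FINDSTR returning exit code 0 when the string is found and 1 when it is not.

```batch
@echo off
setlocal
rem Hypothetical target folders; rename them to suit.
if not exist folder1 md folder1
if not exist folder2 md folder2

rem "dir /b" builds the file list up front, and "delims=" keeps names with spaces intact.
for /f "delims=" %%t in ('dir /b *.txt') do (
  rem /c: searches for the literal string. Only the exit code is used.
  findstr /c:"X_P1" "%%t" >nul
  rem "if errorlevel 1" is true for exit code 1 or higher, meaning no match.
  if errorlevel 1 (
    move "%%~nt.pdf" folder2 >nul
    move "%%t" folder2 >nul
  ) else (
    move "%%~nt.pdf" folder1 >nul
    move "%%t" folder1 >nul
  )
)
```

Here %%~nt expands to the file name without its extension (e.g. p100), so no separate variable is needed to carry the name from the search to the move.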

Igor F.
janikoo5

1 Answer


(All pdf-documents have the string "X_P1" or "X_P2" in it)

Sort them based on their content.

Convert each .pdf to text. If the text contains "X_P1", move the PDF to xp1; if not, move it to xp2.

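@ECHO OFF
SETLOCAL ENABLEEXTENSIONS
REM begin debug
REM del "%userprofile%\Desktop\*.pdf" 2>nul
REM rd /q /s xp1 2>nul
REM rd /q /s xp2 2>nul
REM copy /y "%userprofile%\Desktop\New Folder\*.pdf" "%userprofile%\Desktop\" 1>nul
REM   end debug
REM Create the target folders and ignore the error if they already exist.
md xp1 2>nul
md xp2 2>nul
REM "delims=" keeps file names that contain spaces in one piece.
for /f "delims=" %%a in ('dir /b *.pdf') do (
  REM Extract the text of the current PDF into a scratch file.
  pdftotext.exe -raw "%%a" tmp.txt
  REM find returns 0 when the string is present and 1 or higher otherwise.
  find "X_P1" tmp.txt >nul
  if errorlevel 1 (move "%%a" xp2) else (move "%%a" xp1)
)
del tmp.txt
exit /b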
somebadhat