I am trying to rename a lot of files. I only want to change the extension from ".pdf.OCR.pdf" to ".pdf". So far I have the following code:

rem for /r myPDFfolder %%i in (*.pdf.OCR.pdf) do ren "%%i" "%%~ni.pdf"

But it does not appear to work with the extension that has multiple dots -- what am I doing wrong?

user1769925

3 Answers

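The extension is the part of the file name after the last dot. For a file named like name.pdf.OCR.pdf, %%~ni therefore expands to name.pdf.OCR, and appending .pdf just rebuilds the original name.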
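A minimal sketch of how the modifiers split such a name, assuming a made-up sample name (report.pdf.OCR.pdf is hypothetical and does not need to exist on disk):

```batch
@echo off
rem Hypothetical sample name; the file does not need to exist.
for %%i in ("report.pdf.OCR.pdf") do (
    rem %%~ni is everything before the last dot
    echo name part: %%~ni
    rem %%~xi is the last dot and everything after it
    echo extension: %%~xi
)
```

Run from a batch file, this should show report.pdf.OCR as the name part and .pdf as the extension, which is why "%%~ni.pdf" equals the old name.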

Use string replacement to strip the unneeded part:

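setlocal enableDelayedExpansion
rem dir /s /b lists the full path of every matching file before any rename happens
for /f "eol=* delims=" %%i in ('dir /s /b "r:\*.pdf.OCR.pdf"') do (
    rem %%~nxi is the file name with extension, without the path
    set "name=%%~nxi"
    rem strip ".pdf.OCR" from the name, leaving the final ".pdf"
    ren "%%i" "!name:.pdf.OCR=!"
)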

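P.S. The output of dir is parsed, instead of using for /r directly, to make the code more robust: dir produces the complete file list before any renaming starts. If a different text were stripped, a rename could change the sorting order and cause for /r to process the same file two or more times.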

wOxxOm
  • I'd use replacement `!file:.pdf.OCR.pdf=.pdf!` to fully comply with the original question... – aschipfl Aug 27 '15 at 16:01
  • That would be an overly literal interpretation. The goal was to *strip* the unneeded part. Actually, an even shorter method could be used: `set name=%%~ni` and `ren "%%i" "!name:.OCR=!"`, so the code is already longer than strictly necessary in order to be more obvious. – wOxxOm Aug 27 '15 at 16:35
  • ...okay, I agree... anyway, you should place `"` around the entire `set` expression to avoid trouble with special characters like `^` or `&` in the file names; and I recommend enumerating the directory with `dir /S /B /A:-D` and parsing its output with `for /F` rather than using `for /R`, to ensure the directory tree is enumerated *before* iterating through it; since you are modifying the tree in the `for /R` body, conflicts may arise (see my [post about that topic](http://stackoverflow.com/q/31975093/5047996))... – aschipfl Aug 27 '15 at 16:58
  • Thanks, I'll add quotes and use `dir` to make the code more robust for someone else's needs. However *luckily* in this particular case there was no need for `dir` because the new file name doesn't change the sorting order. – wOxxOm Aug 27 '15 at 17:12
  • @wOxxOm - Actually, the sort order is not important, especially since the list is not guaranteed to be processed in any particular order if it is on a FAT file system. However, it should still be safe because the renamed file should not match the `*.pdf.ocr.pdf` file mask. – dbenham Aug 29 '15 at 02:30
  • ...unless it's `*.pdf.ocr.pdf.ocr.pdf`, @dbenham?? – aschipfl Aug 29 '15 at 10:57

There is no need for a batch file. A moderate-length one-liner from the command prompt can do the trick.

If you know for a fact that all files that match *.pdf.ocr.pdf have this exact case: .pdf.OCR.pdf, then you can use the following from the command line:

for /r "myPDFfolder" %F in (.) do @ren "%F\*.pdf.ocr.pdf" *O&ren "%F\*.pdf.o" *f

The first rename removes the trailing .pdf, and the second removes the .OCR. The above works because *O in the target mask preserves everything in the original file name through the last occurrence of upper-case O, and *f preserves through the last occurrence of lower-case f. Note that the characters in the source mask are not case sensitive. You can read more about how this works at How does the Windows RENAME command interpret wildcards?
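The two-step rename can be tried safely in a scratch folder first. This is only a sketch: the folder name under %TEMP% and the sample file name report.pdf.OCR.pdf are made up for illustration, and the intermediate names mentioned below are what the explanation above predicts rather than captured output.

```batch
@echo off
setlocal
rem Hypothetical scratch folder and sample file, created only for this demo.
set "demo=%TEMP%\ren_demo_%RANDOM%"
mkdir "%demo%"
type nul > "%demo%\report.pdf.OCR.pdf"

rem Step 1: keep everything through the last upper-case O
ren "%demo%\*.pdf.ocr.pdf" *O
dir /b "%demo%"

rem Step 2: keep everything through the last lower-case f
ren "%demo%\*.pdf.o" *f
dir /b "%demo%"

rem Clean up the scratch folder
rmdir /s /q "%demo%"
```

If the wildcard rules behave as described, the first listing should show report.pdf.O and the second report.pdf.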

If the case of .pdf.ocr.pdf can vary, then the above will fail miserably. But there is still a one-liner that works from the command line:

for /r "myPDFfolder" %F in (*.pdf.ocr.pdf) do @for %G in ("%~nF") do @ren "%F" "%~nG"

%~nF lops off the last .pdf, and %~nG lops off the .OCR, which leaves the desired extension of .pdf.
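The effect of the two nested ~n modifiers can be checked without renaming anything. The following is a small sketch in batch-file form (hence %%F instead of %F), using a made-up sample name with a space in it:

```batch
@echo off
rem Hypothetical sample name; nothing is renamed, the values are only echoed.
for %%F in ("my report.pdf.OCR.pdf") do (
    rem first ~n drops the trailing .pdf
    echo after first ~n:  %%~nF
    rem second ~n drops .OCR, leaving the name ending in .pdf
    for %%G in ("%%~nF") do echo after second ~n: %%~nG
)
```

This should echo my report.pdf.OCR and then my report.pdf; the quotes around %%~nF keep a name containing spaces together as a single item.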

You should not have to worry about a file being renamed twice because the result after the rename will not match *.pdf.ocr.pdf unless the original file looked like *.pdf.ocr.pdf.ocr.pdf.

If you think you might want to frequently rename files with complex patterns in the future, then you should look into JREN.BAT - a regular expression renaming utility. It is pure script (hybrid JScript/batch) that runs natively on any Windows machine from XP onward. Full documentation is embedded within the script.

Assuming JREN.BAT is in a folder that is listed within your PATH, then the following simple command will work from the command line, only renaming files that match the case in the search string:

jren "(\.pdf)\.OCR\.pdf$" $1 /s /p "myPDFfolder"

If you want to ignore case when matching, but want to force the extension to be lower case, then:

jren "\.pdf\.ocr\.pdf$" ".pdf" /i /s /p "myPDFfolder"
dbenham

Alternative solution, without delayed expansion (remove ECHO to actually rename any files):

@echo off
rem iterate over all matching files:
for /F "delims=" %%A in (
  'dir /S /B /A:-D "myPDFfolder\*.pdf.OCR.pdf"'
) do (
  rem "%%~nA" removes last ".pdf"
  rem plain "for" is used for the inner loops because "for /F" would split names at spaces
  for %%B in ("%%~nA") do (
    rem "%%~nB" removes ".OCR" part
    for %%C in ("%%~nB") do (
      rem "%%~nC" removes remaining ".pdf"
      ECHO ren "%%~fA" "%%~nC.pdf"
    ) & rem next %%C
  ) & rem next %%B
) & rem next %%A

NOTE: The directory tree is enumerated by dir before for iterates through it; otherwise, some items might be skipped or renamed twice (see this post concerning that issue).

aschipfl
  • You shouldn't have to worry about files being renamed twice in this case. See [my answer](http://stackoverflow.com/a/32282125/1012053) – dbenham Aug 29 '15 at 02:31