2

I have hundreds of .xml files (for TV shows) that are named sequentially like so:

s07e01.xml
s07e02.xml
s07e03.xml
s07e04.xml

The season and number of episodes (per season) differ.

In each file there are two lines:

<ID></ID>
<EpisodeNumber></EpisodeNumber>

Is it possible to batch edit these files adding the episode number to these two elements?

Thanks.


4 Answers

1

Here is a bash script:

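#! /bin/bash

for f in *.xml ; do
    # Episode number: the text between "e" and ".xml" (s07e01.xml --> 01)
    n=${f##*/s}; n=${n#*e}; n=${n%.xml}
    echo "File $f --> episode $n" >&2
    mv -f "$f" "$f.bak"
    # "|| [[ -n $line ]]" keeps a last line that has no trailing newline
    while IFS= read -r line || [[ -n "$line" ]]; do
        # Drop an existing CR so that line endings are not doubled
        line=${line%$'\r'}
        # printf (unlike echo -e) leaves backslashes in the data untouched
        if [[ "$line" == *"<ID>"*"</ID>"* ]]; then
            printf '%s<ID>%s</ID>\r\n' "${line%%[^[:space:]]*}" "$n"
        elif [[ "$line" == *"<EpisodeNumber>"*"</EpisodeNumber>"* ]]; then
            printf '%s<EpisodeNumber>%s</EpisodeNumber>\r\n' "${line%%[^[:space:]]*}" "$n"
        else
            printf '%s\r\n' "$line"
        fi
    done < "$f.bak" >| "$f"
done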
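
If a file name carries an episode title after the number (for example `s02e01 - show one.xml`), the parameter expansions above keep the title as part of the number. A minimal sketch of stricter parsing, assuming names start with `s<digits>e<digits>` in either case; `parse_episode` is a hypothetical helper name, not part of the script above:

```shell
#!/bin/sh
# Hypothetical helper: print "season episode" for names such as
# "s07e01.xml" or "S02E01 - show one.xml".
# Prints nothing when the name does not start with s<digits>e<digits>.
parse_episode() {
    # ${1##*/} drops any directory part; sed keeps only the two digit groups
    printf '%s\n' "${1##*/}" |
        sed -n -E 's/^[sS]([0-9]+)[eE]([0-9]+).*/\1 \2/p'
}

parse_episode "s02e01 - show one.xml"   # prints: 02 01
```

Inside the loop, the result could be split with `read -r season n`, so that only the digits end up in the two elements.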
Edouard Thiel
  • Sorry, I should have mentioned I'm using Windows 7. – Heather Brown Nov 09 '14 at 15:57
  • You may install cygwin and run the script in the cygwin console. – Edouard Thiel Nov 09 '14 at 16:04
  • Google just informed me of that. Edouard a big thank you for the help. You wrote that script faster than I could explain myself. I will give it a try. – Heather Brown Nov 09 '14 at 16:08
  • If she had wanted to use bash, she would have added the bash tag. – SomethingDark Nov 09 '14 at 16:14
  • It works in the current directory. If you want to run it in subdirectories, replace `for f in *.xml` by `for f in */*.xml` – Edouard Thiel Nov 09 '14 at 16:14
  • @SomethingDark yes but bash is so handy. – Edouard Thiel Nov 09 '14 at 16:16
  • Edouard thanks again... I don't mean to push your suggestion to the side but... SomethingDark if you could convert what Edouard suggested for me, I would rather not install Cygwin. – Heather Brown Nov 09 '14 at 16:21
  • Edouard I finally figured out how to make the script work. It works as it should, but it completely changes the format of the file. There are many other elements within the file, and it is formatted a certain way. The script combines all elements into one line. – Heather Brown Nov 09 '14 at 18:46
  • @HeatherBrown ok I have fixed the script to keep the spaces and lines. – Edouard Thiel Nov 09 '14 at 19:20
  • Your fix keeps all other lines formatted perfectly but... I just noticed that your original and modified script deletes the very last line in the file. – Heather Brown Nov 09 '14 at 19:53
  • I reinstalled Cygwin from a different download site and all is perfect. Your script will save me many hours of manual editing. Thank you so much. – Heather Brown Nov 09 '14 at 22:00
  • Edouard one last thing. Sometimes the file name can have the title of the episode, e.g. s02e01 - show one, s02e01 - show two. Right now the script will input 01 - show one. Is there something I could change in the script to avoid all characters after episode number? – Heather Brown Nov 09 '14 at 22:27
  • Edouard, I wanted to thank you for opening up the world of bash scripting to me. Though you provided a solution to what I had asked, I'm gonna go with dbenham's solution. He, so to speak, jumped up to first and won the race. ;) – Heather Brown Nov 10 '14 at 14:13
1
@echo off
setlocal EnableDelayedExpansion

rem Process all .xml files
for %%f in (*.xml) do (
   rem Get season and episode in %%a and %%b
   for /F "tokens=1,2 delims=se." %%a in ("%%f") do (
      rem Get the numbers of both target lines
      set "repLines=/"
      for /F "delims=:" %%c in ('findstr /N "<ID> <EpisodeNumber>" "%%f"') do (
         set "repLines=!repLines!%%c/"
      )
      rem Initialize the (first) replacement string
      set "replace=<ID>%%a</ID>"
      rem Process the file, replace values, create new file
      (for /F "tokens=1* delims=:" %%c in ('findstr /N "^" "%%f"') do (
         rem If this is a target line
         if "!repLines:/%%c/=!" neq "!repLines!" (
            rem Do the replacement
            echo !replace!
            rem And change to next (second) replacement string
            set "replace=<EpisodeNumber>%%b</EpisodeNumber>"
         ) else (
            rem Output the line unchanged
            setlocal DisableDelayedExpansion
            set "line=%%d"
            setlocal EnableDelayedExpansion
            echo(!line!
            endlocal & endlocal
         )
      )) > "%%~Nf.tmp"
   )
)

rem Update files
del *.xml
ren *.tmp *.xml

The solution above assumes that the file contains exactly one <ID></ID> line and one <EpisodeNumber></EpisodeNumber> line, in that order. If that is not the case, a small modification is needed.

Aacini
  • In my experience, the `s` and `e` are sometimes capitals, sometimes not. Therefore I would set `delims=sSeE.` – Stephan Nov 09 '14 at 18:51
  • Thanks for jumpin' in Aacini. The script you generously offered modifies the file (changes modified date) but it does not make any changes..? – Heather Brown Nov 09 '14 at 19:28
0

A simple batch script (note that it rewrites each file so that it contains only the two elements):

@echo off

REM rename all files with matching patterns to tmp-files:
ren s??e??.xml *.tmp

REM for all tmp-files do:
for /f "tokens=*" %%f in ('dir /b *.tmp') do (
  REM get season and episode:
  for /f "tokens=1,2 delims=SsEe." %%i in ("%%~nf") do (
    REM write new xml file (path quoted in case it contains spaces):
    >"%%~dpnf.xml" echo ^<ID^>%%i^</ID^>
    >>"%%~dpnf.xml" echo ^<EpisodeNumber^>%%j^</EpisodeNumber^>
  )
)
REM delete tmp files:
del *.tmp
Stephan
0

There is a very efficient and elegant solution using REPL.BAT - a hybrid JScript/batch utility that performs a regular expression search/replace on stdin and writes the result to stdout. REPL.BAT is pure script that will run natively on any Windows machine from XP onward. Full documentation is built into the script.

I use REPL.BAT twice. First to modify the output of DIR /B, filtering out lines that don't match the name template, and also extracting the Season and Episode values. The result is processed by FOR /F. Then for each file, a second REPL.BAT modifies the actual file and writes it to a temp file. Finally, the temp file is MOVEd to the original file name. The 2nd REPL makes both replacements in one pass. The replacement value is a JScript expression that determines which value to plug in, depending on the matched tag name.

This script will process all files in the current folder:

@echo off
for /f "delims=: tokens=1,2*" %%A in (
  'dir /b /a-d s??e*.xml^|repl "^s(\d\d)e(\d\d)" "$1:$2:$&" ia'
) do (
  type "%%C"|repl "(<(ID|EpisodeNumber)>).*?(</\2>)" "$1+($2=='ID'?'%%A':'%%B')+$3" j >"%%C.new"
  move /y "%%C.new" "%%C" >nul
)

This second version will process an entire folder hierarchy. It only requires a slight modification to the DIR command and the initial REPL search string:

for /f "delims=: tokens=1,2*" %%A in (
  'dir /b /s /a-d s??e*.xml^|repl "^.*\\s(\d\d)e(\d\d)" "$1:$2:$&" ia'
) do (
  type "%%C"|repl "(<(ID|EpisodeNumber)>).*?(</\2>)" "$1+($2=='ID'?'%%A':'%%B')+$3" j >"%%C.new"
  move /y "%%C.new" "%%C" >nul
)
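
For readers taking the Cygwin route mentioned in the comments on the bash answer, the same capture-and-replace idea can be sketched with `sed` instead of REPL.BAT. This is only an illustration under assumptions: `fill_tags` is a hypothetical helper name, each element is assumed to sit on a single line, and the season and episode values are assumed to contain no `|`, `&`, or backslash characters.

```shell
#!/bin/sh
# Hypothetical helper: read XML on stdin, write it to stdout with
# <ID> set to the season ($1) and <EpisodeNumber> set to the episode ($2).
# Indentation and all other lines pass through unchanged.
fill_tags() {
    sed -E \
        -e "s|(<ID>)[^<]*(</ID>)|\1$1\2|" \
        -e "s|(<EpisodeNumber>)[^<]*(</EpisodeNumber>)|\1$2\2|"
}

# Example: fill both elements for season 07, episode 01
printf '  <ID></ID>\n  <EpisodeNumber></EpisodeNumber>\n' | fill_tags 07 01
```

As in the batch version, each opening and closing tag is captured and written back, and whatever was between them is replaced.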
dbenham
  • Elegant indeed. This does the job exactly as needed, with or without the episode name. Also, it will input the correct data if something else is present. dbenham, thank you! ;) – Heather Brown Nov 10 '14 at 14:06