I adapted this script from another thread on Stack Overflow. The script runs, but it produces incorrect output because of special characters (<, >, ", =) in the search string.

Basically, I just need to find <script src="https://d1tdp7z6w94jbb.cloudfront.net/js/jquery-3.3.1.min.js" type="text/javascript" integrity="sha256-FgpCb/KJQlLNfOu91ta32o/NMZxltwRo8QtmkMRdAu8=" crossorigin="anonymous"></script> and remove it.

setlocal EnableExtensions DisableDelayedExpansion

set "search=<script src="https://d1tdp7z6w94jbb.cloudfront.net/js/jquery-3.3.1.min.js" type="text/javascript" integrity="sha256-FgpCb/KJQlLNfOu91ta32o/NMZxltwRo8QtmkMRdAu8=" crossorigin="anonymous"></script>"
set "replace="

set "textFile=index.html"
set "rootDir=."

for %%j in ("%rootDir%\%textFile%") do (
    for /f "delims=" %%i in ('type "%%~j" ^& break ^> "%%~j"') do (
        set "line=%%i"
        setlocal EnableDelayedExpansion
        set "line=!line:%search%=%replace%!"
        >>"%%~j" echo(!line!
        endlocal
    )
)
endlocal

I have found other threads on Stack Overflow asking the same question, but I can't understand their implementations and how to apply them to this script.

aschipfl
  • This is the original script: https://stackoverflow.com/questions/41542554/batch-script-to-replace-specific-string-in-multiple-files – Dax Liniere Jul 02 '19 at 23:56
  • My tests are showing that the `<` and `>` are breaking things, not the quotes. What is the exact error you're getting? – SomethingDark Jul 03 '19 at 02:22
  • Is it really necessary to use batch for this task? There's a lot of other tools which will do it much faster and better. – montonero Jul 03 '19 at 07:48
  • Thanks @SomethingDark, really appreciate you letting me know. The problem is that the search string is showing up in the output files. I assumed this to be because of the " but thank you for clarifying. – Dax Liniere Jul 04 '19 at 14:12
  • @montonero, it's for ease of use (no host required to run the script) and future configurability without requiring re-coding. Also, I'm not a coder, but am familiar with .BAT since you had to load MS-DOS from 3.5". :) – Dax Liniere Jul 04 '19 at 14:14
  • I just realised that the major problem is the `=`-signs, that's why I deleted my (apparently wrong) [answer](https://stackoverflow.com/a/56890348); finally I found [the other question](https://stackoverflow.com/q/37724410) dealing with exactly the same problem, which I once provided a solution for... – aschipfl Jul 10 '19 at 22:47
  • Thank you again @aschipfl. I tried the method you linked (repl-str.bat) and didn't have any luck when I ran this from inside another BAT file: `repl-str.bat "index.php" "" "" "index.php"` It gives the error "cscript.exe is not recognized.." I even added the IF NOT EXIST lines which pass correctly. – Dax Liniere Jul 14 '19 at 10:38
  • Well, are you sure you copied the most recent version of the script `repl-str.bat` correctly? I cannot reproduce that, and where should `cscript.exe` come from, which is never called within my script? Could you (temporarily) change `@echo off` to `@echo on` and check what line is causing the failure? And what is the exact encoding of your input file `index.php`? – aschipfl Jul 15 '19 at 21:26

1 Answer

The Windows command processor cmd.exe is designed for executing commands and applications. It is not designed for modifying file contents, regardless of the type of file.

There are lots of script interpreters with built-in support for modifying file contents, like VBScript, JScript, PowerShell, Perl, Python, ... So it would be best to use a script interpreter other than the Windows command processor for this task, especially when the search or replace string contains characters like "<=>|, which make modifying file contents with pure Windows command processor commands a nightmare.
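To illustrate the point about other interpreters, here is a minimal Python sketch of the same removal as a plain literal string replacement. It is not part of the approach used further below. The function names `remove_snippet` and `remove_snippet_from_file` are hypothetical, and UTF-8 is only an assumed encoding for the HTML file.

```python
# The exact tag from the question. Inside a Python string literal the
# characters < > = have no special meaning, and the double quotes are
# harmless because the literal is delimited by single quotes.
SNIPPET = (
    '<script src="https://d1tdp7z6w94jbb.cloudfront.net/js/jquery-3.3.1.min.js" '
    'type="text/javascript" '
    'integrity="sha256-FgpCb/KJQlLNfOu91ta32o/NMZxltwRo8QtmkMRdAu8=" '
    'crossorigin="anonymous"></script>'
)


def remove_snippet(text, snippet=SNIPPET):
    """Return text with every literal occurrence of snippet removed."""
    return text.replace(snippet, "")


def remove_snippet_from_file(path, encoding="utf-8"):
    """Rewrite the file without the snippet; return True if it changed.

    The encoding default is an assumption, adjust it to the real file.
    """
    # newline="" keeps the file's existing line endings untouched.
    with open(path, encoding=encoding, newline="") as handle:
        original = handle.read()
    updated = remove_snippet(original)
    if updated != original:
        with open(path, "w", encoding=encoding, newline="") as handle:
            handle.write(updated)
    return updated != original
```

Calling `remove_snippet_from_file("index.html")` from the directory containing the file would apply it to the file from the question. Unlike a regular expression replace, this literal replacement leaves the surrounding whitespace and the line break in place.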

However, this task is easy to achieve with JREPL.BAT, written by Dave Benham, which is a batch file / JScript hybrid that runs a regular expression replace on a file using JScript.

@echo off
if not exist ".\index.html" goto :EOF
if not exist "%~dp0jrepl.bat" goto :EOF

call "%~dp0jrepl.bat" "[\t ]*<script src=\x22https://d1tdp7z6w94jbb.cloudfront.net/js/jquery-3.3.1.min.js\x22 type=\x22text/javascript\x22 integrity=\x22sha256-FgpCb/KJQlLNfOu91ta32o/NMZxltwRo8QtmkMRdAu8=\x22 crossorigin=\x22anonymous\x22></script>[\t ]*\r?\n?" "" /M /F ".\index.html" /O -

The batch file first checks if there is an index.html file in the current directory and immediately exits if this condition is not true; see "Where does GOTO :EOF return to?"

The batch file JREPL.BAT must be stored in the same directory as the batch file with the code above. For that reason, the batch file next checks if JREPL.BAT really exists in the directory of the batch file and exits if this condition is not true.

Next, the batch file calls JREPL.BAT to do a case-sensitive regular expression replace with the replace string being an empty string.

The search string is essentially the string that should be removed from the file.

Each " in the search string is replaced by \x22, an expression that matches the character with hexadecimal code value 22, which is the code value of the character ". This makes it possible to specify the string on the Windows command line as one argument string enclosed in double quotes.

The only characters in the main search string with a special regular expression meaning are the dots. An unescaped . matches any character, including a literal dot, so the expression still finds the string as written. For a strictly literal match, each dot could be escaped with a backslash as \. to be interpreted as a literal character by the regular expression function of JScript.
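The escaping rules can be tried out in a small sketch. It uses Python's `re` module rather than JScript, so it only approximates what JREPL.BAT does, but both engines accept `\x22` for a double quote and treat an unescaped `.` as "any character". One subtlety it shows is that the dots in the URL are regex metacharacters, so the unescaped pattern is slightly looser than a literal match. The names `FRAGMENT`, `LOOSE`, `STRICT`, and `matches` are illustrative only.

```python
import re

# Fragment of the tag from the question, as it appears in the HTML file.
FRAGMENT = 'src="https://d1tdp7z6w94jbb.cloudfront.net/js/jquery-3.3.1.min.js"'

# The same fragment written the way it is passed on the command line:
# every double quote becomes \x22, the dots are left unescaped.
LOOSE = r'src=\x22https://d1tdp7z6w94jbb.cloudfront.net/js/jquery-3.3.1.min.js\x22'

# Stricter variant: let the library escape every regex metacharacter.
STRICT = re.escape(FRAGMENT)


def matches(pattern, text):
    """Return True if the whole text matches the regular expression."""
    return re.fullmatch(pattern, text) is not None


# A lookalike in which the version dots are replaced by other characters.
LOOKALIKE = FRAGMENT.replace("3.3.1", "3x3x1")

print(matches(LOOSE, FRAGMENT))    # True
print(matches(STRICT, FRAGMENT))   # True
print(matches(LOOSE, LOOKALIKE))   # True, because . matches any character
print(matches(STRICT, LOOKALIKE))  # False
```

For this particular tag the looser match is unlikely to matter in practice, since a lookalike string differing only at the dots would rarely occur in the same file.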

The main search string also does not contain any character that has a special meaning for the Windows command processor even inside a double-quoted argument string, such as the percent sign %. Each % inside the search string would need to be escaped with one more % to be interpreted as a literal character by cmd.exe, which parses this command line before calling the other batch file with the already parsed arguments.

The search expression starts with [\t ]* to additionally remove 0 or more horizontal tabs or normal spaces to the left of the string to remove. In an HTML file, the string to remove is usually on a separate line indented with tabs or spaces, and the goal is to remove this indenting whitespace as well.

The search expression ends with [\t ]*\r?\n? to additionally remove 0 or more horizontal tabs or normal spaces to the right of the string to remove, i.e. trailing whitespace on the line, plus one carriage return if present and one line-feed if present.

So an entire line is removed from the file if the string to remove is on a separate line in the HTML file, with or without leading tabs/spaces and with or without trailing tabs/spaces. But if the string to remove is on a line with other HTML tags, just the searched string and the tabs/spaces to the left and right of it are removed from the HTML file. The JREPL.BAT option /M is used so that an entire line can be removed, rather than only the searched string within the line, which would leave an empty line behind when the script tag is on a separate line.
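The two cases described above can be reproduced in a short sketch. It uses Python's `re` module instead of JScript and a shortened, hypothetical stand-in tag (`TAG`) in place of the real one to keep the lines readable. `re.sub` operates on the whole text at once, which roughly corresponds to processing the file as a single string rather than line by line.

```python
import re

# Hypothetical short stand-in for the real script tag.
TAG = '<script src="a.js"></script>'

# Leading tabs/spaces, the tag itself, trailing tabs/spaces,
# then an optional carriage return and an optional line-feed.
PATTERN = re.compile(r'[\t ]*' + re.escape(TAG) + r'[\t ]*\r?\n?')


def strip_tag(html):
    """Remove the tag together with adjacent whitespace and one line break."""
    return PATTERN.sub('', html)


# Case 1: the tag is alone on an indented line, so the whole line disappears.
own_line = '<head>\r\n\t<script src="a.js"></script>  \r\n</head>\r\n'
print(repr(strip_tag(own_line)))   # '<head>\r\n</head>\r\n'

# Case 2: the tag shares a line with other tags, so only the tag and the
# spaces directly around it are removed; the line break is kept.
inline = '<p>x</p> <script src="a.js"></script> <p>y</p>\n'
print(repr(strip_tag(inline)))     # '<p>x</p><p>y</p>\n'
```

In the second case the spaces on both sides of the tag are consumed, so the neighbouring tags end up directly adjacent.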

To understand the commands used and how they work, open a command prompt window, execute the following commands there, and carefully read all help pages displayed for each command.

  • call /? ... explains also %~dp0 ... drive and path of argument 0 being the batch file itself.
  • echo /?
  • goto /?
  • if /?
  • jrepl.bat /?
Mofi
  • WOW @Mofi, this is really amazing. The [\t ]* part was especially clever, great thinking! JREPL.BAT looks great. (Have you seen that the help pages are more than half the filesize?! XD) I couldn't see any switches in there to recurse subdirectories. /s is standard for a few command line programs, but doesn't seem to be there in JREPL. Any thoughts on this? I have some other strings that need to be found & replaced, too, so recurse would be necessary. – Dax Liniere Jul 04 '19 at 22:15