
I have some data in a text file (test.txt), reading:

wantedunwanteddata

I want to remove the "unwanted" part of that string and write the rest (i.e. "wanteddata") to another file (test2.txt). I'm using:

findstr /v "unwanted" test.txt>test2.txt

However, that returns an empty file.

user3552829

1 Answer


findstr /v "unwanted" test.txt>test2.txt doesn't work because findstr matches whole lines, not substrings: it returns every line that satisfies the condition, never just the matching part. With /v you are asking for all lines in test.txt that do not contain "unwanted". The only line in the file does contain it, so no line qualifies and test2.txt ends up empty.
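To see the line-based behaviour in isolation, here is a minimal sketch. The file demo.txt and its second line are made up for the demonstration and are not part of the question.

```batch
@echo off
rem Build a hypothetical two-line demo file.
(
    echo wantedunwanteddata
    echo otherdata
) > demo.txt

rem /v keeps only the lines that do NOT contain the search string,
rem so this prints just: otherdata
findstr /v "unwanted" demo.txt

del demo.txt
```

The first line is dropped as a whole; findstr never prints part of a line.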

In batch, you can replace every occurrence of a substring in a variable's value with the syntax %var:substr=repl%, which replaces each occurrence of substr in the value of var with repl. Removing a substring is the same as replacing it with an empty string, so %var:substr=% removes every occurrence of substr.
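As a small illustration of that syntax on the string from the question (the variable name var is only a placeholder):

```batch
@echo off
set "var=wantedunwanteddata"

rem Remove the substring: prints wanteddata
echo %var:unwanted=%

rem Replace it with something else: prints wanted_data
echo %var:unwanted=_%
```

Note that the search part of this substitution is not case-sensitive, so "UNWANTED" would be removed as well.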

If you want to remove every occurrence of a substring from a file, you can read each line of the file into a variable with for /f and print that variable after removing the substring from it. Because the variable is both set and used inside the same for /f block, delayed expansion is needed (this answer explains why).

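@echo off
SetLocal EnableDelayedExpansion

rem Input and output paths, stored without quotes; quotes are added on use.
set "input=text1.txt"
set "output=text2.txt"
rem Substring to remove from every line.
set "substr=unwanted"

(
    FOR /F "usebackq delims=" %%G IN ("%input%") DO (
        set "line=%%G"
        rem Print the line with every occurrence of the substring removed.
        echo(!line:%substr%=!
    )
) > "%output%"

EndLocal
exit /b 0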

I've stored the paths to your input file (text1.txt) and your output file (text2.txt) in the variables input and output, without surrounding quotes (quotes are added where the variables are used). That makes them easier to change if needed.
The extra (..) surrounding the for /f is only there so that a single redirection sends all of its output to the output file.
If you don't want to use delayed expansion, omit SetLocal EnableDelayedExpansion and EndLocal and replace the echo line inside the for /f with call echo %%line:%substr%=%%.
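The reason delayed expansion matters inside the block can be shown with a short sketch. The loop values one and two and the initial value before are arbitrary and only serve the demonstration.

```batch
@echo off
setlocal EnableDelayedExpansion
set "line=before"

for %%G in (one two) do (
    set "line=%%G"
    rem The percent form was expanded once, when the whole block was parsed.
    rem The exclamation form is expanded each time the echo actually runs.
    echo percent: %line%   delayed: !line!
)

endlocal
```

Both iterations print "before" for the percent form, while the delayed form prints "one" and then "two".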

EDIT: If your input file contains special characters like <>()|&%, you must use delayed expansion. With the normal variable expansion used in call echo %%line:%substr%=%%, the cmd interpreter gives those characters their special meaning (< and > as input or output redirection, for example), which produces unexpected results.
Also, I've surrounded the assignment of the substr variable with double quotes, but if the substring you're trying to remove contains special characters like <>()|&%, each of them must also be escaped for %substr% to work as expected. You can escape a special character with a caret (^), except for %, which must be doubled (%% instead of %).

EDIT2: for /f skips blank lines, so keeping them in the output file requires a workaround. A common hack in pure batch is to use findstr /n or find /n to prefix each line (including the empty ones) with its line number while feeding the input file to the for /f. The prefix then has to be stripped again inside the for /f block before the line is written out, which takes some extra processing, but it is possible. This answer to a similar question provides an excellent explanation of those workarounds and their drawbacks.
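A minimal sketch of the findstr /n variant, assuming the same input, output and substr variables as in the script above. It is one common form of the hack, not the only one, and it shares the usual drawback that lines containing ! are altered while delayed expansion is enabled.

```batch
@echo off
setlocal EnableDelayedExpansion

set "input=text1.txt"
set "output=text2.txt"
set "substr=unwanted"

(
    rem findstr /n "^" matches every line, blank ones included,
    rem and prefixes each with its line number and a colon.
    for /f "delims=" %%G in ('findstr /n "^" "%input%"') do (
        set "line=%%G"
        rem Drop everything up to and including the first colon.
        set "line=!line:*:=!"
        if defined line (
            echo(!line:%substr%=!
        ) else (
            rem The original line was blank, so write a blank line.
            echo(
        )
    )
) > "%output%"

endlocal
```

The if defined check is there because a blank input line leaves line undefined after the prefix is stripped.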

J.Baoby
  • Thanks, @J.Baoby! This works perfectly. One more question: what if my input contained html tags, so like: datadata (all on a single line) and I wanted to remove the data bit? I've tried the above (replacing the substr with ^data^) but it returns an empty file. – user3552829 Feb 14 '17 at 13:09
  • It worked on my side on a Windows 7 machine by replacing `set substr=unwanted` with `set "substr=^data^"` (if possible, keep using the variable `substr` instead of directly replacing `substr` inside the `for /f`). Because of the special characters `<>`, the assignment of the `substr` variable must be surrounded with double quotes. Besides that, if your input file contains special characters, you'll have to use delayed expansion. You've helped me remember a little detail I forgot: the `EndLocal` at the end of the script (even though its omission should not be a problem). – J.Baoby Feb 14 '17 at 15:13
  • I've made an edit to my post (added a remark). Hope this helps for your html-tags – J.Baoby Feb 14 '17 at 15:33
  • Works a treat! Thanks. ... as long as it is OK for blank lines to be removed. – Jesse Chisholm Jun 12 '18 at 00:46
  • @JesseChisholm Unfortunately, the `for /f` command skips the empty lines. That's how it has been designed. There exist some (hacky) workarounds using `findstr` or `find` to prepend each line (including the empty ones) with their line number and other characters before executing the `for /f` but they will require some extra processing of the lines. See [this question](https://stackoverflow.com/questions/38723595/preserve-empty-lines-in-a-text-file-while-using-batch-for-f) for example. – J.Baoby Jun 12 '18 at 06:53
  • This is exactly the functionality I was looking for except that I'm getting an extra space character appearing at the beginning of lines that remove the string from the start of line. My search string line is: set "substr=11/28/2019 " NOTE: I have a space character inside the quotes as part of the string to remove which I thought gets rid of the space after the string. But It seems to be replacing my search string with a space instead of deleting. – tzg Jul 05 '22 at 16:06