I have a script that extracts lines such as:

THIS_IS_A_LINE:=

THIS_IS_A_LINE2:=

and writes every line of that kind to a .txt file as:

THIS_IS_A_LINE

THIS_IS_A_LINE2

The script is the following:

set "file=%cd%/Config.mak"
set /a i=0
set "regexp=.*:=$"
setlocal enableDelayedExpansion
IF EXIST Source_List.txt del /F Source_List.txt
for /f "usebackq delims=" %%a in ("%file%") do (
    set /a i+=1
    call set Feature[!i!]=%%a
) 
cd .. && cd ..
rem call echo.!Feature[%i%]!
for /L %%N in (1,1,%i%) do (
    echo(!Feature[%%N]!|findstr /R /C:"%regexp%" >nul && (
        call echo FOUND
        call set /a j+=1
        call set Feature_Disabled[%j%]=!Feature[%%N]:~0,-2!
        call echo.!Feature_Disabled[%j%]!>>Source_List.txt
    ) || (
        call echo NOT FOUND 
    )  
) 
endlocal

I also have another script that extracts lines such as:

THIS_IS_ANOTHER_LINE:=true

THIS_IS_ANOTHER_LINE2:=true ...

and writes every line of that kind to a second .txt file as:

THIS_IS_ANOTHER_LINE

THIS_IS_ANOTHER_LINE2 ...

The script is the following:

set "file=%cd%/Config.mak"
set /a i=0
set "regexp=.*:=true$"
setlocal enableDelayedExpansion
IF EXIST Source_List2.txt del /F Source_List2.txt
for /f "usebackq delims=" %%a in ("%file%") do (
    set /a i+=1
    call set Feature[!i!]=%%a
) 
cd .. && cd ..
rem call echo.!Feature[%i%]!
for /L %%N in (1,1,%i%) do (
    echo(!Feature[%%N]!|findstr /R /C:"%regexp%" >nul && (
        call echo FOUND
        call set /a j+=1
        call set Feature_Disabled[%j%]=!Feature[%%N]:~0,-6!
        call echo.!Feature_Disabled[%j%]!>>Source_List2.txt
    ) || (
        call echo NOT FOUND 
    )  
) 
endlocal

However, there is a third kind of line, which carries a numeric value (some of them hexadecimal) or other text, such as:

THIS_IS_AN_UNPROCESSED_LINE:=0xA303

THIS_IS_AN_UNPROCESSED_LINE2:=1943

THIS_IS_AN_UNPROCESSED_LINE3:=HELLO_DOOD_CAN_YOU_PARSE_ME?

I also need a way to extract those lines, unchanged, into another .txt file:

THIS_IS_AN_UNPROCESSED_LINE:=0xA303

THIS_IS_AN_UNPROCESSED_LINE2:=1943

THIS_IS_AN_UNPROCESSED_LINE3:=HELLO_DOOD_CAN_YOU_PARSE_ME?

Basically, I want to extract the lines that are not of the kind:

THIS_IS_AN_UNPROCESSED_LINE:=

or

THIS_IS_AN_UNPROCESSED_LINE:=true

while keeping both sides of each entry.

I know there must be some trick with the regular expression, but I just can't figure it out.

Jean-François Fabre
Jackson

1 Answer

You have made your code much more complicated than it needs to be. There is no need to create an array of every line in the file.

If no other : or = appears before the first := on a line, you can use FINDSTR to print every line that starts with a name followed by :=. FOR /F can then capture each matching line and split it into the parts before and after :=, and IF statements can sort the lines into the three types.

I use n> to open all three output files once, outside the main code block, for improved performance, and then I use the >&n syntax to direct each output to the appropriate, already opened file. I use high-numbered file handles to avoid the problems described at "Why doesn't my stderr redirection end after command finishes? And how do I fix it?"
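The redirection pattern can be seen in isolation in a minimal sketch. The file name demo.txt and the loop words are made up for illustration, and handle 9 is an arbitrary pick among the normally unused handles 3 to 9.

```batch
@echo off
setlocal
rem Illustration only: demo.txt and the loop values are hypothetical.
rem The file is opened once on handle 9 for the whole parenthesized block.
rem Inside the block, each echo is redirected to that already opened handle.
9>demo.txt (
  for %%W in (one two three) do (
    >&9 (echo %%W)
  )
)
rem The file is closed when the block ends, so it can be read back here.
type demo.txt
```

After this runs, demo.txt should contain the three words, one per line. The full script that follows applies the same pattern with three handles at once.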

@echo off
setlocal
set "file=Config.mak"
set /a "empty=7, true=8, unprocessed=9"
%empty%>empty.txt %true%>true.txt %unprocessed%>unprocessed.txt (
  for /f "delims=:= tokens=1*" %%A in ('findstr /r "^[^:=][^:=]*:=" "%file%"') do (
    if "%%B" equ "" (
      >&%empty% (echo %%A)
    ) else if "%%B" equ "true" (
      >&%true% (echo %%A)
    ) else (
      >&%unprocessed% (echo %%A:=%%B)
    )
  )
)

The above will ignore lines that contain : or = before :=, and it will not work properly if the first character after := is : or =. I'm assuming that should not be a problem.
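Both limitations come from how FOR /F treats delimiters: : and = are independent delimiter characters, and a run of consecutive delimiters counts as a single split point. A small sketch with made-up sample strings (not taken from Config.mak) makes this visible:

```batch
@echo off
rem Illustration only: the three sample strings are hypothetical.
rem The outer FOR just iterates over the quoted samples. The inner FOR /F
rem parses each one with the same options as the script above.
for %%S in ("NAME:=0xA303" "NAME:==x" "NAME:=") do (
  for /f "delims=:= tokens=1*" %%A in ("%%~S") do echo name=[%%A] value=[%%B]
)
```

The first sample splits cleanly into NAME and 0xA303. In the second, the = that begins the value is swallowed as part of the delimiter run, so the value comes back as x; this is the failure mode described above. The third yields an empty value, which corresponds to the first IF branch of the script.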

It should be relatively easy to write a very efficient solution using PowerShell, VBScript, or JScript that eliminates the limitations.

You could also use JREPL.BAT, a powerful and efficient regular expression text processing command line utility. JREPL.BAT is pure script (hybrid batch/JScript) that runs natively on any Windows machine from XP onward; no 3rd party exe is required. JREPL is also much faster than any pure batch solution, especially if the files are large.

Here is one JREPL solution:

@echo off
setlocal
set repl=^
 $txt=false;^
 if ($2=='') stdout.WriteLine($1);^
 else if ($2=='true') stderr.WriteLine($1);^
 else $txt=$0;

call jrepl "^(.+):=(.*)$" "%repl%" /jmatchq^
     /f Config.mak /o unprocessed.txt >empty.txt 2>true.txt

If all you have to do is classify the lines into three different files, without stripping the := and :=true parts from the empty and true lines, then there is a very simple pure batch solution that uses nothing but FINDSTR:

@echo off
set "file=Config.mak"
findstr /r ".:=$" "%file%" >empty.txt
findstr /r ".:=true$" "%file%" >true.txt
findstr /r ".:=" "%file%" | findstr /r /v ":=$ :=true$" >unprocessed.txt
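The last command relies on a FINDSTR detail that is easy to miss: without /C:, a space inside the search string separates alternative patterns, so ":=$ :=true$" with /V excludes every line that matches either one. A minimal sketch, assuming a made-up three-line sample file written to %temp%:

```batch
@echo off
setlocal
rem Illustration only: the sample file name and its contents are hypothetical.
set "sample=%temp%\findstr_demo.txt"
(
  echo A:=
  echo B:=true
  echo C:=0xA303
) >"%sample%"
rem /V inverts the match, so only lines matching neither pattern are printed.
findstr /r /v ":=$ :=true$" "%sample%"
del "%sample%"
```

Of the three sample lines, only C:=0xA303 should be printed, since the other two end in := or :=true.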
dbenham