I have a text file that is one long string like this:

ISA*00*GARBAGE~ST*TEST*TEST~CLP*TEST~ST*TEST*TEST~CLP*TEST~ST*TEST*TEST~CLP*TEST~GE*GARBAGE*~   

And I need it to look like this:

~ST*TEST*TEST~CLP*TEST
~ST*TEST*TEST~CLP*TEST
~ST*TEST*TEST~CLP*TEST

I first tried to add a line at every ~ST to split the string up, but I can't for the life of me make this happen. I have tried various scripts, but I thought a find/replace script would work best.

@echo off
setlocal enabledelayedexpansion
set INTEXTFILE=test.txt
set OUTTEXTFILE=test_out.txt
set SEARCHTEXT=~ST
set REPLACETEXT=~ST

for /f "tokens=1,* delims=~" %%A in ( '"type %INTEXTFILE%"') do (
    SET string=%%A
    SET modified=!string:%SEARCHTEXT%=%REPLACETEXT%!

    echo !modified! >> %OUTTEXTFILE%
)
del %INTEXTFILE%
rename %OUTTEXTFILE% %INTEXTFILE%

I found this script here: "How to replace substrings in windows batch file".

But I'm stuck because (1) the special character ~ makes the code not work at all. It gives me this result:

string:~ST=~ST

If I put quotes around "~ST", the code does nothing at all. And (2) I can't figure out how to add a line break before ~ST.

The final task for this would be to delete the ISA*00*blahblahblah and ~GE*blahblahblah lines after all splits have been performed. But I am stuck on the splitting at ~ST part.

Any suggestions?

AnA
  • There is no easy way to replace a tilde with batch. JREPL (batch/JScript hybrid) is a good solution – jeb Dec 07 '15 at 10:01
  • What are the criteria to identify the initial and final parts that should be removed? Is the initial part just everything before the first occurrence of `~ST`, and the final part `~GE` and everything after? And what size is your input text file? – aschipfl Dec 07 '15 at 10:42

3 Answers

@echo off
setlocal EnableDelayedExpansion

rem Set the next variable to at least the number of "~" chars that delimit the wanted fields
set "maxTokens=7"
rem Define the tokens that start a new field
set "delims=/ST/GE/"

for /F "delims=" %%a in (test.txt) do (
   set "line=%%a"
   set "field="
   rem Process up to maxTokens per line;
   rem this is a trick to avoid calling a subroutine that has a goto loop
   for /L %%i in (0,1,%maxTokens%) do if defined line (
      for /F "tokens=1* delims=~" %%b in ("!line!") do (
         rem Get the first token in the line separated by "~" delimiter
         set "token=%%b"
         rem ... and update the rest of the line
         set "line=%%c"
         rem Get the first two chars after "~" token like "ST", "CL" or "GE";
         rem                            if they are "ST" or "GE":
         for %%d in ("!token:~0,2!") do if "!delims:/%%~d/=!" neq "%delims%" (
            rem Start a new field: show previous one, if any
            if defined field echo !field!
            if "%%~d" equ "ST" (
               set "field=~%%b"
            ) else (
               rem It is "GE": cancel rest of line
               set "line="
            )
         ) else (
            rem It is "CL" token: join it to current field, if any
            if defined field set "field=!field!~%%b"
         )
      )
   )
)

Input:

ISA*00*GARBAGE~ST*TEST1*TEST1~CLP*TEST1~ST*TEST2*TEST2~CLP*TEST2~ST*TEST3*TEST3~CLP*TEST3~GE*GARBAGE*~CLP~TESTX

Output:

~ST*TEST1*TEST1~CLP*TEST1
~ST*TEST2*TEST2~CLP*TEST2
~ST*TEST3*TEST3~CLP*TEST3
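
The script above relies on two small idioms that may be easier to follow in isolation. The following is only a minimal sketch with made-up sample data (the value of `line` is an assumption for illustration): it peels the first "~"-separated token off a string with `for /F "tokens=1* delims=~"`, and it tests whether a two-letter code is listed in `delims` by checking whether removing `/XX/` changes the list.

```batch
@echo off
setlocal EnableDelayedExpansion
set "delims=/ST/GE/"
rem Sample data, assumed for illustration only
set "line=ISA*00*GARBAGE~ST*TEST1~CLP*TEST1"

rem Split off the first "~"-separated token; %%c receives the untouched remainder
for /F "tokens=1* delims=~" %%b in ("!line!") do (
   echo first=%%b
   echo rest=%%c
)

rem Membership test: removing "/XX/" changes the list only if XX is in it
for %%d in (ST CL) do (
   if "!delims:/%%d/=!" neq "%delims%" (echo %%d starts a new field) else (echo %%d is joined to the current field)
)
```

With this sample data the first loop should print `first=ISA*00*GARBAGE` and `rest=ST*TEST1~CLP*TEST1`, and the second loop should report that ST starts a new field while CL is joined to the current one.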
Aacini
  • Someone told me, that a good answer should contain also a good explanation. :-) – jeb Dec 07 '15 at 19:07

The ~ cannot be used as the first character of a search string in the substring substitution syntax %VARIABLE:SEARCH_STRING=REPLACE_STRING%, because a leading ~ marks the substring expansion syntax %VARIABLE:~POSITION,LENGTH% (type set /? for more information).
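
As a rough illustration of that rule, the sketch below uses an assumed sample value for `DATA`. It shows what a leading ~ actually selects (substring expansion) and how a leading * lets a search string that contains ~ be used anyway.

```batch
@echo off
setlocal EnableDelayedExpansion
rem Sample data, assumed for illustration only
set "DATA=ISA*00*GARBAGE~ST*TEST~CLP*TEST"

rem A leading "~" selects substring expansion: the first 3 characters
echo !DATA:~0,3!

rem With "*" in front, "~ST" is treated as a search string:
rem everything up to and including the first "~ST" is removed
echo !DATA:*~ST=!
```

This should print `ISA` followed by `*TEST~CLP*TEST`. A plain `!DATA:~ST=#!` is deliberately left out, since it is the form that fails in the question.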

Assuming your text file contains only a single line of text and does not exceed about 8 kBytes, the following script should accomplish your task. It uses the substring substitution syntax %VARIABLE:*SEARCH_STRING=REPLACE_STRING%; the leading * matches everything up to and including the first occurrence of SEARCH_STRING:

@echo off
setlocal EnableExtensions EnableDelayedExpansion

rem initialise constants:
set "INFILE=test_in.txt"
set "OUTFILE=test_out.txt"
set "SEARCH=ST"
set "TAIL=GE"

rem read single-line file content into variable:
< "%INFILE%" set /P "DATA="
rem remove everything before first `~%SEARCH%`:
set "DATA=~%SEARCH%!DATA:*~%SEARCH%=!"

rem call sub-routine, redirect its output:
> "%OUTFILE%" call :LOOP

endlocal
goto :EOF

:LOOP
rem extract portion right to first `~%SEARCH%`:
set "RIGHT=!DATA:*~%SEARCH%=!"
rem skip rest if no match found:
if "!RIGHT!"=="!DATA!" goto :TAIL
rem extract portion left to first `~%SEARCH%`, including `~`:
set "LEFT=!DATA:%SEARCH%%RIGHT%=!"
rem the last character must be a `~`;
rem so remove it; `echo` outputs a trailing line-break;
rem the `if` avoids an empty line at the beginning;
rem the unwanted part at the beginning is removed implicitly:
if not "!LEFT:~,-1!"=="" echo(!LEFT:~,-1!
rem output `~%SEARCH%` without trailing line-break:
< nul set /P "DUMMY=~%SEARCH%"
rem store remainder for next iteration:
set "DATA=!RIGHT!"
rem loop back if remainder is not empty:
if not "!DATA!"=="" goto :LOOP
:TAIL
rem this section removes the part starting at `~%TAIL%`:
set "RIGHT=!DATA:*~%TAIL%=!"
if "!RIGHT!"=="!DATA!" goto :EOF
set "LEFT=!DATA:%TAIL%%RIGHT%=!"
rem output part before `~%TAIL%` without trailing line-break:
< nul set /P "DUMMY=!LEFT:~,-1!"
goto :EOF
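
One pass of the loop can be traced with a short, self-contained sketch. The value of `DATA` below is assumed sample data, already stripped of its leading `~ST` as it would be after the first iteration; the variable names mirror the script above.

```batch
@echo off
setlocal EnableDelayedExpansion
rem Sample data, assumed for illustration only
set "DATA=*A~CLP*A~ST*B~CLP*B"

rem Everything after the first "~ST"
set "RIGHT=!DATA:*~ST=!"
rem Remove "ST" plus that remainder, leaving the left part with a trailing "~"
set "LEFT=!DATA:ST%RIGHT%=!"

rem Drop the trailing "~" and print the left part on its own line
echo(!LEFT:~0,-1!
rem Print "~ST" without a line break, then the remainder on the same line
< nul set /P "DUMMY=~ST"
echo(!RIGHT!
```

This should print `*A~CLP*A` on the first line and `~ST*B~CLP*B` on the second. Note that `%RIGHT%` is expanded when the line is parsed, so this pattern only works outside a parenthesized block, which is why the script loops with `goto` rather than `for`.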

The following restrictions apply to this approach:

  • the input file contains a single line;
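  • the size of the input file does not exceed about 8 kBytes (set /P is commonly reported to read only about 1023 characters per line, so the practical limit may be lower);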
  • there is exactly one instance of ~GE, which occurs after all instances of ~ST;
  • there is always at least one character in between two adjacent ~ST instances;
  • the file contains no special characters such as SPACE, TAB, ", %, !, =.
aschipfl

Don't reinvent the wheel, use a regexp replace tool such as sed or JREPL.BAT:

call jrepl "^.*?~ST(.+?)~GE.*$" "'~ST'+$1.replace(/~ST/g,'\r\n$&')" /jmatch <in.txt >out.txt
wOxxOm