I have a txt file with this format:
some text
another uninteresting line
// some more lines can come here
[
actually interesting
// this is the stuff I want
]

I want to be able to get everything between the square brackets [] (including the brackets themselves).

(Since I know there is no text after the closing bracket, it is enough to delete the lines before the [ character.)
I'm pretty sure I can do it with findstr, but I'm not sure exactly how.

J. Ed
  • running a loop on the chars and filling a buffer each time `[` is encountered, exporting it on `]`, is the most flexible approach – aelgoa Mar 31 '14 at 08:04
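
A rough sketch of the character loop described in the comment above, written in Python rather than batch purely for illustration. The function name `extract_bracketed` is made up for this example, and it assumes brackets are not nested (a new `[` simply restarts the buffer).

```python
def extract_bracketed(text):
    """Return every [...] span found in text, brackets included."""
    blocks = []
    buffer = None  # None means we are currently outside any bracket pair
    for ch in text:
        if ch == "[":
            # Start (or restart) the buffer each time '[' is encountered.
            buffer = ["["]
        elif buffer is not None:
            buffer.append(ch)
            if ch == "]":
                # Export the buffer on ']' and go back to skipping.
                blocks.append("".join(buffer))
                buffer = None
    return blocks
```

Calling it on the whole file contents, for example `extract_bracketed(open("file.txt").read())`, would return a list with one string per bracketed block.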

4 Answers


You can use VBScript. Save the following as extract.vbs

flag=0
Do While Not WScript.StdIn.AtEndOfStream
   Line = WScript.StdIn.ReadLine()
   ' Start printing at the first line that begins with "["
   If Left(Line,1)="[" Then flag=1
   If flag=1 Then
      WScript.StdOut.WriteLine(Line)
   End If
Loop

Then run

CSCRIPT /NOLOGO EXTRACT.VBS < YOURFILE

The script sets a flag to zero, then reads the input one line at a time until the end. When it encounters a line starting with "[" it sets flag=1, and from then on it prints every line it reads, including that one.

If you want to save the lines it finds, in a new file, run it like this:

CSCRIPT /NOLOGO EXTRACT.VBS < YOURFILE > NEWFILE
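
For readers who find the flag logic easier to follow outside VBScript, here is the same idea as a small Python sketch. It is not part of the script above; `lines_from_first_bracket` is a hypothetical name, and it assumes, like the VBScript, that the opening `[` is the first character on its line.

```python
def lines_from_first_bracket(lines):
    """Yield every line from the first one that begins with '[' onward."""
    printing = False  # plays the role of the flag variable
    for line in lines:
        if line.startswith("["):
            printing = True
        if printing:
            yield line
```

Passing an open file object works because iterating a file yields its lines, so something like `sys.stdout.writelines(lines_from_first_bracket(open("file.txt")))` should mirror what the CSCRIPT command does.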
Mark Setchell

FINDSTR cannot solve this on its own.

Given your situation that you can simply delete all lines before the line that starts with [, all you need is the following native batch script.

@echo off
setlocal
for /f "delims=:" %%N in ('findstr /n [ "file.txt"') do if not defined N set /a N=%%N-1
set "skip="
if %N% gtr 0 set "skip=skip=%N%"
(for /f "usebackq %skip% delims=" %%A in ("file.txt") do echo %%A) >"newFile.txt"

If you know that your file does not contain tabs, or if it is OK that tabs are converted to a string of spaces, then it is even easier:

@echo off
setlocal
for /f "delims=:" %%N in ('findstr /n [ "file.txt"') do if not defined N set /a N=%%N-1
more +%N% "file.txt" >"newFile.txt"

The solution is a one-liner if you use REPL.BAT - a hybrid JScript/batch utility that performs regular expression search and replace on stdin and writes the result to stdout. It is pure script that will run natively on any modern Windows machine from XP onward.

Assuming that [ only appears once, then:

type "file.txt" | repl "[^[]*\[" "[" m >"newFile.txt"

It is also simple to support multiple blocks between square brackets, where the [ and/or ] could be in the middle of a line:

type "file.txt" | repl "[^[]*(\[[\s\S]*?\])[^[]*" "$1\r\n" mx >"newFile.txt"
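
The heart of that pattern is the lazy group `\[[\s\S]*?\]`, which matches from a `[` to the nearest following `]`, across line breaks. As an illustration only, and in Python rather than REPL.BAT, the same group can be tried out like this (`bracket_blocks` is a hypothetical helper name, and nested brackets are assumed not to occur):

```python
import re

# Lazy match from '[' to the nearest ']'; [\s\S] also matches newlines.
BLOCK = re.compile(r"\[[\s\S]*?\]")

def bracket_blocks(text):
    """Return all bracketed blocks in text, brackets included."""
    return BLOCK.findall(text)
```

Joining the result with a line separator, e.g. `"\n".join(bracket_blocks(text))`, gives roughly what the `"$1\r\n"` replacement produces.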
dbenham
@echo off
    setlocal enableextensions disabledelayedexpansion

    set "dataFile=data.txt"

rem search the starting line
    set "startLine="
    for /f "tokens=1 delims=:" %%a in (
        'findstr /l /b /n /c:"[" "%dataFile%"'
    ) do if not defined startLine set "startLine=%%a"

rem remove all lines before the starting one    
    if defined startLine for /f "tokens=1,* delims=:" %%a in (
        'findstr /n "^" "%dataFile%" ^& break ^> "%dataFile%"'
    ) do if %%a geq %startLine% >>"%dataFile%" echo(%%b

    endlocal
MC ND

If you install some tools from a proper Operating System (Unix/Linux) you can do it without any code:

grep -A 999 \[ yourfile

That tells grep to look for a [ character in yourfile and print the matching line plus up to 999 lines after it (-A). Unix Utils are available for free here.

Mark Setchell