I have a CSV file with the following data in the first column:

B10114028000D5  0S 
C1                                  00000
D1 0000023426600   000
E1   0000000000 
F1       
G1     
B10119628000D5  0S2
C1                                  00000
D1 000000000000 
E1   0000000000  
F1

As the pattern shows, each group of rows starts with a B1 row, followed by C1, D1, and so on up to G1.

I need to copy only selected groups into a text file. The filter applies only to the B1 row; when that row matches, the whole group must be copied.

The filter is: the part of the B1 row before the first space must be B10119628000D5. The output file should be:

B10119628000D5  0S2
C1                                  00000
D1 000000000000 
E1   0000000000  
F1

What would be a convenient .bat file for this? Please suggest.

user3219897
    Do you want to find the line that starts with a certain value (like `B10119628000D5`) and show the group of 5 lines starting at it? Might several groups start with the same value? – Aacini Apr 16 '14 at 17:53

1 Answer

Not sure if this is required, but I have developed solutions that allow for a matching group to appear multiple times within the input file. Each solution preserves all instances of the matching group.

For the code below, I assume the data is in "input.txt" and the output goes to "output.txt".

Here is a simple batch script that performs reasonably well for pure batch:

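@echo off
setlocal disableDelayedExpansion
rem "print" is defined only while the current group matches the filter.
set "print="
(for /f "delims=" %%A in (input.txt) do (
  rem While printing, a line whose text before the first 1 is B starts a new group: stop.
  if defined print for /f "delims=1" %%B in ("%%A") do if "%%B" equ "B" set "print="
  rem When not printing, start if the first space-delimited token equals the key.
  if not defined print for /f %%B in ("%%A") do if "%%B" equ "B10119628000D5" set print=1
  if defined print echo %%A
)) >output.txt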

The above may become quite slow if the file is very large.
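
If the flag handling is hard to follow, the same idea can be sketched in another language. This is only an illustration in Python, not part of the batch solution: the name `filter_groups` and its parameters are made up for this example, and it assumes every group begins with a line that starts with "B1". It mirrors the "print" flag rather than reproducing FOR /F exactly.

```python
def filter_groups(lines, key, group_start="B1"):
    """Return the lines of every group whose first token equals key.

    Hypothetical helper for illustration only; it is not a drop-in
    replacement for the batch script.
    """
    printing = False  # plays the role of the batch variable "print"
    kept = []
    for line in lines:
        if line.startswith(group_start):
            # A new group begins: decide whether this group is wanted.
            tokens = line.split()
            printing = bool(tokens) and tokens[0] == key
        if printing:
            kept.append(line)
    return kept


# Shortened sample based on the data in the question.
sample = [
    "B10114028000D5  0S ",
    "C1    00000",
    "D1 0000023426600   000",
    "B10119628000D5  0S2",
    "C1    00000",
    "D1 000000000000 ",
]
for line in filter_groups(sample, "B10119628000D5"):
    print(line)
```

Because the flag is re-evaluated at every "B1" line, a matching group that appears several times is kept every time, just as in the batch version.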

I have written a hybrid JScript/batch utility called REPL.BAT that can be used to make an even simpler solution that is quite efficient. REPL.BAT is pure script that will run natively on any modern Windows machine from XP onward. Full documentation is embedded within the script.

I use REPL.BAT to encode every newline that does not precede "B1" as "@", which turns each group of lines into a single line. FINDSTR then keeps only the desired lines (the matching groups), and a final REPL.BAT call decodes "@" back into newlines. If the data may contain "@", substitute some other character that does not occur in the data.

type input.txt|repl \n(?!B1) @ m|findstr /bc:"B10119628000D5 "|repl @ \n x >output.txt

If you can't find a character that does not exist in the data, then "@" can be protected by an additional round of encode and decode:

type input.txt|repl @ @a|repl \n(?!B1) @n m|findstr /bc:"B10119628000D5 "|repl @n \n x|repl @a @ >output.txt
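
To make the encode, filter, decode steps of the pipelines above easier to see, here is a rough sketch of the same technique in Python. It is an illustration under assumptions, not a replacement for REPL.BAT: the function name `filter_groups_joined` and the `require_space` parameter are invented for this example, Python's `re` module stands in for the JScript regular expressions, and the text is assumed to use "\n" line endings.

```python
import re


def filter_groups_joined(text, key, require_space=True):
    """Keep whole groups whose first line begins with key.

    Hypothetical helper: joins each group into one line, filters the
    joined lines, then restores the original newlines.
    """
    # Step 1: protect literal "@" so it cannot collide with the markers.
    encoded = text.replace("@", "@a")
    # Step 2: encode every newline that is not followed by "B1" as "@n",
    # so that each group becomes a single line.
    encoded = re.sub(r"\n(?!B1)", "@n", encoded)
    # Step 3: keep only the joined lines that begin with the key.
    prefix = key + " " if require_space else key
    kept = [ln for ln in encoded.split("\n") if ln.startswith(prefix)]
    # Step 4: decode in reverse order: newlines first, then "@".
    return "\n".join(kept).replace("@n", "\n").replace("@a", "@")


text = "B1X 1\nC1 a@nb\nB1Y 2\nC1 @\n"
print(filter_groups_joined(text, "B1Y"))
```

The order of decoding matters: "@n" has to be turned back into a newline before "@a" is turned back into "@", otherwise an "@" from the data that is followed by "n" would be mistaken for an encoded newline. Passing `require_space=False` drops the trailing-space requirement, so a first row such as B10119628000D5000 would also match the key B10119628000D5.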



If the space is not required after the search string filter, as per comment, then the solutions change as follows:

option 1:

@echo off
setlocal enableDelayedExpansion
set "print="
(for /f "delims=" %%A in (input.txt) do (
  set "ln=%%A"
  if defined print if "!ln:~0,2!" equ "B1" set "print="
  if not defined print if "!ln:~0,14!" equ "B10119628000D5" set print=1
  if defined print echo %%A
)) >output.txt

option 2:

type input.txt|repl \n(?!B1) @ m|findstr /b B10119628000D5|repl @ \n x >output.txt

option 3:

type input.txt|repl @ @a|repl \n(?!B1) @n m|findstr /b B10119628000D5|repl @n \n x|repl @a @ >output.txt
dbenham
  • Thanks, but can you update the first solution to select the group when the B1 row starts with B10119628000D5? – user3219897 Apr 17 '14 at 09:53
  • Done - although the required change should be fairly obvious. – dbenham Apr 17 '14 at 11:47
  • The point was not to change the number but to check that the B1 row starts with B10119628000D5, unlike before, when it had to match B10119628000D5 exactly before the space. So the check on the text before the space has to be removed. – user3219897 Apr 17 '14 at 12:44
  • @user3219897 - Ah, got it. I had inadvertently dropped the space requirement from the original 2nd and 3rd options, so I modified them to require the space. I then added 3 variants at the end where the space is not required. – dbenham Apr 17 '14 at 13:52
  • Not working; e.g. a row that starts with B10119628000D5000 must also be included now. – user3219897 Apr 17 '14 at 14:42
  • @user3219897 - It was a silly bug caused by careless cut and paste. I had an extra set of quotes in the comparison that needed to be removed. All fixed. – dbenham Apr 17 '14 at 16:39