Is there a simple, well-performing way to read two (or more) text files line by line in parallel, that is, a loop that reads one line from each text file in every iteration?

A for /F loop given multiple files cannot be used, because it reads the files one after another. Nesting such loops does not help either, since the inner loop would read its entire file for every line of the outer one.
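To illustrate the sequential behaviour, here is a minimal sketch, assuming two files named names.txt and values.txt exist in the current directory (both names are placeholders for illustration):

```batch
@echo off
rem // Assumption: `names.txt` and `values.txt` exist in the current directory.
rem // `for /F` works through the listed files one after another, so all
rem // lines of the 1st file are echoed before any line of the 2nd file:
for /F "usebackq delims=" %%L in ("names.txt" "values.txt") do echo(%%L
```

Note that for /F also skips empty lines and, with the default eol setting, lines beginning with a semicolon.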

aschipfl

1 Answer

The trick has three parts: apply input redirection < (see also this site) to the entire code block that reads the files, giving each file one of the undefined handles (3 to 9); use the command set /P inside the block to actually read a line; and prefix each set /P with 0<& to redirect the respective handle back to STDIN, which is where set /P expects its input.
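Reduced to its core, the idea can be sketched as follows. This is only a minimal illustration that reads the first line of each file; it assumes two files named names.txt and values.txt in the current directory, and the variable names LINE1 and LINE2 are arbitrary:

```batch
@echo off
setlocal EnableExtensions EnableDelayedExpansion

rem // Assumption: `names.txt` and `values.txt` exist in the current directory.
rem // Each file is attached to its own undefined handle for the whole block:
4< "names.txt" 3< "values.txt" (
    rem // Clear the variables, as `set /P` keeps the old value on empty input:
    set "LINE1="
    set "LINE2="
    rem // Redirect handle `4` to `STDIN` and read one line of the 1st file:
    0<&4 set /P "LINE1="
    rem // Redirect handle `3` to `STDIN` and read one line of the 2nd file:
    0<&3 set /P "LINE2="
    rem // Output the pair of lines just read:
    echo(!LINE1!=!LINE2!
)

endlocal
```

Because the files stay open for the whole block, every further set /P on the same handle continues with the next line of that file, which is what makes a loop over all lines possible.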

Here is an example of how it works:

Suppose there are the following two text files, names.txt...:

Black
Blue
Green
Aqua
Red
Purple
Yellow
White
Grey
Brown

...and values.txt...:

0
1
2
3
4
5
6
7

...and the goal is to combine them line by line to produce this file, names=values.txt...:

Black=0
Blue=1
Green=2
Aqua=3
Red=4
Purple=5
Yellow=6
White=7

...the following code accomplishes that (see the explanatory rem comments):

@echo off
setlocal EnableExtensions EnableDelayedExpansion

rem // Define constants here:
set "FILE1=names.txt"
set "FILE2=values.txt"
set "RET=names=values.txt" & rem // (none to output to console)
if not defined RET set "RET=con"

rem /* Count number of lines of 1st file (2nd file is not checked);
rem    this is necessary to know when to stop reading: */
for /F %%C in ('^< "%FILE1%" find /C /V ""') do set "NUM1=%%C"

rem /* Here input redirection is used, each file gets its individual
rem    (undefined) handle (that is not used by the system) which is later
rem    redirected to handle `0`, `STDIN`, in the parenthesised block;
rem    so the 1st file data stream is redirected to handle `4` and the
rem    2nd file to handle `3`; within the block, as soon as a line is read
rem    by `set /P` from a data stream, the respective handle is redirected
rem    back to `0`, `STDIN`, where `set /P` expects its input data: */
4< "%FILE1%" 3< "%FILE2%" > "%RET%" (
     rem // Loop through the number of lines of the 1st file:
     for /L %%I in (1,1,%NUM1%) do (
         set "LINE1=" & rem /* (clear variable to maintain empty lines;
                        rem     `set /P` does not change variable value
                        rem     in case nothing is entered/redirected) */
         rem // Change handle of 1st file back to `STDIN` and read line:
         0<&4 set /P "LINE1="
         set "LINE2=" & rem // (clear variable to maintain empty lines)
         rem // Change handle of 2nd file back to `STDIN` and read line:
         0<&3 set /P "LINE2="
         rem /* Return combined pair of lines (only if line of 2nd file is
         rem    not empty as `set /P` sets `ErrorLevel` on empty input): */
         if not ErrorLevel 1 echo(!LINE1!=!LINE2!
     )
)

endlocal
exit /B
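As a possible alternative for exactly two files, here is a sketch of a related and commonly used technique, not the method described above: one file is redirected to STDIN for the whole block and read by set /P, while for /F iterates over the other. It again assumes names.txt and values.txt in the current directory; the variable name NAME is arbitrary:

```batch
@echo off
setlocal EnableExtensions EnableDelayedExpansion

rem // Assumption: `names.txt` and `values.txt` exist in the current directory.
rem // The 1st file is redirected to `STDIN` for the whole block, while
rem // `for /F` opens and iterates over the 2nd file on its own:
< "names.txt" (
    for /F "usebackq delims=" %%V in ("values.txt") do (
        rem // Clear the variable to maintain empty lines of the 1st file:
        set "NAME="
        rem // Read the next line of the 1st file from `STDIN`:
        set /P "NAME="
        echo(!NAME!=%%V
    )
)

endlocal
```

This variant needs no line count, because the loop ends with the last line of values.txt. The trade-off is that for /F skips empty lines and, by default, lines beginning with a semicolon, so the pairs go out of step if values.txt contains such lines; exclamation marks in the data are also lost while delayed expansion is enabled.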
aschipfl
  • Yes, this method was already used [here](http://stackoverflow.com/questions/32738831/extracting-all-lines-from-multiples-files/32739680#32739680), or [here](http://stackoverflow.com/questions/14521799/combinining-multiple-text-files-into-one/14523100#14523100), or [here](http://stackoverflow.com/questions/28850167/solved-merge-several-csv-file-side-by-side-using-batch-file/28864990#28864990), or [here](http://stackoverflow.com/questions/32238565/windows-batch-file-combine-csv-in-a-folder-by-column/32254700#32254700), or [here](http://www.dostips.com/forum/viewtopic.php?f=3&t=3126)... – Aacini Jul 06 '16 at 05:33
  • @Aacini, thanks for the links! It seems that I used the wrong search terms (_parallel_, _simultaneous_, _concurrent_,...); _combining_ files is just used as an example here... – aschipfl Jul 06 '16 at 19:56