1

I was looking two merge several csv files into one. However I need them merging so that the columns are side by side not continuing on. My files have two columns each time and I would have one csv files by extracting the 2nd column on each files and copying into another one so I will have a file with x columns (the second one).

e.g.

File1

A B
1 2
1 2
1 2

File2

A C
1 3
1 3
1 3

Filex

A X
1 x
1 x
1 x

Result

B C X
2 3 x
2 3 x
2 3 x

I found this : Merge csv file side by side using batch file

but it's just for two files and it doesn't extract anything.

Thanks.

Community
  • 1
  • 1

3 Answers3

3

All the solutions below assume all input files are formatted consistently and have the same number of lines.

The technique used at Merge csv file side by side using batch file can be modified to parse and support more than 2 files. I also toggle delayed expansion on and off within the loop to protect any ! that may appear in the data. FOR variables containing ! are corrupted if they are expanded while delayed expansion is enabled:

@echo off
setlocal disableDelayedExpansion
3<"test2.txt" 4<"test3.txt" (
  for /f "usebackq tokens=2 delims= " %%A in ("test1.txt") do (
    set "A=%%A"
    set /p "B=" <&3
    set /p "C=" <&4
    setlocal enableDelayedExpansion
    echo !A! !B:* =! !C:* =!
    endlocal
  )
) >"result.txt"

The above can be extended to support up to 9 input files using the FOR loop plus handles 0, and 3-9. If you have more than 8 inputs, then you need multiple loops. The first loop can process the first 9 files and write the partial result to a temporary file. Successive loops can read from the temporary file and merge up to 8 additional files.

The above may become cumbersome if your parsing rules become more complex.

My JREPL.BAT hybrid JScript/batch utility can be used to efficiently parse and merge any number of files, and you can modify the regular expressions as needed to parse nearly any csv file format.

@echo off
setlocal
set "merge=jrepl ".*( .*)" "stdin.ReadLine()+$1" /j /f"
jrepl ".* (.*)" "$1" /f test1.txt | %merge% test2.txt | %merge% test3.txt >result.txt

In theory you could use any number of pipes to support all your input files, but if you get too many, it will probably become inefficient. You could use temporary files to stage the merge to preserve efficiency.

Community
  • 1
  • 1
dbenham
  • 127,446
  • 28
  • 251
  • 390
3

The method originally described at this post can be modified in order to process a variable number of files (maximum 8), so you just need to put the desired files in the parameter:

@echo off
setlocal DisableDelayedExpansion

rem MergeFiles.bat: Merge several files horizontally
rem Antonio Perez Ayala

rem Process files in the arguments and 
rem assemble the lists of redirections and SET /P commands
set file1=%1
set "redirs="
set "commands="
set n=2
:nextFile
   shift
   if "%~1" equ "" goto endFiles
   set /A n+=1
   set "redirs=%redirs% %n%<%1"
   set "commands=%commands% & set /P "part=!part:* =! " <&%n%"
goto nextFile
:endFiles

rem First file is read with FOR /F command
rem The rest of files are read via standard handles, starting at # 3

%redirs% (
   for /F "usebackq delims=" %%a in (%file1%) do (
      rem Get first part from first file
      set "part=%%a"
      rem Output parts from all files, excepting the last one
      setlocal EnableDelayedExpansion
      %commands:~3%
      rem Output part from last file
      echo !part:* =!
      endlocal
   )
) > result.txt

For example:

C:\> type file?.txt

file1.txt

A B
1 2
1 2
1 2

file2.txt

A C
1 3
1 3
1 3

fileX.txt

A X
1 x
1 x
1 x

fileY.txt

A Y
1 y
1 y
1 y

C:\> MergeFiles.bat file1.txt file2.txt fileX.txt fileY.txt

C:\> type result.txt
B C X Y
2 3 x y
2 3 x y
2 3 x y
Community
  • 1
  • 1
Aacini
  • 65,180
  • 12
  • 72
  • 108
2
@ECHO OFF
SETLOCAL enabledelayedexpansion
:: remove variables starting $
FOR  /F "delims==" %%a In ('set $ 2^>Nul') DO SET "%%a="
SET /a $count=0
FOR %%a IN (q28850167*.txt) DO (
 FOR  /f "tokens=1,2*delims=: " %%b IN ('findstr /n /r "^" "%%a"') DO (
  SET $%%b=!$%%b! %%d
  IF !$count! LSS %%b SET /a $count=%%b
 )
)
(
FOR /L %%a IN (1,1,%$count%) DO ECHO(!$%%a:~1!
)>newfile.txt


GOTO :EOF

I used files named q28850167*.txt containing your data for my testing.
Produces newfile.txt

Clear all variables starting $

For each file matching the mask, process each line through findstr/n to apply number : to the start of each line. Select the first and third+ columns using : and Space as separators and append to variable $linenumber Keep track of the highest line number in $count.

Then simply reproduce each line from the stored data.

Magoo
  • 77,302
  • 8
  • 62
  • 84
  • 1
    The file names may not sort in the correct column order, in which case you can explicitly list each file in the correct order within the outer FOR loop - `for .... in ( file1 file2 file3 ) do (` – dbenham Mar 04 '15 at 15:15