
I have a CSV file with 16 columns that looks like this:

WEB QUEST|Lazaro|Martinez|0|Consultor de ventas|Mexico|DF|55457110|55450327|53445299|0|05/10/1999|0|0|0

I've tried to merge 3 columns together with awk and sed, but for some reason I still don't get the desired output:

WEB QUEST|Lazaro Martinez 0|Consultor de ventas|Mexico|DF|55457110|55450327|53445299|0|05/10/1999|0|0|0

When I tried

awk -F "|" '{print $1,"|"$2,$3,$4,"|"$5...}'

spaces are added around each | for some reason, and this is the output I get:

EB QUEST |Maria Valencia Loza |Consultor de ventas | Mexico |DF | 55457110 | 55450327 | 4003071 | evalencia@webquest.com.mx | 05/10/1999 | 0 |0 |0

Any ideas?

oguz ismail

6 Answers


The comma in Awk's print inserts the output field separator OFS, which is a space by default. To simply concatenate strings, omit the commas.

awk -F "|" '{print $1 "|" $2 $3 $4 "|" $5...}'

Probably a better approach is to set OFS="|" and simply shift out the third and fourth fields, as explained in Is there a way to completely delete fields in awk, so that extra delimiters do not print?
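A minimal sketch of that shifting approach, assuming GNU awk (where assigning a smaller value to NF truncates the record):

awk -F'|' -v OFS='|' '{
  $2 = $2 " " $3 " " $4      # merge fields 2-4 into field 2
  for (i = 5; i <= NF; i++)  # shift the remaining fields left by two
    $(i - 2) = $i
  NF -= 2                    # drop the now-duplicated trailing fields
  print
}' file

On the sample line this prints the 13-field record shown as the desired output.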

tripleee

With bash and GNU sed:

sed -e's/|/ /2'{,} file

Explanation:

-e's/|/ /2'{,}

expands to

-e's/|/ /2' -e's/|/ /2'

(see brace expansion), so it replaces the second pipe with a space, twice; thus the 2nd, 3rd and 4th fields are merged.
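For illustration, running it on the sample line from the question:

$ sed -e's/|/ /2'{,} <<<'WEB QUEST|Lazaro|Martinez|0|Consultor de ventas|Mexico|DF|55457110|55450327|53445299|0|05/10/1999|0|0|0'
WEB QUEST|Lazaro Martinez 0|Consultor de ventas|Mexico|DF|55457110|55450327|53445299|0|05/10/1999|0|0|0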

oguz ismail

You can join columns 2, 3 and 4 like this, but it will leave you with empty columns 3 and 4:

awk -F\| -v OFS='|' '{$2=$2" "$3" "$4;$3=$4="";print $0}' file
WEB QUEST|Lazaro Martinez 0|||Consultor de ventas|Mexico|DF|55457110|55450327|53445299|0|05/10/1999|0|0|0

This can be solved by printing only the fixed first fields, then looping through the rest:

awk -F\| '{a=$2" "$3" "$4; printf "%s"FS"%s",$1,a; for (i=5;i<=NF;i++) printf FS"%s",$i; print ""}' file
WEB QUEST|Lazaro Martinez 0|Consultor de ventas|Mexico|DF|55457110|55450327|53445299|0|05/10/1999|0|0|0

You can remove the empty columns like this, but if there are empty columns in the original data they are gone too:

awk -F\| -v OFS='|' '{$2=$2" "$3" "$4;$3=$4="";gsub(/[|]+/,FS)}1' file
WEB QUEST|Lazaro Martinez 0|Consultor de ventas|Mexico|DF|55457110|55450327|53445299|0|05/10/1999|0|0|0
Jotne

sed, matching the first four fields explicitly:

sed 's/\([^|]*\)|\([^|]*\)|\([^|]*\)|\([^|]*\)/\1|\2 \3 \4/' <<<"WEB QUEST|Lazaro|Martinez|0|Consultor de ventas|Mexico|DF|55457110|55450327|53445299|0|05/10/1999|0|0|0"

will output:

WEB QUEST|Lazaro Martinez 0|Consultor de ventas|Mexico|DF|55457110|55450327|53445299|0|05/10/1999|0|0|0
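If you prefer fewer backslashes, the same substitution can be written with extended regular expressions (GNU sed's -E); a sketch assuming the same input file:

sed -E 's/([^|]*)\|([^|]*)\|([^|]*)\|([^|]*)/\1|\2 \3 \4/' file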
KamilCuk

Though your full requirement is not entirely clear, judging by your expected sample output it seems you don't want extra spaces around the | delimiters. If that is the case, you need to set OFS to | as follows. Written and tested with GNU awk.

awk '
BEGIN{
  s1=" "
  FS=OFS="|"
  re="(.*)\\|\\|\\|(.*)"
}
prev{
  # print the previous record, collapsing the ||| left behind by the emptied fields
  print gensub(re,"\\1|\\2","1",prev)
}
{
  $2=$2 s1 $3 s1 $4
  $3=$4=""
  prev=$0
}
END{
  if(prev){
    print gensub(re,"\\1|\\2","1",prev)
  }
}
' Input_file
RavinderSingh13

You said you tried with awk and sed; however, you are not bound to those. You can also use read.

Example:

#!/bin/bash
exec 3<file.csv
while IFS="|" read -r foo var1 var2 var3 bar <&3; do
  printf "%s|%s %s %s|%s\n" "${foo}" "${var1}" "${var2}" "${var3}" "${bar}"
done
exec 3>&-

Input:

WEB QUEST|Lazaro|Martinez|0|Consultor de ventas|Mexico|DF|55457110|55450327|53445299|0|05/10/1999|0|0|0

Output:

WEB QUEST|Lazaro Martinez 0|Consultor de ventas|Mexico|DF|55457110|55450327|53445299|0|05/10/1999|0|0|0

N.B. The exec part is just there as "best practice"; you can achieve the same result without it.

E.G.

#!/bin/bash
while IFS="|" read -r foo var1 var2 var3 bar; do
  printf "%s|%s %s %s|%s\n" "${foo}" "${var1}" "${var2}" "${var3}" "${bar}"
done <file.csv
ingroxd
`while read` is pathologically slow in Bash. You really want to prefer a proper tool like `sed` or Awk. – tripleee Oct 20 '19 at 08:59