
I have a CSV file with 16 columns that looks like this:

WEB QUEST|Lazaro|Martinez|0|Consultor de ventas|Mexico|DF|55457110|55450327|53445299|0|05/10/1999|0|0|0

I've tried to merge 3 columns together with awk and sed, but for some reason I still don't get the desired output:

WEB QUEST|Lazaro Martinez 0|Consultor de ventas|Mexico|DF|55457110|55450327|53445299|0|05/10/1999|0|0|0

When I tried

awk -F "|" '{print $1,"|"$2,$3,$4,"|"$5...}'

spaces are added around each | for some reason, and this is the output I get:

EB QUEST |Maria Valencia Loza |Consultor de ventas | Mexico |DF | 55457110 | 55450327 | 4003071 | evalencia@webquest.com.mx | 05/10/1999 | 0 |0 |0

Any ideas?

oguz ismail

6 Answers


The comma in Awk's print inserts the output field separator OFS, which is a space by default. To simply concatenate strings, omit the commas.

awk -F "|" '{print $1 "|" $2 $3 $4 "|" $5...}'

Probably a better approach is to set OFS="|" and simply shift out the third and fourth fields, as explained in Is there a way to completely delete fields in awk, so that extra delimiters do not print?
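A minimal sketch of that shifting approach, assuming GNU awk (where assigning a smaller value to NF truncates the record):

awk -F'|' -v OFS='|' '{
  $2 = $2 " " $3 " " $4      # merge fields 2-4 into field 2
  for (i = 5; i <= NF; i++)  # shift the remaining fields left by two
    $(i - 2) = $i
  NF -= 2                    # drop the now-duplicated trailing fields
  print
}' file

On the sample line this prints the 13-field record shown as the desired output.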

tripleee

With bash and GNU sed:

sed -e's/|/ /2'{,} file

Explanation:

-e's/|/ /2'{,}

expands to

-e's/|/ /2' -e's/|/ /2'

(see brace expansion), so it replaces the second pipe with a space, twice; thus the 2nd, 3rd and 4th fields are merged.
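For illustration, running it on the sample line from the question:

$ sed -e's/|/ /2'{,} <<<'WEB QUEST|Lazaro|Martinez|0|Consultor de ventas|Mexico|DF|55457110|55450327|53445299|0|05/10/1999|0|0|0'
WEB QUEST|Lazaro Martinez 0|Consultor de ventas|Mexico|DF|55457110|55450327|53445299|0|05/10/1999|0|0|0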

oguz ismail

You can join columns 2, 3 and 4 like this, but it will leave you with empty columns 3 and 4:

awk -F\| -v OFS='|' '{$2=$2" "$3" "$4;$3=$4="";print $0}' file
WEB QUEST|Lazaro Martinez 0|||Consultor de ventas|Mexico|DF|55457110|55450327|53445299|0|05/10/1999|0|0|0

This can be solved by printing only the fixed first fields, then looping through the rest:

awk -F\| '{a=$2" "$3" "$4; printf "%s"FS"%s",$1,a; for (i=5;i<=NF;i++) printf FS"%s",$i; print ""}' file
WEB QUEST|Lazaro Martinez 0|Consultor de ventas|Mexico|DF|55457110|55450327|53445299|0|05/10/1999|0|0|0

You can remove the empty columns like this, but if there are empty columns in the original data they are gone too:

awk -F\| -v OFS='|' '{$2=$2" "$3" "$4;$3=$4="";gsub(/[|]+/,FS)}1' file
WEB QUEST|Lazaro Martinez 0|Consultor de ventas|Mexico|DF|55457110|55450327|53445299|0|05/10/1999|0|0|0
Jotne

sed, matching the first four fields explicitly:

sed 's/\([^|]*\)|\([^|]*\)|\([^|]*\)|\([^|]*\)/\1|\2 \3 \4/' <<<"WEB QUEST|Lazaro|Martinez|0|Consultor de ventas|Mexico|DF|55457110|55450327|53445299|0|05/10/1999|0|0|0"

will output:

WEB QUEST|Lazaro Martinez 0|Consultor de ventas|Mexico|DF|55457110|55450327|53445299|0|05/10/1999|0|0|0
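If you prefer fewer backslashes, the same substitution can be written with extended regular expressions (GNU sed's -E); a sketch assuming the same input file:

sed -E 's/([^|]*)\|([^|]*)\|([^|]*)\|([^|]*)/\1|\2 \3 \4/' file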
KamilCuk

Though your full requirement is not entirely clear, judging by your expected sample output it seems you don't want extra spaces around the | delimiters. If that is the case, you need to set OFS to | as follows. Written and tested with GNU awk.

awk '
BEGIN{
  s1=" "
  FS=OFS="|"
  re="(.*)\\|\\|\\|(.*)"
}
prev{
  # print the previous record, collapsing the ||| left behind by the emptied fields
  print gensub(re,"\\1|\\2","1",prev)
}
{
  $2=$2 s1 $3 s1 $4
  $3=$4=""
  prev=$0
}
END{
  if(prev){
    print gensub(re,"\\1|\\2","1",prev)
  }
}
' Input_file
RavinderSingh13

You said you tried with awk and sed; however, you are not bound to those. You can also use read.

Example:

#!/bin/bash
exec 3<file.csv
while IFS="|" read -r foo var1 var2 var3 bar <&3; do
  printf "%s|%s %s %s|%s\n" "${foo}" "${var1}" "${var2}" "${var3}" "${bar}"
done
exec 3>&-

Input:

WEB QUEST|Lazaro|Martinez|0|Consultor de ventas|Mexico|DF|55457110|55450327|53445299|0|05/10/1999|0|0|0

Output:

WEB QUEST|Lazaro Martinez 0|Consultor de ventas|Mexico|DF|55457110|55450327|53445299|0|05/10/1999|0|0|0

N.B. The exec part is just there as "best practice"; you can achieve the same result without it.

E.G.

#!/bin/bash
while IFS="|" read -r foo var1 var2 var3 bar; do
  printf "%s|%s %s %s|%s\n" "${foo}" "${var1}" "${var2}" "${var3}" "${bar}"
done <file.csv
ingroxd
`while read` is pathologically slow in Bash. You really want to prefer a proper tool like `sed` or Awk. – tripleee Oct 20 '19 at 08:59