1

I have a *.csv file. with value as below

"ASDP02","8801942183589"

"ASDP06","8801939151023"

"CSDP04","8801963981740"

"ASDP09","8801946305047"

"ASDP12","8801941195677"

"ASDP05","8801922826186"

"CSDP08","8801983008938"

"ASDP04","8801944346555"

"CSDP11","8801910831518"

or sometimes the value is as below

"8801989353984","KSDP05" 

"8801957608165","ASDP11" 

"8801991455848","CSDP10" 

"8801981363116","CSDP07" 

"8801921247870","KSDP07" 

"8801965386240","CSDP06" 

"8801956293036","KSDP10" 

"8801984383904","KSDP11" 

"8801944211742","ASDP09" 

I just want to put the numeric value (e.g. 8801989353984) always in 1st column. Is it possible using BASH script?

Ruslan Osmanov
  • 20,486
  • 7
  • 46
  • 60
  • 2
    Generally you shouldn't parse CSV in Bash, as it is difficult to write a good parser in Bash for the general case where the cells may contain newlines, for instance. – Ruslan Osmanov Dec 16 '16 at 17:20

5 Answers5

1

Sed is also your friend here

Input

cat 41189347
"ASDP02","8801942183589"
"ASDP06","8801939151023"
"CSDP04","8801963981740"
"ASDP09","8801946305047"
"ASDP12","8801941195677"
"ASDP05","8801922826186"
"CSDP08","8801983008938"
"ASDP04","8801944346555"
"CSDP11","8801910831518"

Script

sed -E 's/^("[[:alpha:]]+.*"),("[[:digit:]]+")$/\2,\1/' 41189347

Output

"8801942183589","ASDP02"
"8801939151023","ASDP06"
"8801963981740","CSDP04"
"8801946305047","ASDP09"
"8801941195677","ASDP12"
"8801922826186","ASDP05"
"8801983008938","CSDP08"
"8801944346555","ASDP04"
"8801910831518","CSDP11"
sjsam
  • 21,411
  • 5
  • 55
  • 102
1

awk to the rescue!

$ awk -F, -v OFS=, '$1~/[A-Z]/{t=$2;$2=$1;$1=t}1' file

if first field has alpha chars, swap first and second columns and print.

karakfa
  • 66,216
  • 7
  • 41
  • 56
0

Bash can do the work but awk might be a better choice for rearrange your file:

sample.csv:

"ASDP02","8801942183589"
"8801944211742","ASDP09"

command:

awk -F, 'BEGIN{OFS=","}{$1=$1;if(substr($1, 2, length($1) - 2) + 0 == substr($1, 2, length($1) - 2)){print $1,$2}else{print $2,$1}}' sample.csv
  • substr($1, 2, length($1) - 2) + 0 == substr($1, 2, length($1) - 2) checks the column is numeric or not. If it is, print the original line otherwise switch column1 and column2

Output:

"8801942183589","ASDP02"
"8801944211742","ASDP09"
Haifeng Zhang
  • 30,077
  • 19
  • 81
  • 125
  • I just want to rearrange the column with numeric value in 1st column wherever the column is in the csv file – Kazi Nymul Haque Kanon Dec 16 '16 at 17:29
  • @KaziNymulHaqueKanon check it now. I thought they are in the different files – Haifeng Zhang Dec 16 '16 at 17:33
  • Suppose there is 3 columns and i want to get the same output as before. "KSDP10","20","1910427539" "KSDP10","20","1910427642" "KSDP10","20","1910427666" "KSDP10","20","1910427704" "KSDP10","20","1910427720" "KSDP10","20","1910427738" "KSDP10","20","1910427936" "KSDP10","20","1910428023" "KSDP10","20","1910428046" – Kazi Nymul Haque Kanon Dec 16 '16 at 19:17
0

The following commands assume that the cells in the CSV files do not contain newlines and commas. Otherwise, you should write a more complicated script in Perl, PHP, or other programming language capable of parsing CSV files properly. But Bash, definitely, is not appropriate for this task.


Perl

perl -F, -nle '@F = reverse @F if $F[0] =~ /^"\d+"$/;
print join(",", @F)' file

Beware, If the cells contain newlines, or commas, use Perl's Text::CSV module, for instance. Although it is a simple task in Perl, it goes beyond the scope of the current question.

The command splits the input lines by commas (-F,) and stores the result into @F array, for each line. The items in the array are reversed, if the first field $F[0] matches the regular expression. You can also swap the items this way: ($F[0], $F[1]) = ($F[1], $F[0]).

Finally, the joins the array items with commas, and prints to the standard output.

If you want to edit the file in-place, use -i option: perl -i.backup -F, ....

AWK

awk -F, -vOFS=, '/^"[0-9]+",/ {print; next}
{ t = $1; $1 = $2; $2 = t; print }' file

The input and output field separators are set to , with -F, and -vOFS=,.

If the line matches the pattern /^"[0-9]+",/ (the line begins with a "numeric" CSV column), the script prints the record and advances to the next record. Otherwise the next block is executed.

In the next block, it swaps the first two columns and prints the result to the standard output.

If you want to edit the file in-place, see answers to this question.

Community
  • 1
  • 1
Ruslan Osmanov
  • 20,486
  • 7
  • 46
  • 60
0

You can create a pure bash script to generate other file which has the structure you need:

#!/bin/bash

csv_file="/path/to/your/csvfile"
output_file="/path/to/output_file"

#Optional
rm -rf "${output_file}"

readarray -t LINES < <(cat < "${csv_file}" 2> /dev/null)

for item in "${LINES[@]}"; do

  if [[ $item =~ ^\"([0-9A-Z]+)\"\,\"([0-9]+)\" ]]; then
    echo "\"${BASH_REMATCH[2]}\",\"${BASH_REMATCH[1]}\"" >> "${output_file}"
  else
    echo "$item" >> "${output_file}"
  fi
done

This works even if your file is "mixed" I mean with some lines in the right format and other lines in the bad format.

OscarAkaElvis
  • 5,384
  • 4
  • 27
  • 51