1

I was wondering how I can get the names of the fruits from this .csv file using awk or some other CLI tool.

I used a macro in Vim to edit the file, but I would think there is an easy one-liner that would do the same.

fruits.csv:

"1000","Apple","4","133"
"1028","Lemon","3","120"
"1029","Lime","3","165"
"1030","Lychee","6","120"
"1031","Mango","6","131"
"1032","Mangostine","1","181"
"1033","Melon","4","159"
"1034","Cantaloupe","4","138"
"1035","Honeydew melon","4","155"
"1036","Watermelon","5","176"
"1037","Rock melon","2","180"
"1038","Nectarine","1","128"
"1039","Orange","6","142"
"1040","Peach","6","179"
"1041","Pear","3","102"
"1042","Williams pear or Bartlett pear","1","164"
"1043","Pitaya","2","170"
"1044","Physalis","5","166"
"1045","Plum/prune (dried plum)","4","103"
"1046","Pineapple","3","120"
"1047","Pomegranate","5","112"
"1048","Raisin","4","111"
"1049","Raspberry","5","156"
"1050","Western raspberry (blackcap)","6","173"

The final result that I would want would look like this:

Apple
Lemon
Lime
Lychee
Mango
Mangostine
Melon
Cantaloupe
Honeydew melon
Watermelon
Rock melon
Nectarine
Orange
Peach
Pear
Williams pear or Bartlett pear
Pitaya
Physalis
Plum/prune (dried plum)
Pineapple
Pomegranate
Raisin
Raspberry
Western raspberry (blackcap)

I realize that this is a duplicate:

What's the most robust way to efficiently parse CSV using awk?

How to parse a CSV in a Bash script?

  • Using the presented duplicate, you quickly come to the answer: `awk -v FPAT='[^,]*|"[^"]+"' '{print $2}' file.csv` – kvantour Feb 03 '21 at 21:30
  • Here are some other methods too. Method 1: `grep -o "[a-zA-Z() ]*" fruits.csv` to match all desired characters. Method 2: `cut -d"," -f2 fruits.csv | sed 's/"//g'`: define a delimiter with `-d","`, choose an index with `-f2`, and remove the quotes with sed. Method 3: `sed 's/[^,]*,//;s/,.*//;s/"//g' fruits.csv`: remove everything up through the first comma, remove everything from the second comma onward, and remove the quotes. – Barak Binyamin Feb 04 '21 at 15:12

5 Answers

3

I suggest:

awk -F '","' '{print $2}' file

This uses `","` (quote, comma, quote) as the field separator and prints the second column.
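
A minimal sketch of this approach on two rows from the question, assuming every field is double-quoted as in fruits.csv. The function name `second_field` is only an illustration and is not part of the original answer.

```shell
# Hypothetical helper: print the second field of a fully quoted CSV.
# Splitting on the three characters "," leaves $2 without any quotes.
second_field() {
  awk -F '","' '{ print $2 }' "$@"
}

# Usage: two rows taken from fruits.csv, fed on stdin.
printf '%s\n' '"1000","Apple","4","133"' '"1035","Honeydew melon","4","155"' | second_field
```

With this separator the first and last fields each keep one quote (`"1000` and `133"`), so only the inner columns come out clean.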

Cyrus
1

Using a combination of sed and awk:

sed -e 's/^"//;s/","/\t/g;s/"//g' Input.csv | awk -F'\t' '{print $2}'

or

awk -F, '{print $2}' Input.csv | sed 's/"//g'

Either command can print a different column by changing the awk field number.

Ravi Saroch
0

Use this Perl one-liner:

perl -F',' -lane '$F[1] =~ tr/"//d; print $F[1];' in_file > out_file 

The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-n : Loop over the input one line at a time, assigning it to $_ by default.
-l : Strip the input line separator ("\n" on *NIX by default) before executing the code in-line, and append it when printing.
-a : Split $_ into array @F on whitespace or on the regex specified in the -F option.
-F',' : Split into @F on comma, rather than on whitespace.

SEE ALSO:
perldoc perlrun: how to execute the Perl interpreter: command line switches

Timur Shtatland
0

GNU awk and gensub():

$ gawk '{print gensub(/^[^,]*,"|([^,])".*/,"\\1","g")}' file

Output

Apple
...
Lemon
Lime
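
gensub() is specific to GNU awk. Where gawk is not available, the same idea (delete everything up to the name's opening quote, then everything from its closing quote onward) can be sketched with POSIX sed. This is an illustration rather than part of the original answer; the helper name `name_with_sed` is hypothetical, and it assumes the name itself contains no double quote.

```shell
# Hypothetical helper: extract the second quoted field with two sed deletions.
name_with_sed() {
  # 1st expression: drop the first field, its comma, and the name's opening quote.
  # 2nd expression: drop everything from the name's closing quote to end of line.
  sed -e 's/^[^,]*,"//' -e 's/".*//' "$@"
}

# Usage: two rows from fruits.csv on stdin.
printf '%s\n' '"1028","Lemon","3","120"' '"1045","Plum/prune (dried plum)","4","103"' | name_with_sed
```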
James Brown
0

With awk, removing the " characters only from the second field, and only at the beginning and the end of that field:


awk -F',' '{gsub(/^"|"$/,"",$2);print $2}' file
Apple
Lemon
Lime
Lychee
Mango
Mangostine
Melon
Cantaloupe
Honeydew melon
Watermelon
Rock melon
Nectarine
Orange
Peach
Pear
Williams pear or Bartlett pear
Pitaya
Physalis
Plum/prune (dried plum)
Pineapple
Pomegranate
Raisin
Raspberry
Western raspberry (blackcap)
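
Splitting on the comma assumes that no fruit name contains a comma. A minimal sketch of one alternative, assuming every field is quoted and no field contains an escaped quote: use the double quote itself as the separator, so that the name is always field 4. The helper name `names_by_quote` is hypothetical, and the row with id 9999 is invented for illustration; it does not appear in fruits.csv.

```shell
# Hypothetical helper: split on the double quote instead of the comma.
# Fields are then: $1 empty, $2 the id, $3 a comma, $4 the name, and so on.
names_by_quote() {
  awk -F '"' '{ print $4 }' "$@"
}

# Usage: one real row plus one invented row whose name contains a comma.
printf '%s\n' \
  '"1042","Williams pear or Bartlett pear","1","164"' \
  '"9999","Pear, Williams","1","164"' | names_by_quote
```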
Carlos Pascual