
I need to parse and iterate through a .tsv file using awk.

The file path is correct (I tested it in the terminal), but I am getting the error "cat: ./datalist.tsv No such file or directory".

The TSV file has a few tab-separated rows. The plan is to loop through the contents of the file.

Here is my code, for filename.awk:

Ikhtiar
  • Please add sample input and the corresponding expected output, and try to explain on what basis the conversion is done. – Sundeep Mar 07 '18 at 09:49
  • Possible duplicate of [Tab separated values in awk](https://stackoverflow.com/questions/5374239/tab-separated-values-in-awk) – tripleee Mar 08 '18 at 05:47
  • *"cat: ./datalist.tsv No such file or directory"* looks very much like the path actually **isn't** correct. – tripleee Mar 08 '18 at 05:53
  • Sorry, I later found that the file path was wrong. Thanks for your answer. – Ikhtiar Mar 14 '18 at 02:13

2 Answers


You don't have to use cat to read a TSV file. Instead, just read the file directly.

For example:

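#!/bin/gawk -f
# The -f flag makes gawk read this file as the program; adjust the path if gawk lives elsewhere.
BEGIN {
    FS = "\t"       # split input fields on tabs
    OFS = ","       # join output fields with commas
    ORS = "\r\n"    # end each output record with CRLF
    # Read the file line by line; getline returns 0 at EOF and -1 on error.
    while (( getline < "datalist.tsv" ) > 0) {
        print $1,$2,$3
    }
}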

Example input (all spaces between the fields are just a single tab):

1   2   3
ab  bc  cd
abc bcd cde

Example output:

1,2,3
ab,bc,cd
abc,bcd,cde

NOTE: if the fields inside your TSV file never contain spaces and are never empty, as in my example input, you don't even need to set the field separator with FS="\t", because by default awk splits fields on runs of spaces and tabs.
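A quick way to see why the separator matters is to count fields both ways. This is only a sketch: the sample row below is made up, and its middle field deliberately contains a space.

```shell
# Hypothetical sample row: three tab-separated fields, the middle one contains a space.
row=$(printf 'ab\tb c\tcd')

# Default FS splits on any run of spaces or tabs, so "b c" becomes two fields.
default_split=$(printf '%s\n' "$row" | awk '{ print NF }')

# FS="\t" splits on tabs only, so the row keeps its three fields.
tab_split=$(printf '%s\n' "$row" | awk 'BEGIN { FS = "\t" } { print NF }')

echo "default FS: $default_split fields, tab FS: $tab_split fields"
```

If the two counts differ for your data, set FS explicitly.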

Andriy Makukha

You are enormously overcomplicating things. Why are you doing the reading in the BEGIN block, and why do you set OFS to something other than the output separator you apparently actually want?

awk 'BEGIN { FS="\t"; OFS="_"; ORS="\"\r\n" } { print $1, $2, $3 }' ./datalist.tsv
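To try the same pattern end to end without touching your real data, here is a minimal sketch. The temporary file and its contents are made up, and it assumes a plain comma separator with the default newline terminator instead of the separators above.

```shell
# Build a small sample file in a temporary location (hypothetical data).
tsv=$(mktemp)
printf '1\t2\t3\nab\tbc\tcd\n' > "$tsv"

# The main rule runs once per input line, so no getline loop in BEGIN is needed.
result=$(awk 'BEGIN { FS = "\t"; OFS = "," } { print $1, $2, $3 }' "$tsv")
printf '%s\n' "$result"

# Clean up the temporary file.
rm -f "$tsv"
```

Passing the file name as an argument also means awk itself reports a clear error if the path is wrong.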

If the file is properly TSV there are a few quirks you may need to work around. The format allows for a field to contain the delimiter if it is inside double quotes; apparently, the file you are reading does have double quotes (why else do you put " in the ORS?) so a complete solution would parse the quotes and ignore the field separator if it's inside a pair of (unescaped!) quotes. (See e.g. this question.)
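As a rough sketch of what quote-aware splitting could look like in portable awk: the function name split_quoted and the sample line are hypothetical, and the code makes simplifying assumptions (quote characters are dropped, and escaped quotes are not handled at all).

```shell
# Hypothetical input line: the second field contains a tab inside double quotes.
line=$(printf 'a\t"b\tc"\td')

parsed=$(printf '%s\n' "$line" | awk '
# Split line into out[], treating tabs inside double quotes as data.
# Simplification: quote characters are dropped and escaped quotes are not handled.
function split_quoted(line, out,    i, c, n, inq) {
    split("", out)                      # clear the array portably
    n = 1; out[1] = ""; inq = 0
    for (i = 1; i <= length(line); i++) {
        c = substr(line, i, 1)
        if (c == "\"")              { inq = !inq }          # toggle quoted state
        else if (c == "\t" && !inq) { out[++n] = "" }       # unquoted tab starts a new field
        else                        { out[n] = out[n] c }   # ordinary character
    }
    return n
}
{
    n = split_quoted($0, f)
    # Print the fields joined by "|" so that an embedded tab stays visible.
    for (i = 1; i <= n; i++) printf "%s%s", f[i], (i < n ? "|" : "\n")
}')
printf '%s\n' "$parsed"
```

Walking the line character by character is slower than letting awk split on FS, so this only makes sense if your file really does contain quoted delimiters.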

tripleee
  • I don't think proper TSV allows the delimiter inside a field. At least, not in IANA standard: "Note that fields that contain tabs are not allowable in this encoding." http://www.iana.org/assignments/media-types/text/tab-separated-values – Andriy Makukha Mar 08 '18 at 06:06