11

I have a file containing some data and I want to use only the first column as stdin for my script, but I'm having trouble extracting it. I tried this:

awk -F"\t" '{print $1}' inputs.tsv

but it only shows the first letter of the first column. I tried some other things, but they either show the entire file or just the first letter of the first column.

My file looks something like this:

Harry_Potter    1
Lord_of_the_rings    10
Shameless    23
....
codeforester
Saeko

2 Answers

20

You can use cut which is available on all Unix and Linux systems:

cut -f1 inputs.tsv

You don't need to specify the -d option because tab is the default delimiter. From man cut:

 -d delim
         Use delim as the field delimiter character instead of the tab character.
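As a minimal sketch of the cut approach, the snippet below builds a small tab-separated sample and feeds its first column to a consumer on stdin. The temporary file and the `while read` loop are assumptions made for illustration: they stand in for inputs.tsv and for the asker's script, neither of which is shown in the question.

```shell
# Create a small tab-separated sample in a temporary file (stand-in for inputs.tsv).
sample=$(mktemp)
printf 'Harry_Potter\t1\nLord_of_the_rings\t10\nShameless\t23\n' > "$sample"

# cut splits on tabs by default, so -f1 alone selects the first column.
first_column=$(cut -f1 "$sample")
printf '%s\n' "$first_column"

# Pipe the column into a consumer's stdin; this loop stands in for the real script.
titles_seen=$(cut -f1 "$sample" | while IFS= read -r title; do
    printf 'got:%s\n' "$title"
done)
printf '%s\n' "$titles_seen"

rm -f "$sample"
```

With a real script, the loop would simply be replaced by the script itself on the right-hand side of the pipe.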

As Benjamin W. rightly pointed out in the comments, your awk command is correct. The shell passes the two literal characters \t to awk unchanged, and awk itself interprets them as a tab; other commands, such as cut, may not do this.
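The quoting point can be checked with a quick sketch, assuming any POSIX-style awk is installed. The one-line sample is made up from the question's data; both quoting styles should produce the same first field.

```shell
# One tab-separated line taken from the question's sample data.
line=$(printf 'Harry_Potter\t1')

# Single quotes: the shell passes \t through untouched and awk turns it into a tab.
single=$(printf '%s\n' "$line" | awk -F'\t' '{print $1}')

# Double quotes: the shell also leaves \t alone, so awk receives the same argument.
double=$(printf '%s\n' "$line" | awk -F"\t" '{print $1}')

printf '%s\n%s\n' "$single" "$double"
```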

Not sure why you are getting just the first character as the output.


You may want to take a look at this post:

codeforester
  • 2
    I don't think you're correct in saying that `"\t"` doesn't translate to a tab. The shell won't touch it in double quotes, and awk then *does* interpret `\t` as a tab. Additionally, `$'\t'` is Bash only. I'm pretty sure that `awk -F"\t" '{print $1}'` is a POSIX compliant way of printing the first tab separated field of each line. Example: `awk --posix -F"\t" '{print $1}' <<< $'1\t2'` – Benjamin W. Mar 17 '18 at 20:07
  • 1
    Thanks @BenjaminW. for correcting me. I have updated the answer to include your explanation. – codeforester Mar 17 '18 at 20:47
  • 2
    @BenjaminW. is correct and `awk -F'\t'` is fine, you do not need the bashism of `awk -F$'\t'`. The difference between single and double quotes is also not relevant in this case. – Ed Morton Mar 17 '18 at 21:13
2

Try this (better rely on a real parser...):

csvcut -t -c 1 file

Check csvkit

Output :

Harry_Potter
Lord_of_the_rings
Shameless

Note :

As @RomanPerekhrest said, you should fix your broken sample input (we saw spaces where tabs are expected...)
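If the file really does contain spaces rather than tabs, that would explain the "entire file" symptom: cut prints lines that contain no delimiter unchanged. The sketch below illustrates this on a single space-separated line copied from the question; whether the real inputs.tsv looks like this is an assumption, not something the thread confirms.

```shell
# A line separated by spaces, as it appears in the question's sample.
spaced_line='Harry_Potter    1'

# No tab on the line, so cut passes the whole line through.
with_cut=$(printf '%s\n' "$spaced_line" | cut -f1)

# awk's default separator is any run of blanks, so it still finds the first word.
with_awk=$(printf '%s\n' "$spaced_line" | awk '{print $1}')

# Count lines that contain at least one tab; 0 means the data is not tab-separated.
tab_lines=$(printf '%s\n' "$spaced_line" | awk -F'\t' 'NF > 1 { n++ } END { print n + 0 }')

printf '%s\n%s\n%s\n' "$with_cut" "$with_awk" "$tab_lines"
```

Running `od -c` on the first few lines of the file is another way to see which separator characters are actually present.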

Gilles Quénot