114

How do I select the first column from the TAB separated string?

# echo "LOAD_SETTLED    LOAD_INIT       2011-01-13 03:50:01" | awk -F'\t' '{print $1}'

The above will return the entire line and not just "LOAD_SETTLED" as expected.

Update:

I need to change the third column in the tab separated values. The following does not work.

echo $line | awk 'BEGIN { -v var="$mycol_new" FS = "[ \t]+" } ; { print $1 $2 var $4 $5 $6 $7 $8 $9 }' >> /pdump/temp.txt

This however works as expected if the separator is comma instead of tab.

echo $line | awk -v var="$mycol_new" -F'\t' '{print $1 "," $2 "," var "," $4 "," $5 "," $6 "," $7 "," $8 "," $9}' >> /pdump/temp.txt
shantanuo
  • 4
    awk 'BEGIN { FS = "[ \t]+" } ; { print $1 }' # this is what I was looking for. Is my google search correct? :) – shantanuo Mar 21 '11 at 05:50
  • 3
    Thanks to this comment, I have discovered: `awk 'BEGIN {FS="\t"}; {print $1,FS,$2,FS,$3}' myFile.txt` to print tab-delimited values of the first three columns. – Wok May 30 '13 at 09:33
  • 7
    Or perhaps simply `awk 'BEGIN {OFS="\t"}; {print $1,$2,$3}' ` – Josiah Yoder Jul 22 '15 at 03:07
  • 5
    Both GNU and BSD awk support `-v` for setting variables. It's ugly to use `BEGIN {FS="\t"}` inside an *inline program*, and any open source contribution you try to make like that is likely to be objected to. Only do that if you are writing a *program file*. Also, it is discouraged to use `-F` instead of `-v FS=` because the latter makes clear that only `FS` is being set and not `OFS`. Confusion about that last point is what caused this post in the first place. That's why "good style" is important. – Bruno Bronosky Apr 12 '18 at 17:35
  • 2
    Please, no one, ever, should do what @Wok demonstrated. You don't enumerate [Input] Field Separators in your Output. You specify an Output Field Separator via the `OFS` variable. – Bruno Bronosky Apr 12 '18 at 17:38

8 Answers

163

You need to set the OFS variable (output field separator) to be a tab:

echo "$line" | 
awk -v var="$mycol_new" -F'\t' 'BEGIN {OFS = FS} {$3 = var; print}'

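(Make sure you quote the $line variable in the echo statement. Unquoted, the shell splits it on whitespace and echo rejoins the words with single spaces, so the tabs are lost before awk sees them.)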
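As a rough illustration of both points (quoting and OFS), here is a minimal sketch. The sample record and the replacement value in `mycol_new` are made up for the demo, and a POSIX-style shell such as bash or dash is assumed.

```shell
# Hypothetical sample record; printf turns \t into real tab characters.
line=$(printf 'LOAD_SETTLED\tLOAD_INIT\t2011-01-13 03:50:01')
mycol_new='2011-01-14 00:00:00'   # made-up replacement value

# Unquoted $line: the shell splits on whitespace and echo rejoins with spaces,
# so awk no longer finds any tab and sees a single field.
unquoted_nf=$(echo $line | awk -F'\t' '{print NF}')

# Field assignment without OFS: the record is rebuilt with single spaces.
without_ofs=$(echo "$line" | awk -v var="$mycol_new" -F'\t' '{$3 = var; print}')

# Field assignment with OFS = FS: the record is rebuilt with tabs.
with_ofs=$(echo "$line" | awk -v var="$mycol_new" -F'\t' 'BEGIN {OFS = FS} {$3 = var; print}')

printf '%s\n' "$unquoted_nf" "$without_ofs" "$with_ofs"
```

The first value printed is the field count for the unquoted case; the next two lines differ only in the separator used when awk reassembles the record.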

glenn jackman
  • 7
    What is the purpose of the $ in $'\t'? – Amr Mostafa May 11 '13 at 11:55
  • 11
    Answering my own question from the [Advanced Bash Scripting Guide](http://tldp.org/LDP/abs/html/escapingsection.html): The $' ... ' quoted string-expansion construct is a mechanism that uses escaped octal or hex values ..., e.g., quote=$'\042'. – Amr Mostafa May 11 '13 at 12:34
  • 5
    @AmrMostafa, too bad that guide has a misleading explanation, leading one to think that the `$` in `$'\t'` is not needed. [Greg's wiki](http://mywiki.wooledge.org/Quotes) is better: "Of these, `$'...'` is the most common, and acts just like single quotes except that backslash-escaped combinations are expanded as specified by the ANSI C standard". – Cristian Ciupitu Jul 12 '14 at 20:38
  • 10
    In hindsight, the `$'\t'` is not necessary. awk understands the string `"\t"` to be a tab character – glenn jackman Oct 22 '15 at 11:50
  • `echo "LOAD_SETTLED LOAD_INIT 2011-01-13 03:50:01" | awk '{print $1}'` should this not be sufficient? – asadz Jan 11 '16 at 07:31
  • 10
    Open Source Contributors, I beg you, please don't submit stuff like `awk -F $'\t' 'BEGIN {OFS = FS} …'`. That should be `awk -v FS='\t' -v OFS='\t' '…'`. It may seem pedantic, but being inconsistent increases the chances that a later contributor will introduce a bug because they misunderstand your code. – Bruno Bronosky Apr 12 '18 at 17:44
  • This answer could use improvement. Lots of bash specific stuff going on, almost no explanation. It's not easy to distinguish what's the answer and what's unrelated – CervEd Jul 02 '21 at 12:39
23

Use:

awk -v FS='\t' -v OFS='\t' ...

Example from one of my scripts.

I use the FS and OFS variables to manipulate BIND zone files, which are tab delimited:

awk -v FS='\t' -v OFS='\t' \
    -v record_type="$record_type" \
    -v hostname="$hostname" \
    -v ip_address="$ip_address" '
$1==hostname && $3==record_type {$4=ip_address}
{print}
' "$zone_file" > "$temp"

This is a clean and easy to read way to do this.
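To try the same pattern without a real zone file, here is a self-contained sketch. The four-column layout (name, TTL, type, address), the host names, and the addresses are all invented for the demo and are a simplification of what an actual BIND zone file contains.

```shell
# Hypothetical, simplified zone data: name, TTL, record type, address.
zone_file=$(mktemp)
printf 'www\t3600\tA\t192.0.2.1\nmail\t3600\tA\t192.0.2.2\n' > "$zone_file"

# Made-up values to look for and to write.
hostname='www'
record_type='A'
ip_address='192.0.2.99'

# Only the matching row has its fourth field replaced; every row is printed.
updated=$(awk -v FS='\t' -v OFS='\t' \
    -v record_type="$record_type" \
    -v hostname="$hostname" \
    -v ip_address="$ip_address" '
$1==hostname && $3==record_type {$4=ip_address}
{print}
' "$zone_file")

printf '%s\n' "$updated"
rm -f "$zone_file"
```

Rows that do not match are printed untouched, so their original tabs survive regardless of OFS. OFS matters only for the row whose field was assigned.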

Mateen Ulhaq
Bruno Bronosky
23

Make sure they're really tabs! In bash, you can insert a literal tab by pressing Ctrl-V and then Tab.

$ echo "LOAD_SETTLED    LOAD_INIT       2011-01-13 03:50:01" | awk -F$'\t' '{print $1}'
LOAD_SETTLED
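One way to check is to count the fields awk sees when it splits on tabs only. This is a small sketch with two made-up sample strings, one built with real tabs via printf and one typed with runs of spaces.

```shell
# printf turns \t into a real tab, so this sample is tab-separated by construction.
with_tabs=$(printf 'LOAD_SETTLED\tLOAD_INIT\t2011-01-13 03:50:01')
# The same text, but with runs of spaces where the tabs should be.
with_spaces='LOAD_SETTLED    LOAD_INIT       2011-01-13 03:50:01'

# Count the fields awk finds when the separator is a tab.
tab_fields=$(printf '%s\n' "$with_tabs" | awk -F'\t' '{print NF}')
space_fields=$(printf '%s\n' "$with_spaces" | awk -F'\t' '{print NF}')

echo "input with tabs:   $tab_fields fields"
echo "input with spaces: $space_fields field(s)"
```

If the count comes back as 1, the input probably contains no tabs at all. Piping the line through `od -c` shows real tabs as `\t`, which is another quick way to confirm.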
Mahmoud Abdelkader
18

You can set the Field Separator:

... | awk 'BEGIN {FS="\t"}; {print $1}'

Excellent read:

https://docs.freebsd.org/info/gawk/gawk.info.Field_Separators.html

John Kloian
6
echo "LOAD_SETTLED    LOAD_INIT       2011-01-13 03:50:01" | awk -v var="test" 'BEGIN { FS = "[ \t]+" } ; { print $1 "\t" var "\t" $3 }'
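One thing to keep in mind with `FS = "[ \t]+"` is that it splits on spaces as well as tabs, so a field that itself contains a space (such as the timestamp) is cut in two. A short sketch, assuming the input really is tab-separated (the sample line is built with printf for that reason):

```shell
# Sample record with real tabs between the three columns.
line=$(printf 'LOAD_SETTLED\tLOAD_INIT\t2011-01-13 03:50:01')

# Splitting on runs of spaces or tabs: the timestamp becomes two fields.
loose_count=$(printf '%s\n' "$line" | awk 'BEGIN { FS = "[ \t]+" } { print NF }')
loose_third=$(printf '%s\n' "$line" | awk 'BEGIN { FS = "[ \t]+" } { print $3 }')

# Splitting on tabs only: the timestamp stays in one field.
strict_third=$(printf '%s\n' "$line" | awk 'BEGIN { FS = "\t" } { print $3 }')

printf '%s\n' "$loose_count" "$loose_third" "$strict_third"
```

Whether the looser separator is acceptable depends on whether any column can legitimately contain a space.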
shantanuo
1

If your fields are separated by tabs, this works for me on Linux.

awk -F'\t' '{print $1}' < tab_delimited_file.txt

I use this to process data generated by mysql, which generates tab-separated output in batch mode.

From awk man page:

   -F fs
   --field-separator fs
          Use fs for the input field separator (the value of the FS predefined
          variable).
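A practical reason to set the separator explicitly for this kind of data is that a tab separator keeps empty columns, while awk's default whitespace splitting collapses them. The row below is invented for the demo (it is not real mysql output) and has an empty second column.

```shell
# Hypothetical row with an empty middle column: name, (empty), number.
row=$(printf 'alice\t\t42')

# With -F'\t' every tab separates two fields, so the empty column is kept.
tab_nf=$(printf '%s\n' "$row" | awk -F'\t' '{print NF}')
tab_third=$(printf '%s\n' "$row" | awk -F'\t' '{print $3}')

# With the default separator, the two adjacent tabs count as one gap.
default_nf=$(printf '%s\n' "$row" | awk '{print NF}')
default_second=$(printf '%s\n' "$row" | awk '{print $2}')

printf '%s\n' "$tab_nf" "$tab_third" "$default_nf" "$default_second"
```

With the default separator the value 42 shifts into the second field, which silently misaligns every later column.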
arainchi
0
  • 1st column only

    awk NF=1 FS='\t'

 LOAD_SETTLED
  • First 3 columns

    awk NF=3 FS='\t' OFS='\t'

 LOAD_SETTLED    LOAD_INIT    2011-01-13
  • Except first 2 columns

    {g,n}awk NF=NF OFS= FS='^([^\t]+\t){2}'

    {m}awk NF=NF OFS= FS='^[^\t]+\t[^\t]+\t'

 2011-01-13    03:50:01
  • Last column only

    awk '($!NF=$NF)^_' FS='\t', or

    awk NF=NF OFS= FS='^.*\t'

 03:50:01
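For readers who find the compact forms above hard to parse, here is a more verbose sketch of two of the same ideas. It assumes the input has four tab-separated fields (date and time in separate columns, as the sample outputs suggest) and that your awk rebuilds the record when NF is assigned, as gawk and mawk do.

```shell
# Sample record, assumed to have four tab-separated fields.
line=$(printf 'LOAD_SETTLED\tLOAD_INIT\t2011-01-13\t03:50:01')

# First 3 columns: shrinking NF drops the trailing fields, and the record
# is rebuilt with OFS, so OFS must be a tab to keep the output tab-separated.
first_three=$(printf '%s\n' "$line" | awk -v FS='\t' -v OFS='\t' '{ NF = 3; print }')

# Last column only: $NF is the field whose index is the field count.
last_only=$(printf '%s\n' "$line" | awk -v FS='\t' '{ print $NF }')

printf '%s\n' "$first_three" "$last_only"
```

In the short forms, `NF=1` is the whole awk program (an assignment used as a pattern, which is true when the assigned value is non-zero), and `FS='\t'` is a command-line variable assignment rather than part of the program.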
RARE Kpop Manifesto
-3

Should this not work?

echo "LOAD_SETTLED    LOAD_INIT       2011-01-13 03:50:01" | awk '{print $1}'
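Default splitting does return the first column here, but only because that column happens to contain no spaces. A short sketch with a made-up record whose first column contains a space shows where the two approaches diverge:

```shell
# Hypothetical record: the first tab-separated column contains a space.
line=$(printf 'LOAD SETTLED\tLOAD_INIT\t2011-01-13 03:50:01')

# Default separator: any run of blanks splits, so only the first word is returned.
default_first=$(printf '%s\n' "$line" | awk '{print $1}')

# Tab separator: the whole first column is returned.
tab_first=$(printf '%s\n' "$line" | awk -F'\t' '{print $1}')

printf '%s\n' "$default_first" "$tab_first"
```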
Tim
asadz