126

I am constantly learning new tools, even old fashioned ones, because I like to use the right solution for the problem.

Nevertheless, I wonder if there is still any reason to learn some of them. awk for example is interesting to me, but for simple text processing, I can use grep, cut, sed, etc. while for complex ones, I'll go for Python.

Now I don't mean that it's not a powerful and handy tool. But since it takes time and energy to learn a new tool, is it worth it?

k0pernikus
Bite code
  • 9
    It's 2019 and I just rewrote a Python log normalizer in awk. I ran the transform on a week of log files totaling 54 million lines (~9 GB). On my MacBook Pro (2.8 GHz i7, 16 GB RAM), the Python version could process ~10k lines a second => 90 minutes of runtime. Using mawk, the run time dropped to 2 minutes. Btw, the awk program was half the size. – mistahenry Aug 16 '19 at 10:14
  • @mistahenry did you try pypy? – qwr Feb 24 '20 at 21:14
  • `RS` and `FS` (`-F`) being regular expressions in AWK. It's mind-blowing, once you figure out what that makes possible, especially for quick one-liners. Even `perlvar` has to [give it props](https://perldoc.perl.org/perlvar#$INPUT_RECORD_SEPARATOR). Also, the pattern-action paradigm means not having to `with open('filename', 'r') as f:`, `for line in f.readlines():`, and so on, so it cuts down on a _ton_ of file-handling code and nested conditionals, relative to Python especially. Re-write `| awk '$5<0.05'` in Python and compare, if you don't believe me. ;) – TheDudeAbides Jul 22 '23 at 13:50

26 Answers

121

If you quickly learn the basics of awk, you can indeed do amazing things on the command line.

But the real reason to learn awk is to have an excuse to read the superb book The AWK Programming Language by Aho, Kernighan, and Weinberger.

The AWK Programming Language at archive.org

You would think, from the name, that it simply teaches you awk. Actually, that is just the beginning. Launching into the vast array of problems that can be tackled once one is using a concise scripting language that makes string manipulation easy — and awk was one of the first — it proceeds to teach the reader how to implement a database, a parser, an interpreter, and (if memory serves me) a compiler for a small project-specific computer language! If only they had also programmed an example operating system using awk, the book would have been a fairly complete survey introduction to computer science!

Famously clear and concise, like the original C Language book, it also is a wonderful example of friendly technical writing done right. Even the index is a piece of craftsmanship.

Awk? If you know it, you'll use it at the command-line occasionally, but for anything larger you'll feel trapped, unable to access the wider features of your system and the Internet that something like Python provides access to. But the book? You'll always be glad you read it!

Brandon Rhodes
  • 6
    +1 Sold. I am going to order this book. I have used awk for years as a quick and powerful one-liner scripting language. Awk is a great pre-processor for files that would otherwise take a dozen lines to code. I cannot count how many times I have used the form: awk '{print $1, $2}' – galaxywatcher Jan 09 '10 at 09:21
  • 2
    Agreed. It almost defies belief how compact that book is given all that it contains. It covers more than most contemporary books in 1/10(?) the length. – clay Jun 25 '11 at 06:01
  • 3
    I am reading this book now and it has inflamed my enthusiasm for awk to a near obsession. – galaxywatcher Jan 28 '12 at 09:17
  • 3
    See also the excellent [Gawk: Effective AWK Programming](http://www.gnu.org/software/gawk/manual/). – lhf Sep 28 '12 at 13:04
  • 1
    I just read the first chapter. It is amazing. Mystery resolved. – vaichidrewar Sep 20 '15 at 21:04
  • How can I get a hold of that book? Seems difficult :/ (The AWK Programming Language) – Moberg Jan 21 '21 at 12:33
  • I just checked several popular used book sites, and many copies are available. Where did you look? (I would be more specific and helpful and name the sites I checked, but I'm not sure if I'm supposed to be advertising for specific book sites here in a comment?) – Brandon Rhodes Jan 22 '21 at 15:00
  • Ordered a copy off eBay just now! I will read anything with Kernighan as an author – BytePorter Jun 13 '21 at 15:06
  • 1
    The book can be downloaded from [The Internet Archive](https://ia903404.us.archive.org/0/items/pdfy-MgN0H1joIoDVoIC7/The_AWK_Programming_Language.pdf) – SSteve May 22 '22 at 16:26
105

I think it depends on the environment you find yourself in. If you are a *nix person, then knowing awk is a Good Thing. The only other scripting environment that can be found on virtually every *nix is sh. So while grep, sed, etc can surely replace awk on a modern mainstream linux distro, when you move to more exotic systems, knowing a little awk is going to be Real Handy.

awk can also be used for more than just text processing. For example, one of my supervisors writes astronomy code in awk; that is how utterly old school and awesome he is. Back in his day, it was the best tool for the job, and now, even though his students like me use Python and whatnot, he sticks to what he knows and what works well.

In closing, there is a lot of old code kicking around the world, so knowing a little awk isn't going to hurt. It will also make you a better *nix person :-)

freespace
  • 12
    ++ Agreed, awk really is one of the most portable, and importantly, consistent tools in the *nix toolset. It works reliably on busybox, for instance, where perl is nowhere to be found. – guns Mar 31 '09 at 21:49
  • 1
    And it's really not that hard to learn either if you're used to curly brace languages – guns Mar 31 '09 at 21:50
  • It's the same in any environment, unlike Regex which has a different syntax in every program (and two very different syntaxes in Visual C# Studio)... – Mark K Cowan Jun 02 '14 at 23:48
  • 2
    "It's the same in any environment" - not quite: under Windows single quotes have to be replaced with double,s and internal doubles have to be escaped. (Windows is kind of a real environment, even if exposing yourself to Redmond's insecure half-finished atrocimacy puts you at the mercy of any Russian 15 year old). – GT. Jan 04 '15 at 23:26
  • 7
    I don't think many people associate the existence of awk and windows in the same universe.....:P – FoldedChromatin Oct 02 '15 at 13:25
  • 2
    Still using awk for text processing jobs. I will often start a script in something else (ruby, python) and end up going back to awk for the simplicity and power. – Rumbleweed Jun 17 '16 at 17:01
  • what is a *nix person? – Student May 11 '19 at 16:01
32

The only reason I use awk is the auto-splitting:

awk '{print $3}' < file.in

This prints the third whitespace-delimited field in file.in. It's a bit easier than:

tr -s ' ' < file.in | cut -d' ' -f3
Greg Hewgill
27

I think awk is great if your file contains columns/fields. I use it when processing/analyzing a particular column in a multicolumn file. Or if I want to add/delete a particular column(s).

e.g.

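awk -F '\t' '{ if ($2 > $3) print; }' <filename>

prints a line only if the 2nd column value in a tab-separated file is greater than the 3rd column value. (The `\t` has to be quoted; unquoted, the shell strips the backslash and awk splits on the letter "t".)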

Of course I could use Perl or Python, but awk makes it so much simpler with a concise single line command.

Also learning awk is pretty low-cost. You can learn awk basics in less than an hour, so it's not as much effort as learning any other programming/scripting language.

Nikhil
10

6 years after asking this question I can now answer with certainty: no, learning awk is not worth it.

Basic tasks are easily handled by basic bash commands, or even GUI tools. More complex tasks are easily tackled with modern dynamic languages such as Python (a favorite of mine) or Ruby.

You should definitely learn a modern dynamic scripting language, as it will help you with so many tasks (web, admin, data crunching, automation, etc.). Once you have done so, learning a tool such as awk is completely useless; it will save you at best a few seconds every month.

sigjuice
Bite code
  • 3
    Not necessarily true. If you're parsing really large files, it could be much faster than other tools. – user1071847 Oct 03 '18 at 12:39
  • 2
    Interesting because a few years after this you are still asking questions about awk. I was one of the original responders and still use it with some regularity to this day – Dexygen Dec 29 '19 at 10:34
  • 1
    I still write scripts for DNA sequencing analysis using awk. If I need to do statistics, I switch to R (with processed data files). Haven't really felt the need to use python. Perhaps it is my reluctance to switch (and comfort with awk) but I started using awk in the first place because python was taking ages to parse something that happened within minutes in awk. – WYSIWYG Jan 18 '22 at 18:53
9

I do use awk every so often. It's good for very simple text shuffling in the middle of a pipeline; it fills a very narrow niche right between not needing it at all and needing to whip out Perl/Python/whatever.

I wouldn't advise you spend a whole lot of time on it, but it might come in handy to know the basics of the syntax -- at least enough that you can consult the manual quickly should you ever want to use it.

Eevee
9

I use AWK occasionally for dealing with HTML. For instance, this code translates tables to csv files:

BEGIN {s=""; FS="\n"}
/<td/ { gsub(/<[^>]*>/, ""); s=(s ", " $1);}
/<tr|<TR/ { print s; s="" }

Which is great if you're screen scraping. Actually, it might be the case that I love AWK because it allows me to build the wrong solution to problems so quickly :) more examples. It's also mentioned in Jon Bentley's lovely Programming Pearls.

Dave
8

Learning AWK was invaluable for me in my last contract working on an embedded Linux system on which neither Perl nor most other scripting languages were installed.

Dexygen
6

If you already know and use sed, you might as well pick up at least a bit of awk. They can be piped together for some pretty powerful tricks. Always impresses the audience.

Internet Friend
6

Most awk one liners can be achieved with Perl one liners - if you choose to get into a Perl one liner mindset. Or, just use Perl three liners :)

If you're maintaining shell scripts written by someone who liked awk, then clearly, you're going to need to learn awk.

Even if there's no practical need, if you already know regex it won't take long to pick up the basics, and it's fun to see how things were designed back then. It's rather elegant.

slim
5

Computerworld recently did an interview with Alfred V. Aho (one of the three creators of AWK) about AWK. It's quite an interesting read, so maybe you'll find some hints in it as to why it's a good idea to learn AWK.

dlat
  • Nice, but it did not convince me. AWK is a very good tool, but I think I will never need it enough to take the time to learn it instead of hacking my solution in sed or Python. – Bite code Sep 26 '08 at 12:56
3

It's useful mostly if you occasionally have to parse log files or program output for data while shell scripting, because what is very easy to achieve in awk would take a few more lines of code in Python.

It certainly has more power than that, but these seem to be the tasks most people use it for.
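
As a rough illustration of that kind of log parsing, here is a minimal sketch. The log layout (method, path, status, response time in ms) and the sample lines are made up for the example and do not come from any real program.

```shell
# Hypothetical log: "METHOD PATH STATUS MILLISECONDS", one request per line.
log='GET /index 200 12
GET /api 500 340
POST /api 200 95
GET /api 500 410'

# Count server errors (status >= 500) and average their response time.
# The "if (n)" guard avoids dividing by zero when nothing matches.
summary=$(printf '%s\n' "$log" |
  awk '$3 >= 500 { n++; total += $4 }
       END { if (n) printf "%d errors, avg %d ms\n", n, total / n }')

printf '%s\n' "$summary"
```

The pattern (`$3 >= 500`) selects lines and the action accumulates, so no explicit loop or file handling is needed.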

NeuroSys
3

awk has a very good utility-to-difficulty ratio, and "simple awk" works on every Unix/Linux/macOS system (and it can be installed on other systems too).

It was designed in a golden age when people hated typing, so scripts can be very, very short and fast to write. I will try to install mawk, a fast implementation that allegedly speeds up computation about 9 times. awk/gawk is rather slow, so if you want to use it instead of R etc., you may want mawk.

2

Nope.

Even though it might be interesting, you can do everything that awk can do using other, more powerful tools such as Perl.

Spend your time learning those more powerful tools - and only incidentally pick up some awk along the way.

Ed Guiness
2

Of course: I'm working in an environment where the only available languages are (some shitty language which generates COBOL, OMG, OMG), bash (an old version), perl (I haven't mastered it yet), sed, awk, and some other command-line utilities. Knowing awk has saved me several hours (and has generated several text-processing requests from my colleagues, who come to me at least three times a day).

Zsolt Botykai
1

I'd say it's probably not worth it anymore. I use it from time to time as a much more versatile stream editor than sed with searching abilities included, but if you are proficient with python I do not know a task which you would be able to finish that much faster to compensate for the time needed to learn awk.

The following command is probably the only one for which I've used awk in the last two years (it purges half-removed packages from my Debian/Ubuntu systems):

$ dpkg -l|awk '/^rc/ {print $2}'|xargs sudo dpkg -P
Matthias Kestenholz
1

I was recently trying to visualize network pcap files logging a DoS attack, which amounted to over 20 GB. I needed the timestamps and the IP addresses. In my scenario, an AWK one-liner worked fabulously and pretty fast as well. I specifically used AWK to clean the extracted files and to get the IP addresses and the total packet count from those IP addresses within grouped spans of time. I totally agree with what other people have written above: it depends on your needs.
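
A minimal sketch of what such a grouping one-liner might look like. The input format (epoch seconds followed by a source IP, one packet per line) and the 60-second window are assumptions for illustration, not the actual data or command used above.

```shell
# Hypothetical extract: "epoch_seconds source_ip", one packet per line.
packets='100 10.0.0.1
101 10.0.0.2
105 10.0.0.1
161 10.0.0.1'

# Round each timestamp down to a 60-second bucket and count packets
# per (bucket, IP) pair in an associative array.
counts=$(printf '%s\n' "$packets" |
  awk '{ bucket = int($1 / 60) * 60; n[bucket " " $2]++ }
       END { for (k in n) print k, n[k] }' |
  LC_ALL=C sort -k1,1n -k2,2)

printf '%s\n' "$counts"
```

The `sort` at the end is there because `for (k in n)` visits keys in an unspecified order.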

Ash Catchem
1

I'd say there is. For simple stuff, AWK is a lot easier on the inexperienced sysadmin / developer than Python. You can learn a little AWK and do a lot of things, whereas learning Python means learning a whole new language (yes, I know AWK is a language in a sense too).

Perl might be able to do a lot of the things AWK can do, but offered the choice in this day and age I would choose Python here. So yes, you should learn AWK, but learn Python too :-)

wzzrd
1

awk is a powertool language, so you are likely going to find awk being used somewhere if you are an IT professional of any sort. If you can handle the syntax and regular expressions of grep and sed then you should have no problem picking up awk and it is probably worthwhile to.

Where I've found awk really shines is in simplifying things like processing multi-line records and mangling/interpolating multiple files simultaneously.
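
Two small sketches of those idioms, assuming a POSIX awk. The sample records and the file names `users.txt` and `events.txt` are hypothetical.

```shell
# Multi-line records: RS="" switches awk to paragraph mode, where
# blank lines separate records; FS="\n" makes each line a field.
records='name: alice
role: admin

name: bob
role: guest'

flat=$(printf '%s\n' "$records" |
  awk 'BEGIN { RS = ""; FS = "\n" } { print $1 " | " $2 }')
printf '%s\n' "$flat"

# Multiple files: NR == FNR is only true while reading the first file,
# so it is loaded into a lookup table that the second file consults.
tmp=$(mktemp -d)
printf '1 alice\n2 bob\n' > "$tmp/users.txt"
printf '2 login\n1 logout\n2 logout\n' > "$tmp/events.txt"

joined=$(awk 'NR == FNR { name[$1] = $2; next }
              { print name[$1], $2 }' "$tmp/users.txt" "$tmp/events.txt")
printf '%s\n' "$joined"
rm -r "$tmp"
```

The `NR == FNR` trick relies on the first file being non-empty; with an empty lookup file, the second file would be treated as the first.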

1

One reason NOT to learn awk is that it doesn't have non-greedy matches in regular expressions.

I have some awk code that I now have to rewrite, only because I discovered while debugging that there is no such thing as non-greedy matching in awk/gawk, so it can't properly execute some regexes.
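
For what it's worth, when the delimiter is a single character, a negated character class can often stand in for a non-greedy match. The sketch below uses a made-up input line and is a partial workaround under that assumption, not a general replacement; multi-character delimiters need more work.

```shell
# Hypothetical input with two quoted strings.
line='say "one" then "two"'

# awk regexes match leftmost-longest, so ".*" runs to the LAST quote.
greedy=$(printf '%s\n' "$line" |
  awk 'match($0, /".*"/) { print substr($0, RSTART, RLENGTH) }')

# [^"]* cannot cross a quote, so the match stops at the FIRST closing one.
lazy=$(printf '%s\n' "$line" |
  awk 'match($0, /"[^"]*"/) { print substr($0, RSTART, RLENGTH) }')

printf '%s\n%s\n' "$greedy" "$lazy"
```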

user619271
1

It depends on your teammates, your leader, and the task you are working on.

if( team mates and leader ask to write awk ){
  if( you can reject that){
    if( awk code is very small){
      learn a little, just as you would learn regex
    }else{
      use python or even java
    }
  }else{
    do as they ask
  }
}
Kenneth
0

Now that Perl is ported to pretty much every significant platform, I'd say it's not worth it. Perl is more versatile than sed and awk together. As for auto-splitting, you can do it in Perl like this:

perl -F':' -ane 'print $F[3],"\n";' /etc/passwd

EDIT: you might still want to get somewhat acquainted with awk, because some other tools are based on its philosophy of pattern-based actions (e.g. DTrace on Solaris).

zvrba
0

I work in an area where the files are in column format, so awk is invaluable to me for reformatting files so that different software can work together. For a non-IT professional, awk is enough and works perfectly. Nowadays computer speed is not an issue, so I can combine awk and Unix pipes to chain many one-liner commands into a "script". Because awk searches by field and record, I use it to check file data very quickly instead of opening the file in vi. I have to say awk's capabilities have brought joy to my job, especially since I am able to help co-workers sort things out quickly with it. It is amazing to me.

0

I have been doing some coding in python at present. But I still do not know it well enough to use easily for simple one off file transformations.

With awk I can quickly develop a one-line piece of code on the Unix command line that does some pretty swish transformations. Every time I use awk, the piece of code I write is disposable and no more than a few lines long, with maybe an "if" statement and a "printf" statement here or there on the one line.

I have never written a piece of code that is more than 10 lines long with awk. I saw some such scripts years ago.

But anything that required many lines of code, I would resort to python.

I love awk. It is a very powerful tool in combination with sed.

0

if you care anything about speed, but don't wanna be dealing with C/C++ or assembly, you go for awk, specifically, mawk 1.9.9.6.

It also lacks perl's ugly syntax, python3's feature bloat, javascript's annoying UTF16 setup, or C's memory-pointer pitfall traps

Most of the time, for the implementation of the same pseudo-codes, awk only loses against specialized vectorized instructions, like AVX/SSE

RARE Kpop Manifesto
0

IMO it’s a tool that has enough features to get things done. In most cases in IT, do you really need more?

Simple rule I learned from others

You should never use C if you can do it with a script. You should never use a script or scripting language if you can do it with awk. Never use awk if you can do it with sed. Never use sed if you can do it with grep.

Nixbytes