What are the main differences among them? And in which typical scenarios is it better to use each language?

-
These types of so-called un-constructive questions are really helpful. – Steam Aug 15 '13 at 00:47
-
Sure, a tab on the front page to find them would be handy... – Nov 30 '13 at 15:08
-
For usefulness of python on the command line, see pyp – Neil McGuigan Jul 06 '16 at 21:26
5 Answers
In order of appearance, the languages are `sed`, `awk`, `perl`, `python`.
The `sed` program is a stream editor and is designed to apply the actions from a script to each line (or, more generally, to specified ranges of lines) of the input file or files. Its language is based on `ed`, the Unix editor, and although it has conditionals and so on, it is hard to work with for complex tasks. You can work minor miracles with it - but at a cost to the hair on your head. However, it is probably the fastest of the programs when attempting tasks within its remit. (It has the least powerful regular expressions of the programs discussed - adequate for many purposes, but certainly not PCRE - Perl-Compatible Regular Expressions.)
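To make the "apply an action to each line, or to a range of lines" model concrete, here is a minimal sketch in Python rather than in `sed` itself. The helper name `sed_substitute` and the sample text are made up for illustration, and Python's `re` syntax is not the same as sed's regular expressions, so treat it as an analogy to something like `sed '2,3s/this/that/'`, not a reimplementation.

```python
import re
from typing import Iterable, Iterator, Optional


def sed_substitute(lines: Iterable[str], pattern: str, replacement: str,
                   first: int = 1, last: Optional[int] = None) -> Iterator[str]:
    """Apply one substitution to each line in the 1-based range [first, last]."""
    regex = re.compile(pattern)
    for number, line in enumerate(lines, start=1):
        in_range = number >= first and (last is None or number <= last)
        # Replace only the first match on the line (no "g" flag equivalent).
        yield regex.sub(replacement, line, count=1) if in_range else line


# Hypothetical sample input; lines outside the range pass through untouched.
text = ["this is line one", "this is line two", "this is line three"]
result = list(sed_substitute(text, r"this", "that", first=2, last=3))
print("\n".join(result))
```

Because the function is a generator, it keeps nothing from one line to the next, which mirrors why this style of processing stays cheap on large inputs.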
The `awk` program (name from the initials of its authors - Aho, Weinberger, and Kernighan) is a tool initially for formatting reports. It can be used as a souped-up `sed`; in its more recent versions, it is computationally complete. It uses an interesting idea - the program is based on 'patterns matched' and 'actions taken when the pattern matches'. The patterns are fairly powerful (Extended Regular Expressions). The language for the actions is similar to C. One of the key features of `awk` is that it splits the input automatically into records and each record into fields.
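The pattern/action idea and the automatic record and field splitting can be sketched in a few lines of Python. This is only an illustration of the model: `run_rules`, the `Rule` type and the sample data are invented names, and real `awk` is more forgiving (for example, it treats non-numeric fields as 0 instead of raising an error).

```python
from typing import Callable, Iterable, List, Tuple

Fields = List[str]
# A rule pairs a "pattern" (a predicate on the fields) with an "action".
Rule = Tuple[Callable[[Fields], bool], Callable[[Fields], None]]


def run_rules(lines: Iterable[str], rules: List[Rule]) -> None:
    """Treat each line as a record, split it into fields, run matching rules."""
    for line in lines:
        fields = line.split()  # whitespace splitting, like awk's default
        for pattern, action in rules:
            if pattern(fields):
                action(fields)


matched: List[str] = []
rules: List[Rule] = [
    # Roughly the intent of: awk '$3 > 100 { print $1 }'
    (lambda f: len(f) >= 3 and float(f[2]) > 100,
     lambda f: matched.append(f[0])),
]

# Hypothetical sample records.
data = ["alice 10 250", "bob 20 50", "carol 30 101"]
run_rules(data, rules)
print(matched)
```

The explicit loop and the `split()` call are exactly the parts that `awk` gives you for free, which is much of its appeal for one-liners.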
Perl was written in part as an awk-killer and sed-killer. Two of the programs provided with it are `a2p` and `s2p` for converting `awk` scripts and `sed` scripts into Perl. Perl is one of the earliest of the next generation of scripting languages (Tcl/Tk can probably claim primacy). It has powerful integrated regular expression handling with a vastly more powerful language. It provides access to almost all system calls and has the extensibility of the CPAN modules. (Neither `awk` nor `sed` is extensible.) One of Perl's mottos is "TMTOWTDI - There's more than one way to do it" (pronounced "tim-toady"). Perl has 'objects', but it is more of an add-on than a fundamental part of the language.
Python was written last, and probably in part as a reaction to Perl. It has some interesting syntactic ideas (indenting to indicate levels - no braces or equivalents). It is more fundamentally object-oriented than Perl; it is just as extensible as Perl.
OK - when to use each?
- Sed - when you need to do simple text transforms on files.
- Awk - when you only need simple formatting and summarisation or transformation of data.
- Perl - for almost any task, but especially when the task needs complex regular expressions.
- Python - for the same tasks that you could use Perl for.
I'm not aware of anything that Perl can do that Python can't, nor vice versa. The choice between the two would depend on other factors. I learned Perl before there was a Python, so I tend to use it. Python has less accreted syntax and is generally somewhat simpler to learn. Perl 6, when it becomes available, will be a fascinating development.
(Note that the 'overviews' of Perl and Python, in particular, are woefully incomplete; whole books could be written on the topic.)

-
I don't think Python was a reaction to Perl. My understanding is that it started life as a scripting language for Amoeba (a unix-ish research O/S) and was pretty much independent. – ConcernedOfTunbridgeWells Dec 14 '08 at 23:29
-
I agree with NXC, aside from regular expressions there is little similarity between Python and Perl and nothing I have seen to suggest any real relation, inspiration, etc. I see a much closer connection between Ruby and Perl than Python and Perl. – Robert Gamble Dec 14 '08 at 23:49
-
@NXC and Robert Gamble: my intention was to indicate that Python was independent of Perl. I'm not sure how much Guido van Rossum knew about Perl as he was designing Python, but perhaps there's a case for saying that where there were two choices and Perl had taken option A, then Python took option B. – Jonathan Leffler Dec 15 '08 at 03:51
-
Lol Python and Perl have nothing in common. They share a lot of common features like any other languages, and then don't. – Matt Joiner Feb 20 '10 at 01:50
-
note the zen of python is basically the antithesis of TMTOWTDI so i would say it could be a reaction to perl. iirc TCL was slightly after perl and is also fairly reactionary against perl, though TCLs reaction is in syntax and language complexity, not ways to do things – jk. Apr 13 '10 at 13:42
-
@jk: Tcl/Tk was under development in 1987 and first released in 1988; Perl 1.000 was released in December 1987. I don't think Tcl/Tk was a reaction to Perl - it was an independent invention. Python was started in 1989. Perl may not have had much influence on the basics of Python (or Tcl/Tk) after all - except to the extent that any language developments are aware of the existence of other languages (C++, Java, C#, ...). – Jonathan Leffler Apr 13 '10 at 14:58
-
Whatever the original intentions, it's clear that later Python development and the python community have preferred readability and consistency over Perl's more flexible but terse syntax. Excellent post Jonathan – Martin Beckett May 25 '10 at 14:59
-
@JonathanLeffler - If its only sed vs awk, which one should i learn ? should i do both ? – Steam Aug 20 '13 at 18:27
-
@blasto: If you can, learn both; they both still have their uses as far as I'm concerned, and somewhat different uses. There are things that are easy in `sed` that are not so easy in `awk`, and many more things that are doable in `awk` that aren't doable in `sed`. Apparently, [`sed` is Turing complete](http://www.catonmat.net/blog/proof-that-sed-is-turing-complete/), but that doesn't make it easy to use as a general purpose language. It depends in part on what work you need to do more — editing to transform data (`sed`) or summarizing and formatting data (`awk`). Learning both is best, though. – Jonathan Leffler Aug 20 '13 at 19:12
-
@JonathanLeffler - What would be most useful to an ETL developer ? ETL or Extract Transform and Load is a data-warehousing term. Put crudely, the job involves EXTRACTION of data from different disparate sources (such as DB's, excel files, csv files etc), TRANSFORMATION of the same and then LOADING into a datawarehouse (DW) for analysis, finding patterns in data, or just historical records. eg. End use of a DW - Algorithms applied to a DW of a grocery store which has data from the past 10 years might reveal that people who tend to buy apples also buy oranges or something similar. – Steam Aug 20 '13 at 19:54
-
A link says - http://www.vectorsite.net/tsawk_1.html - "There are, however, things that Awk is not. It is not really well suited for extremely large, complicated tasks." Now, how "large" a task can awk handle ? How does sed compare in that respect ? – Steam Aug 20 '13 at 20:08
-
@blasto: For ETL, I'd prioritize `awk` over `sed` for learning (though both still have their uses). As to size of task: `sed` is at its finest when it processes one line at a time, with no storage from line to line. `awk` is often used to build up associative arrays with data accumulated from all the sources; it uses more memory, and is therefore much more likely to run into problems with large data sets than `sed` is. I've not heard of `tsawk` before you linked to it. I tend to fall back on Perl (but you might do better with Python) when a task is too much for `awk`. – Jonathan Leffler Aug 20 '13 at 20:11
-
Some slimmed down versions of unix (on routers for example) might not have python or even perl installed. Portability to such systems might be a reason to use something more primitive. – Nov 30 '13 at 15:06
-
Nice post. However, I'd mention this page ( https://swtch.com/~rsc/regexp/regexp1.html ) when talking about using regexps with perl. Perl's regexps are more flexible, and can be much more legible, but awk & grep (& sed?) are infinitely faster for some use cases (and as fast for the others) than their perl/ruby/python equivalents. – Olivier Dulac Feb 12 '16 at 16:33
After mastering a few dozen languages, one gets tired of absolute recommendations against tools, like in this answer regarding `sed` and `awk`.
Sed is the best tool for extremely simple command-line pipelines. In the hands of a sed master, it's suitable for one-offs of arbitrary complexity, but it should not be used in production code except in very simple substitution pipelines. Stuff like 's/this/that/.'
Gawk (the GNU awk) is by far the best choice for complex data reformatting when there is only a single input source and a single output (or, multiple outputs sequentially written). Since a great deal of real-world work conforms to this description, and a good programmer can learn gawk in two hours, it is the best choice. On this planet, simpler and faster is better!
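For comparison, here is a hedged sketch of the kind of single-input, single-output reformatting job described above, written in Python. In awk the same idea is commonly written along the lines of `{ total[$1] += $2 } END { for (k in total) print k, total[k] }`. The function name `summarise`, the comma-separated layout and the sample rows are assumptions made up for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def summarise(lines: Iterable[str]) -> List[str]:
    """Sum the second column per key in the first column, one report line per key."""
    totals: Dict[str, float] = defaultdict(float)
    for line in lines:
        fields = line.split(",")
        if len(fields) < 2:
            continue  # skip blank or malformed records
        totals[fields[0].strip()] += float(fields[1])
    # Sorted keys give a stable report order.
    return [f"{key}\t{total:.2f}" for key, total in sorted(totals.items())]


# Hypothetical input records: item,amount
report = summarise(["apples,1.50", "oranges,2.25", "apples,0.50"])
print("\n".join(report))
```

The Python version is longer than the awk one-liner, which is the trade-off this answer is pointing at: for simple, linear reformatting the shorter tool is often enough.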
Perl or Python are far better than any version of awk or sed when you have very complex input/output scenarios. The more complex the problem is, the better off you are using python, from a maintenance and readability standpoint. Note, however, that a good programmer can write readable code in any language, and a bad programmer can write unmaintainable crap in any useful language, so the choice of perl or python can safely be left to the preferences of the programmer if said programmer is skilled and clever.

-
100% agreed. Knowing most, if not all the tools AND when to use each is what distinguishes a good technician from a mediocre one. – ata Oct 19 '11 at 11:33
-
I will add that another reason to choose Python or Perl instead of awk is when your transformation requirements involve complex validation or logic for which another language has an existing, robust module. Think about what it would take to properly handle e.g. email or street addresses in awk and you'll see what I mean: perl and python have libraries that make things like this trivial, in awk these are uncommon or unavailable. – sorpigal Jan 30 '12 at 14:27
-
Actually as Perl was designed to encompass both Sed and Awk; I find it easier to just write it in Perl, rather than learning Sed or Awk. – Brad Gilbert Oct 19 '13 at 17:11
-
@BradGilbert: like I just mentioned in the top answer, a caveat of Perl (& Python, ruby, etc) over awk is that some kinds of regexp are reaaaaaaaaaally slower in the former: https://swtch.com/~rsc/regexp/regexp1.html – Olivier Dulac Feb 12 '16 at 16:35
-
@OlivierDulac Yes that shows a pathologic case. If you change from `a?ⁿaⁿ` to `a??ⁿaⁿ` then run that in Perl 5 with an `ⁿ` of 1,000,000 it runs in less than two seconds. `time perl -E '$x=1_000_000;$_="a"x$x;$m=("a??"x$x).("a"x$x);say $_=~$m'` If you run the naive one it takes more than two seconds for an `ⁿ` of just 25. The thing you have to realize is Perl has more regex features than those faster ones including allowing you to have Perl code inside of the regex that alters what it matches. You could implement a module that swaps the built-in for one of those others if you want. – Brad Gilbert Feb 12 '16 at 17:38
I wouldn't call sed a fully-fledged programming language, it is a stream editor with language constructs aimed at editing text files programmatically.
Awk is a little more of a general purpose language but it is still best suited for text processing.
Perl and Python are fully fledged, general purpose programming languages. Perl has its roots in text processing and has a number of awk-like constructs (there is even an awk-to-perl script floating around on the net). There are many differences between Perl and Python, your best bet is probably to read the summaries of both languages on something like Wikipedia to get a good grasp on what they are.

-
I've seen a sed implementation of Sokoban, which would imply Turing Completeness. However, that can also be said of sendmail.cf and TeX. – ConcernedOfTunbridgeWells Dec 14 '08 at 23:30
-
I worked with a guy once who wrote PostScript to turn a laser printer into a router. – Sam Kington Dec 15 '08 at 04:09
-
@Sam: Wow! I didn't know a printer's laser could be cranked up enough to cut wood! Oh, sorry, wrong kind of router. – Dennis Williamson Feb 20 '10 at 04:17
-
sed, not a full-fledged language? Well, that's not entirely true, as [sed is turing complete](http://www.catonmat.net/blog/proof-that-sed-is-turing-complete/) ;) – bernard paulus Feb 20 '13 at 23:05
-
The Awk to Perl script comes with Perl [(a2p)](http://perldoc.perl.org/a2p.html), so does the Sed to Perl script [(s2p)](http://perldoc.perl.org/s2p.html). – Brad Gilbert Oct 19 '13 at 17:03
-
I've seen an implementation of the forth language in awk. (Since awk can be regarded as a parser by its own right, it is rather straightforward to implement an interpreter in it). – Tatjana Heuser Oct 11 '14 at 22:15
First, there are two unrelated things in the list "Perl, Python awk and sed".
Thing 1 - simplistic text manipulation tools.
sed. It has a fixed, relatively simple scope of work defined by the idea of reading and examining each line of a file. sed is not designed to be particularly readable. It is designed to be very small and very efficient on very tiny unix servers.
awk. It has a slightly less fixed, less simple scope of work. However, the main loop of an awk program is defined by the implicit reading of lines of a source file.
These are not "complete" programming languages. While you can -- with some work -- write fairly sophisticated programs in awk, it rapidly gets complicated and difficult to read.
Thing 2 - general-purpose programming languages. These have a rich variety of statement types, numerous built-in data structures, and no wired-in assumptions or shortcuts to speak of.
Perl.
Python.
When to use them.
sed. Never. It really doesn't have any value in the modern era of computers with more than 32K of memory. Perl or Python do the same things more clearly.
awk. Never. Like sed, it reflects an earlier era of computing. Rather than maintain this language (in addition to all the others required for a successful system), it's more pleasant to simply do everything in one pleasant language.
Perl. Any programming problem of any kind. If you like free-thinking syntax, where there are many, many ways to do the same thing, perl is fun.
Python. Any programming problem of any kind. If you like fairly limited syntax, where there are fewer choices, less subtlety, and (perhaps) more clarity. Python's object-oriented nature makes it more suitable for large, complex problems.
Background -- I'm not bashing sed and awk out of ignorance. I learned awk over 20 years ago. Did many things with it; used to teach it as a core unix skill. I learned Perl about 15 years ago. Did many sophisticated things with it. I've left both behind because I can do the same things in Python -- and it is simpler and more clear.
There are two serious problems with sed and awk, neither of which is their age.
- The incompleteness of their implementation. Everything sed and awk do can be done in Python or Perl, often more simply and sometimes faster, too. A shell pipeline has some performance advantages because of its multi-processing. Python offers a `subprocess` module to allow me to recover those advantages.
- The need to learn yet another language. By doing things in Python (or Perl) your implementation depends on fewer languages, with a resulting increase in clarity.
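As a rough illustration of the `subprocess` point, the sketch below connects two child processes with a pipe, the way a shell does for `producer | consumer`, so both stages run concurrently. The two stages are tiny inline Python scripts standing in for real tools; they are placeholders chosen so that the example runs anywhere Python does, not part of the original answer.

```python
import subprocess
import sys

# Stand-ins for real pipeline stages (assumed for the example):
# the producer prints the numbers 0..4, the consumer sums its stdin.
producer_code = "for i in range(5): print(i)"
consumer_code = "import sys; print(sum(int(line) for line in sys.stdin))"

producer = subprocess.Popen(
    [sys.executable, "-c", producer_code],
    stdout=subprocess.PIPE,
)
consumer = subprocess.Popen(
    [sys.executable, "-c", consumer_code],
    stdin=producer.stdout,
    stdout=subprocess.PIPE,
    text=True,
)
# Close our copy of the pipe so the producer sees a broken pipe
# if the consumer exits early.
producer.stdout.close()

output, _ = consumer.communicate()
producer.wait()
total = int(output.strip())
print(total)
```

Swapping the inline scripts for real commands only changes the argument lists passed to `Popen`; the wiring of the pipe stays the same.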

-
Some pretty fatuous arguments against awk/sed. The adjustable wrench has not supplanted the open spanner for the same reason sed and awk still ship. Sometimes the simple tool is the best for the job. I write a lot of perl, but for a simple chain of piped commands, awk/sed are quicker than perl -e – RET Dec 14 '08 at 23:19
-
You can't assume availability of anything but sh, sed and awk on most non-linux unix systems. If you want something to work on an out-of-the-box Solaris, HP/UX or AIX install, you're stuck with sed and awk. – ConcernedOfTunbridgeWells Dec 14 '08 at 23:32
-
@NXC: not really. Perl and Python are available from the vendors. For example, see http://www-03.ibm.com/systems/p/os/aix/linux/ – S.Lott Dec 14 '08 at 23:59
-
@RET: "for a simple chain of piped commands" -- sometimes it's quicker to replace the pipe with a simple Python program. – S.Lott Dec 15 '08 at 00:41
-
@RET: Tried to provide my justification -- been using awk and perl for decades -- I'm not sure what more I can provide for evidence other than my experience. – S.Lott Dec 15 '08 at 02:12
-
Half of my shell scripts use either sed or awk. They are far from dead. Python is my preferred scripting language, but sometimes sed and awk are the best tool for the job. Just because they have been in use for many years, does not mean they are obsolete. – Jeremy Cantrell Dec 15 '08 at 04:01
-
@S.Lott: I'm not suggesting that anyone should attempt to build a web-app in awk, but to say they should never be used is a bit outrageous. For a simple s&r and/or tweak (especially to a delimited text file), perl -e or python -c is never going to be as efficient as a sed/awk one-liner. – RET Dec 15 '08 at 05:50
-
@Jeremy Cantrell: I don't believe I said age was the issue. I believe I said incompleteness was the issue. I'll update the answer to emphasize that. – S.Lott Dec 15 '08 at 10:54
-
@RET: I'm taking a strong position for a reason. They should be looked at as different from perl and python; unrelated. They don't solve a problem I ever have anymore since I started using Python for all scripting. – S.Lott Dec 15 '08 at 10:55
-
I fully support this answer! I've used sed, Perl and Python heavily. Let sed have peace in its coffin. – Matt Joiner Feb 20 '10 at 02:01
-
I don't like answers like this. Sed and awk are easy to understand in a few hours and much more lightweight and widely available than a full fledged language. Shell programming is as relevant as ever. – ata Oct 19 '11 at 11:29
-
@Juaco: "Sed and awk are easy to understand" and they clutter up my limited brain space with yet more syntax and yet more semantic rules. They may be "simple", but it's just two more languages that don't add significant value. "widely available"? Python is available everywhere but Windows by default. Same as sed and awk. The answer never mentioned shell programming. "Never" is an important word. It makes people think about exceptions and special cases that can never be fully enumerated. – S.Lott Oct 19 '11 at 11:45
-
Not going to continue arguing, I believe "subjective" stuff really isn't a goal here. And this conversation is very it. Good luck – ata Nov 09 '11 at 00:36
-
People keep talking about Python like it's Assembly or something. I haven't used Perl, Awk or SED much, but Python's syntax is a whole lot more flexible than anything else I've used (Java, C++, C#, Vala, Visual Basic, etc.), except for maybe Lua. Lua's pretty flexible. Saying Python's syntax isn't flexible is like saying grass isn't green unless it's in England. Maybe it's less flexible than some languages, but it's still really flexible compared to probably most of the stuff out there. – Brōtsyorfuzthrāx May 06 '14 at 06:10
-
@runrig That would be `perl -ane 'print if $F[4] > 100' file.txt` in Perl. Granted it is a little more verbose, but since I already know Perl from using it in much larger projects; I don't have to learn another language to write it. I just have to know a few command line switches. I could also add a separate `-i` switch to cause it to edit the file in place with the Perl version. So you still haven't convinced me that learning another language that has very limited usefulness is worth my time. ( I only figured out what your awk code was doing from running it through `a2p` ) – Brad Gilbert Feb 12 '16 at 18:05
-
@BradGilbert I'm not saying you should go learn awk. I was just refuting the 'never use awk' in the post above. I learned awk long before perl, it's in my toolbox, and I'll use it if I think it solves the problem best. Granted, for anything more complicated, I'll use perl. If a problem becomes more complicated and I need to rewrite in Perl, then at least there's not much to rewrite. – runrig Feb 16 '16 at 20:10
When to use them: awk - never - S. Lott.
I think S. Lott slightly missed the mark with this recommendation. The fact is, on Linux and the other UNIX environments, awk is a useful tool to be used with bash, sh, and ksh for quick text processing. The idea of scripting itself is that you solve your problem by gluing together this tool and that tool. Hence in admin scripts, it is common to have ls, grep, |, awk, time, ps, etc. Each is a tool that the scripter combines, like a builder laying brick by brick, to finish the building (to solve the problem at hand).
For instance, I am a member of the team managing paintball gear supplies dotcom. This e-commerce site is based on the LAMP stack. For automated processing and normalizing of data feeds from various suppliers into the back end database, we employ and maintain a diversified mix of scripts, including bash, perl, php, and even expect. Each has its strengths based on the available modules and API. In the bash scripts we do quick pattern matching and take appropriate actions on the patterns as needed using awk, without the need to switch to Perl. One thing I would also like to point out, which has not been emphasized in the thread, is that a fair number of these scripts were purchased, or obtained from open source. If the script came as Perl, we maintain it as Perl; if the script came as PHP, we maintain it as PHP; if it came as bash, we maintain it as bash; we do not re-write it in another language just because we think it is less efficient in the original language.

-
As a side note on this fairly old answer: never parse the output of `ls`, use glob instead. [read this.](http://mywiki.wooledge.org/ParsingLs) – Jun 30 '12 at 15:33
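The comment above is about shell globbing, but the same idea carries over when the script is written in Python: ask the file system for matching paths instead of splitting the text that `ls` prints. The sketch below uses `pathlib.Path.glob` as a stand-in for the shell glob; the temporary directory and file names are invented so that the example is self-contained.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical files, including a name with a space that would
    # break naive whitespace splitting of `ls` output.
    for name in ["a.txt", "b c.txt", "notes.md"]:
        (root / name).write_text("x")

    # Each match comes back as one Path object, however odd the name is.
    txt_files = sorted(p.name for p in root.glob("*.txt"))

print(txt_files)
```

No text parsing happens at any point, so unusual characters in file names cannot split or merge entries.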