235

My command's output is something like:

1540 "A B"
   6 "C"
 119 "D"

The first column is always a number, followed by a space, then a double-quoted string.

My goal is to get only the second column, like:

"A B"
"C"
"D"

I intended to use <some_command> | awk '{print $2}' to accomplish this. The problem is that some values in the second column contain spaces, and whitespace is awk's default field separator. As a result, the output is truncated:

"A
"C"
"D"

How do I get the second column's value (with paired quotes) cleanly?

kenorb
Qiang Xu
  • http://stackoverflow.com/questions/2961635/using-awk-to-print-all-columns-from-the-nth-to-the-last – martin clayton Apr 21 '13 at 22:39
  • I tried using `awk '{$1=""; print $0}'`, but the output still has a leading space. It could be removed with `sed 's/^ //'`. Could this be done with `awk` alone? – Qiang Xu Apr 21 '13 at 23:00

8 Answers

291

Use -F to set the field separator, splitting each line on double quotes ("):

awk -F '"' '{print $2}' your_input_file

or for input from pipe

<some_command> | awk -F '"' '{print $2}'

output:

A B
C
D
Alex
  • This is good, but I also want the original surrounding quotes. Could it be done? Thanks. – Qiang Xu Apr 21 '13 at 22:58
  • You could cheat and change awk's print to `'{print "\""$2"\""}'` – Alex Apr 21 '13 at 23:01
  • Yup, this works. Thanks a lot, Alex! By the way, so many quotes, :) – Qiang Xu Apr 21 '13 at 23:06
  • @Alex, could you explain how you used the double quotation marks and backslashes to get what the OP wanted? – Timo Dec 07 '17 at 09:51
  • @Timo The quotes and backslashes can be read as `"\"" + $2 + "\""`. Each outer pair of quotation marks delimits a string literal to be added to the output, and the escaped quotation mark (`\"`) inside it is the character that actually gets printed. To help visualize it, this is what it would look like if we wanted to add blank spaces around `$2` instead of quotation marks: `'{print " "$2" "}'`. We can also add spacing to make it a little easier to grok: `'{print " " $2 " "}'` – Tom Jan 08 '19 at 22:16
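
To tie the answer and the comments together, here is a minimal sketch that runs the quote-splitting approach on the sample input from the question and puts the quotes back. The function name `quoted_field` is only an illustrative helper, not something from the answer.

```shell
# Hypothetical helper: print the first double-quoted string of each line,
# keeping the surrounding quotes.
quoted_field() {
    # -F '"' splits on double quotes, so $2 is the text between the first pair.
    # "\"" is a one-character string literal containing a double quote.
    awk -F '"' '{ print "\"" $2 "\"" }'
}

# Sample input copied from the question.
printf '%s\n' '1540 "A B"' '   6 "C"' ' 119 "D"' | quoted_field
```

Splitting on the quote character sidesteps the whitespace problem entirely, at the cost of assuming each line contains exactly one quoted string.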
116

If you can use something other than awk, try this instead:

echo '1540 "A B"' | cut -d' ' -f2-

-d sets the delimiter and -f selects the fields to print; -f2- selects everything from the second field to the end of the line.

TheAshwaniK
  • This helped me with the following (fetching the commit ID of a file in git): `git annotate myfile.cpp | grep '2016-07' | head -1 | cut -f1` – serup Jul 14 '16 at 07:34
  • This is good, but it does not work if the delimiter is more than one character long. That's where the awk solution comes in handy. – smac89 Sep 01 '17 at 22:49
  • Why is a space not used after `-d`? It looks a bit odd that way. – Chris Stryczynski Nov 28 '17 at 12:18
  • If your output has more than one column and you only need the second column, use `cut -d' ' -f2` – Ani Apr 22 '21 at 18:27
  • @ChrisStryczynski: You can also do: `cut -d\  -f2-` (note: two spaces after the backslash!). Does that look less odd? – Luuk Jul 25 '22 at 15:00
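
One caveat with `cut`: it treats every single space as a delimiter, so the right-aligned numbers in the question's sample (for example `   6 "C"`) produce empty leading fields, and `-f2-` alone would leave padding and the number in the result. A possible workaround, sketched here under the assumption that the padding consists only of spaces, is to strip the leading spaces first. The function name is hypothetical.

```shell
# Hypothetical helper: drop leading spaces, then keep everything after
# the first space-delimited field.
second_column_onward() {
    sed 's/^ *//' | cut -d' ' -f2-
}

# Sample input copied from the question.
printf '%s\n' '1540 "A B"' '   6 "C"' ' 119 "D"' | second_column_onward
```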
97

This should work to get a specific column out of the command output "docker images":

REPOSITORY                          TAG                 IMAGE ID            CREATED             SIZE
ubuntu                              16.04               12543ced0f6f        10 months ago       122 MB
ubuntu                              latest              12543ced0f6f        10 months ago       122 MB
selenium/standalone-firefox-debug   2.53.0              9f3bab6e046f        12 months ago       613 MB
selenium/node-firefox-debug         2.53.0              d82f2ab74db7        12 months ago       613 MB


docker images | awk '{print $3}'

IMAGE
12543ced0f6f
12543ced0f6f
9f3bab6e046f
d82f2ab74db7

This prints the third column. The header row shows only IMAGE because the space in "IMAGE ID" splits it into two fields.

hemanto
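
If the header line is not wanted in the result, one option is to skip the first record with `NR > 1`. The sketch below uses made-up sample text in place of real `docker images` output, and the function name is hypothetical.

```shell
# Hypothetical helper: print the third whitespace-separated column,
# skipping the first (header) line.
third_column_no_header() {
    awk 'NR > 1 { print $3 }'
}

# Sample text standing in for real `docker images` output.
third_column_no_header <<'EOF'
REPOSITORY   TAG      IMAGE ID       CREATED         SIZE
ubuntu       16.04    12543ced0f6f   10 months ago   122 MB
ubuntu       latest   12543ced0f6f   10 months ago   122 MB
EOF
```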
35

Or use sed & regex.

<some_command> | sed 's/^.* \(".*"$\)/\1/'
catay
  • Shorter cmd as you do not need start and end markers: ` | sed 's/.* \(".*"\)/\1/'` – Timo Jan 11 '19 at 09:10
  • It would be nice if anyone could explain a little what `'s/^.* \(".*"$\)/\1/'` stands for. – aafulei Jun 22 '22 at 08:01
  • `^` marks the beginning of the line; `.*` matches any sequence of characters; `\(` and `\)` delimit a captured group, which is referred to as `\1` on the right-hand side of the sed substitution; `".*"` matches any sequence of characters between quotes; `$` marks the end of the line. – Dimitri Lesnoff Nov 09 '22 at 11:07
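
To see the pieces of that expression at work, here is a small sketch that applies it to the sample input from the question. One deliberate difference: the `$` anchor is written after the capture group rather than inside it, on the assumption that this form behaves the same and is more portable across sed implementations. The function name is hypothetical.

```shell
# Hypothetical helper: replace each line with its trailing quoted string.
extract_quoted() {
    # ^.*<space>  : everything up to the space before the quoted string
    # \(".*"\)    : the quoted string itself, captured as \1
    # $           : the quoted string must end the line
    sed 's/^.* \(".*"\)$/\1/'
}

# Sample input copied from the question.
printf '%s\n' '1540 "A B"' '   6 "C"' ' 119 "D"' | extract_quoted
```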
23

You don't need awk for that. The read builtin in Bash should be enough, e.g.:

some_command | while read -r c1 c2; do echo "$c2"; done

or:

while read -r c1 c2; do echo "$c2"; done < in.txt
kenorb
  • You should always use the `-r` argument with read, especially when you don't know what the input will be. Otherwise backslashes will mess things up. – Daniel Griscom Apr 21 '22 at 15:09
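
The point about `-r` can be illustrated with a short sketch. The second sample line below contains a backslash, which is an invented test value, not data from the question, and the function name is hypothetical.

```shell
# Hypothetical helper: print everything after the first word of each line.
second_column_read() {
    # -r keeps backslashes literal; c1 receives the first word and
    # c2 receives the rest of the line (leading blanks are trimmed).
    while read -r c1 c2; do
        printf '%s\n' "$c2"
    done
}

# The second line carries a backslash to show that -r preserves it.
printf '%s\n' '1540 "A B"' '   6 "C\D"' | second_column_read
```

Quoting `"$c2"` also keeps the shell from word-splitting or glob-expanding the value before it is printed.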
13

If you have GNU awk, FPAT lets you define fields by what they contain rather than by a separator, which is the solution you want here:

$ awk '{print $1}' FPAT='"[^"]+"' file
"A B"
"C"
"D"
Chris Seymour
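
For awk implementations without FPAT, a similar result can probably be had with the standard `match()` function, which sets RSTART and RLENGTH for the matched text. This is a different technique from the FPAT answer above, shown only as a sketch. The function name is hypothetical.

```shell
# Hypothetical helper: print the first double-quoted string on each line
# using only POSIX awk features.
first_quoted_posix() {
    # match() returns non-zero when the pattern is found and records
    # where (RSTART) and how long (RLENGTH) the match is.
    awk 'match($0, /"[^"]*"/) { print substr($0, RSTART, RLENGTH) }'
}

# Sample input copied from the question.
printf '%s\n' '1540 "A B"' '   6 "C"' ' 119 "D"' | first_quoted_posix
```

Lines without a quoted string produce no output, since the action only runs when `match()` succeeds.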
1
awk -F"|" '{gsub(/\"/,"|");print "\""$2"\""}' your_file
Vijay
0
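#!/usr/bin/python
# Print one whitespace-separated column of each line read from stdin.
import sys

# The column number is given as the first argument (1-based).
col = int(sys.argv[1]) - 1

for line in sys.stdin:
    columns = line.split()

    try:
        print(columns[col])
    except IndexError:
        # Skip lines that have fewer columns than requested.
        pass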

Then, supposing you name the script co, you could do something like this to get the sizes of files (the example assumes you're using Linux, but the script itself is OS-independent):

ls -lh | co 5

mate00