Extract column after pattern from file

Question

I have a sample file which looks like this:

5    6    7    8
55   66   77   88

A    B    C    D
1    2    3    4
2    4    6    8
3    8    12   16

E    F    G    H
11   22   33   44
and so on...

I would like to enter a command in a bash script or just in a bash terminal to extract one of the columns independently of the others. For instance, I would like to do something like a grep/awk command with the pattern=C and get the following output:

How can I extract a specific column independently of the others, and also specify the number of lines to extract after the pattern, so that neither the column with the 7's above nor the G column below ends up in my output?

I have tried 'grep -A 3 "C" filename' which gives the desired rows after the pattern but it gives all the columns. — Xatticus00, Apr 26 '16 at 16:23
You should post a subset of relevant data instead of `some stuff .... ` as that can change the dynamics of the answer. Also, while you contemplate doing that, I would recommend going through this [answer](http://stackoverflow.com/a/17914105/970195) to try something yourself first. — jaypal singh, Apr 26 '16 at 16:37
Oops, you forgot to post your code! StackOverflow is about helping people fix their code. It's not a free coding service. Any code is better than no code at all. Meta-code, even, will demonstrate how you're thinking a program should work, even if you don't know how to write it. — ghoti, Apr 26 '16 at 17:04
I've made the changes. I have seen the answer that you linked to. I played around with those tools for a while but couldn't get the column part to work. This is why I thought an 'awk' command would work but couldn't find how to specify a column. Thank you for your comment and help. — Xatticus00, Apr 26 '16 at 17:06

score 2 · Accepted Answer · answered Apr 26 '16 at 16:31

If it's always 3 records after the found term:

awk '{for(i=1;i<=NF;i++) {if($i=="C") col=i}} col>0 && rcount<=3 {print $col; rcount++}' test

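This looks at each field of every record and, if it finds a "C", saves that field's number i in col. Once col is greater than 0, it prints that column of each record and increments rcount. Because rcount starts at 0 and the test is rcount<=3, four lines are printed in total: the header row containing "C" and the 3 records after it. After that, printing stops.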
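As a minimal sketch of the same idea, the variant below skips the header row itself and stops reading the file once 3 records have been printed. The function name `col_c_next3` is hypothetical and only used for illustration; the pattern "C" and the count 3 are hard-coded, as in the one-liner above.

```shell
#!/usr/bin/env bash
# col_c_next3 FILE  (hypothetical helper name, not from the original answer)
# Prints the 3 values found below the header cell "C" and then exits.
col_c_next3() {
  awk '
    # Column already located: print it for up to 3 records, then stop reading.
    col && n < 3 { print $col; if (++n == 3) exit }
    # Column not located yet: scan this record for a field equal to "C".
    !col { for (i = 1; i <= NF; i++) if ($i == "C") col = i }
  ' "$1"
}
```

The print rule comes first so that the header row, on which col is still unset when that rule is tested, is not printed. On the sample file from the question this should print 3, 6 and 12, one per line.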


JNevill


This works perfectly, thank you! I have a follow-up question, though. I would like to be able to specify the pattern and the number of records to print after it on the command line if I make this a script. How can I accomplish that? I've tried replacing "C" with $2 and rcount<=3 with rcount<=$3, but they don't seem to be recognized. I think they are being interpreted literally as $2 and $3 instead of as the second and third arguments to the command. – Xatticus00 Apr 26 '16 at 16:56
You have to sort of "send" the variable into the awk script from your bash/sh/csh/whatever script. The way to do this is with awk's `-v` option: `awk -v awkvar="$bashvar" -v awkvar2="$bashvar2" '{print awkvar, awkvar2}'` – JNevill Apr 26 '16 at 16:59
@jaypalsingh That's assuming that the column number is known, which it isn't. Furthermore, your variable isn't being used in your script, but I assume that the `3` should be subbed with `num`. – JNevill Apr 26 '16 at 17:20
wrt `You have to sort of "Send" the variable into the awk script` - that makes it sound complicated when it isn't. You have to initialize your awk variables with the values of your shell variables or with constant values, that's all. – Ed Morton Apr 26 '16 at 19:56
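Putting the comments above together, a script along these lines could take the file, the pattern and the record count as arguments. This is only a sketch: the function name `extract_col` and the awk variable names `pat` and `num` are assumptions chosen for illustration. The awk program itself is the accepted answer's one-liner with the constants replaced by variables.

```shell
#!/usr/bin/env bash
# extract_col FILE PATTERN COUNT  (hypothetical helper name)
# Prints the header cell equal to PATTERN and the COUNT records below it.
extract_col() {
  local file=$1 pattern=$2 count=$3
  # -v initializes awk variables from the shell values. Inside the single
  # quotes, $i and $col are awk field references, not shell arguments.
  awk -v pat="$pattern" -v num="$count" '
    { for (i = 1; i <= NF; i++) if ($i == pat) col = i }   # locate the column
    col > 0 && rcount <= num { print $col; rcount++ }      # header plus num records
  ' "$file"
}
```

With the sample file from the question, `extract_col file C 3` should print C, 3, 6 and 12 on separate lines, and `extract_col file F 1` should print F and 22.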

score 1 · Answer 2 · answered Apr 26 '16 at 19:49

$ cat tst.awk
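# At the start of each block (the first line, or a line after a blank one), map header names to column numbers.
!prevNF { delete f; for (i=1; i<=NF; i++) f[$i] = i }
# On non-blank lines, print the target column if the current block's header row contains tgt.
NF && (tgt in f) { print $(f[tgt]) }
# Remember this line's field count so the next line can tell whether it follows a blank line.
{ prevNF = NF }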

$ awk -v tgt=C -f tst.awk file
C
3
6
12

$ awk -v tgt=F -f tst.awk file
F
22


Ed Morton

