get string from lines of file in bash

Question

I have these lines in a file:

postgres  2609 21030  0 12:49 ?        00:00:00 postgres: postgres postgres [local] idle in transaction                                                                     
postgres  2758 21030  0 12:51 ?        00:00:00 postgres: postgres postgres [local] idle in transaction                                                                     
postgres 28811 21030  0 09:26 ?        00:00:00 postgres: postgres postgres [local] idle in transaction                                                                     
postgres 32200 21030  0 11:40 ?        00:00:00 postgres: postgres postgres [local] idle in transaction                                                                     
postgres 32252 21030  0 11:41 ?        00:00:00 postgres: postgres postgres [local] idle in transaction

I need to extract the second-column values to process them. I wrote this code:

pid=$(cat idle_log.txt | cut -d" " -f2)
echo $pid

But it only gave me 28811 32200 32252 as the result. As you can see, there is no trace of 2609 and 2758 in the list, and I want to get them too. I also want to count the PIDs after extracting them. I used:

npid=$(grep -o " " <<< $pid | grep -c .)

It returns 2 for the result 28811 32200 32252, but I need it to return 3, the number of processes. Finally, I want to process the values line by line, for example in a while loop, but the commands return all results at once, so I can't process them one by one in a loop.

Thank you all for your help.

See: [How to get the second column from command output?](http://stackoverflow.com/q/16136943/3776858) — Cyrus, May 16 '16 at 09:46

heemayl · Answer 1 · 2016-05-16T09:45:52.213

You can use tr to squeeze the spaces and then use cut to take the second space delimited field:

tr -s ' ' <idle_log.txt | cut -d' ' -f2

Or awk:

awk '{ print $2 }' idle_log.txt

Or sed:

sed -r 's/^[^[:blank:]]+[[:blank:]]+([^[:blank:]]+)(.*)/\1/' idle_log.txt

Or grep:

grep -Po '^[^\s]+\s+\K[^\s]+' idle_log.txt

To use/count them later use an array:

pids=( $(tr -s ' ' <idle_log.txt | cut -d' ' -f2) )

num_of_pids="${#pids[@]}"

$ printf '%s\n' "${pids[@]}" 
2609
2758
28811
32200
32252
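As a variation on the array approach above, here is a minimal sketch that fills the array with mapfile (bash 4 or later), so each line becomes exactly one element, and then visits the PIDs one at a time. The temporary file is only a stand-in for idle_log.txt, and the echo in the loop is a placeholder for real processing.

```shell
#!/usr/bin/env bash
# Hypothetical sample input standing in for idle_log.txt.
log=$(mktemp)
cat > "$log" <<'EOF'
postgres  2609 21030  0 12:49 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres  2758 21030  0 12:51 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres 28811 21030  0 09:26 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
EOF

# One PID per array element; -t strips the trailing newline of each line.
mapfile -t pids < <(awk '{ print $2 }' "$log")

# The element count is the number of processes.
num_of_pids=${#pids[@]}
echo "count: $num_of_pids"

# Visit the PIDs one at a time.
for pid in "${pids[@]}"; do
    echo "processing $pid"   # placeholder for real work
done

rm -f "$log"
```

Because the array lives in the current shell, both the count and the individual PIDs stay available after the loop.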

Example:

$ tr -s ' ' <file.txt | cut -d' ' -f2 
2609
2758
28811
32200
32252

$ awk '{ print $2 }' file.txt        
2609
2758
28811
32200
32252

$ sed -r 's/^[^[:blank:]]+[[:blank:]]+([^[:blank:]]+)(.*)/\1/' file.txt
2609
2758
28811
32200
32252

$ grep -Po '^[^\s]+\s+\K[^\s]+' file.txt
2609
2758
28811
32200
32252

Hmm, nice trick, but I still need to get the values one at a time. – Ali_T, May 16 '16 at 09:39
Thank you for your time. I just used the first answer, which was closer to my case. – Ali_T, May 16 '16 at 10:30

riteshtch · Accepted Answer · 2016-05-16T09:58:55.533

$ cat data 
postgres  2609 21030  0 12:49 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres  2758 21030  0 12:51 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres 28811 21030  0 09:26 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres 32200 21030  0 11:40 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres 32252 21030  0 11:41 ?        00:00:00 postgres: postgres postgres [local] idle in transaction

$ awk '{print $2}' data 
2609
2758
28811
32200
32252

or you can squeeze multiple spaces into 1 using tr and then use cut like this:

$ tr -s ' ' < data | cut -d ' ' -f 2
2609
2758
28811
32200
32252

Edit:

$ tr -s ' ' < data | cut -d ' ' -f 2 | while read -r line || [[ -n "$line" ]]; do
> echo "$line" #put your custom processing logic here
> done
2609
2758
28811
32200
32252

edited May 16 '16 at 09:58

answered May 16 '16 at 09:35

riteshtch

Thanks. Do you have any idea how to avoid outputting them all at once? I need to read the values line by line. – Ali_T May 16 '16 at 09:38
I want to calculate the elapsed time of the idle-in-transaction processes, and if one has passed five minutes I should send mail to its owner to check it. – Ali_T May 16 '16 at 10:00
@Ali_T Yes, you can write that logic where I have used `echo ..` – riteshtch May 16 '16 at 10:01
Just one thing: I want to store the "tr . . ." result in a variable named "pid" to get the elapsed time, but the while loop is piped to it. Any idea? – Ali_T May 16 '16 at 10:14
@Ali_T If you store it in a variable called `pid`, you will have to run the while loop on that variable `pid`, not in the above format. – riteshtch May 16 '16 at 10:17
It was OK. I used $line as my variable, and it worked just fine. – Ali_T May 16 '16 at 10:19
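For the case discussed in these comments, where the extracted values are first stored in a variable named `pid`, one possible shape is to feed that variable to the loop with a here-string. This is only a sketch: the literal value assigned to `pid` stands in for the real command output, and the `count` variable is added purely for illustration.

```shell
#!/usr/bin/env bash
# Stand-in for: pid=$(tr -s ' ' < data | cut -d ' ' -f 2)
pid=$'2609\n2758\n28811'

count=0
# The here-string hands the variable to the loop one line at a time.
# No pipe is involved, so the loop runs in the current shell and
# changes to count are still visible after "done".
while IFS= read -r line; do
    [[ -n "$line" ]] || continue   # skip empty lines
    echo "checking PID $line"      # custom processing goes here
    count=$((count + 1))
done <<< "$pid"

echo "processed $count PIDs"
```

With the piped form (`... | while read ...`), bash normally runs the loop in a subshell, so a counter like this would be lost once the loop ends.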

Jahid · Answer 3 · 2016-05-16T09:47:59.653

grep with Perl regex:

grep -oP '^[\S]+\s+\K[\S]+' file
2609
2758
28811
32200
32252

Or,

grep -o '^\([^[:blank:]]*[[:blank:]]*\)\{2\}' file |grep -o '[0-9]\+'
2609
2758
28811
32200
32252

edited May 16 '16 at 09:47

answered May 16 '16 at 09:40

Jahid

score 0 · Answer 4 · answered May 16 '16 at 09:38

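cut treats every single occurrence of the delimiter as a field boundary. That means that with the delimiter ' ', the first line splits into:

postgres, <empty>, 2609

And the last one into:

postgres, 32252

So -f2 is empty on the first line and is the PID on the last one. You can avoid this by running just awk '{print $2}' idle_log.txt, which splits on runs of whitespace.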
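The field splitting described above can be checked directly. This small sketch uses two shortened sample lines (the variable names are made up for the demo) and compares what cut and awk each see as the second field.

```shell
#!/usr/bin/env bash
short='postgres  2609 21030'   # two spaces before the PID
long='postgres 32252 21030'    # one space before the PID

# cut counts every single space as a separator, so two adjacent
# spaces enclose an empty field.
f2_short=$(printf '%s\n' "$short" | cut -d' ' -f2)
f3_short=$(printf '%s\n' "$short" | cut -d' ' -f3)
f2_long=$(printf '%s\n' "$long" | cut -d' ' -f2)

# awk splits on runs of whitespace, so $2 is the PID in both cases.
awk_short=$(printf '%s\n' "$short" | awk '{ print $2 }')
awk_long=$(printf '%s\n' "$long" | awk '{ print $2 }')

printf 'cut -f2: [%s] [%s]\n' "$f2_short" "$f2_long"
printf 'cut -f3: [%s]\n' "$f3_short"
printf 'awk $2:  [%s] [%s]\n' "$awk_short" "$awk_long"
```

The empty brackets in the first line of output correspond to the PIDs that went missing in the question.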

answered May 16 '16 at 09:38

viraptor

33,322
10
107
191

score 0 · Answer 5 · answered May 16 '16 at 10:57

I'd go for the simplest solution:

pid=$(awk '{print $2}' idle_log.txt)
echo "$pid"

The regexes needed for sed and grep are much less readable in a script, while cut and tr can give unexpected results, for example when fields are separated by tabs or when a line starts with spaces.
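On the counting part of the question, a small sketch of why counting spaces came out one short and how counting lines avoids that. The literal value of `pid` stands in for the awk output, and `seps` and `npid` are names chosen for this demo.

```shell
#!/usr/bin/env bash
# Stand-in for: pid=$(awk '{print $2}' idle_log.txt)
pid=$'2609\n2758\n28811'

# Unquoted, the shell word-splits the value and echo joins the
# words with single spaces, so everything ends up on one line.
unquoted=$(echo $pid)

# Counting the spaces in that line gives the number of gaps,
# which is always one less than the number of values.
seps=$(grep -o " " <<< "$unquoted" | grep -c .)

# Counting non-empty lines of the quoted value gives the real count.
npid=$(grep -c . <<< "$pid")

echo "one line: $unquoted"
echo "separators: $seps, pids: $npid"
```

Quoting the variable keeps one PID per line, which is also what a `while read` loop needs.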

answered May 16 '16 at 10:57

louigi600

score 0 · Answer 6 · answered May 16 '16 at 12:45

As has already been pointed out, the reason you were not getting the expected results is that you were not actually extracting the second column.

Instead, the command cut -d" " -f2 gave you the second space-delimited field of each line. The first two lines contain an additional space before the PID, so for those lines you would need cut -d" " -f3, but as discussed, this is not the right way to get the second column. Use awk '{print $2}' instead.
