1

I have these lines in a file:

postgres  2609 21030  0 12:49 ?        00:00:00 postgres: postgres postgres [local] idle in transaction                                                                     
postgres  2758 21030  0 12:51 ?        00:00:00 postgres: postgres postgres [local] idle in transaction                                                                     
postgres 28811 21030  0 09:26 ?        00:00:00 postgres: postgres postgres [local] idle in transaction                                                                     
postgres 32200 21030  0 11:40 ?        00:00:00 postgres: postgres postgres [local] idle in transaction                                                                     
postgres 32252 21030  0 11:41 ?        00:00:00 postgres: postgres postgres [local] idle in transaction                                                                     

I need to extract the second-column values to process them. I wrote this code:

pid=$(cat idle_log.txt | cut -d" " -f2)
echo $pid

But it only gave me 28811 32200 32252 as results. As you can see, there is no trace of 2609 and 2758 in the list, and I want to get them too. I also want to count the PIDs after extracting them. I used:

npid=$(grep -o " " <<< $pid | grep -c .)

It returns 2 for the results 28811 32200 32252, but I need it to return 3, the number of processes. Finally, I want to process the values line by line in a while loop, but the commands return all the results at once, so I can't process them one by one.

Thank you all for your help.

Ali_T
  • See: [How to get the second column from command output?](http://stackoverflow.com/q/16136943/3776858) – Cyrus May 16 '16 at 09:46

6 Answers

3

You can use tr to squeeze the spaces and then use cut to take the second space delimited field:

tr -s ' ' <idle_log.txt | cut -d' ' -f2

Or awk:

awk '{ print $2 }' idle_log.txt

Or sed:

sed -r 's/^[^[:blank:]]+[[:blank:]]+([^[:blank:]]+)(.*)/\1/' idle_log.txt

Or grep:

grep -Po '^[^\s]+\s+\K[^\s]+' idle_log.txt

To use/count them later use an array:

pids=( $(tr -s ' ' <idle_log.txt | cut -d' ' -f2) )

num_of_pids="${#pids[@]}"

$ printf '%s\n' "${pids[@]}" 
2609
2758
28811
32200
32252
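As a complement to the array approach above, here is a minimal sketch that counts the PIDs and then handles them one at a time. It assumes bash 4 or later (for mapfile) and uses a temporary file with two lines copied from the question as a stand-in for idle_log.txt; the variable name log is only illustrative.

```shell
#!/usr/bin/env bash
# Sample data standing in for idle_log.txt (two lines copied from the question).
log=$(mktemp)
cat > "$log" <<'EOF'
postgres  2609 21030  0 12:49 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres 28811 21030  0 09:26 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
EOF

# Read one PID per array element; -t strips the trailing newline.
mapfile -t pids < <(awk '{ print $2 }' "$log")

# The array length is the number of PIDs.
num_of_pids=${#pids[@]}
echo "found $num_of_pids PIDs"

# Handle the PIDs one at a time.
for pid in "${pids[@]}"; do
    echo "processing PID $pid"
done

rm -f "$log"
```

Compared with pids=( $(...) ), mapfile does not depend on word splitting or globbing of the command output, which is why it is used in this sketch.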

Example:

$ tr -s ' ' <file.txt | cut -d' ' -f2 
2609
2758
28811
32200
32252

$ awk '{ print $2 }' file.txt        
2609
2758
28811
32200
32252

$ sed -r 's/^[^[:blank:]]+[[:blank:]]+([^[:blank:]]+)(.*)/\1/' file.txt
2609
2758
28811
32200
32252

$ grep -Po '^[^\s]+\s+\K[^\s]+' file.txt
2609
2758
28811
32200
32252
heemayl
1
$ cat data 
postgres  2609 21030  0 12:49 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres  2758 21030  0 12:51 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres 28811 21030  0 09:26 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres 32200 21030  0 11:40 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres 32252 21030  0 11:41 ?        00:00:00 postgres: postgres postgres [local] idle in transaction

To extract the second column from each line:

$ awk '{print $2}' data 
2609
2758
28811
32200
32252

Or you can squeeze runs of spaces into a single space with tr and then use cut, like this:

$ tr -s ' ' < data | cut -d ' ' -f 2
2609
2758
28811
32200
32252

Edit:

$ tr -s ' ' < data | cut -d ' ' -f 2 | while read -r line || [[ -n "$line" ]]; do
> echo "$line" #put your custom processing logic here
> done
2609
2758
28811
32200
32252
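An alternative sketch, not part of the original answer: read can split each line itself, so the loop can take the file directly without tr, cut, or a pipeline. Because no pipeline is involved, variables set inside the loop (such as the hypothetical count below) are still available after it finishes. The temporary file stands in for the data file and holds two lines copied from the question.

```shell
# Sample data standing in for the log file (two lines copied from the question).
log=$(mktemp)
cat > "$log" <<'EOF'
postgres  2758 21030  0 12:51 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres 32200 21030  0 11:40 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
EOF

count=0
# read splits on runs of whitespace: first word, second word, rest of the line.
while read -r _user pid _rest; do
    [ -n "$pid" ] || continue        # skip blank lines
    count=$((count + 1))
    echo "processing PID $pid"       # put your custom processing logic here
done < "$log"

# The loop ran in the current shell, so count is still set here.
echo "processed $count PIDs"

rm -f "$log"
```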
riteshtch
  • Thanks. Do you have any idea how to avoid outputting them all at once? I need to read the values line by line. – Ali_T May 16 '16 at 09:38
  • I want to calculate the elapsed time of the idle-in-transaction processes, and if one has been idle for more than five minutes I should send mail to its owner to check it. – Ali_T May 16 '16 at 10:00
  • @Ali_T Yes, you can write that logic where I have used `echo ..` – riteshtch May 16 '16 at 10:01
  • Just one thing: I want to store the "tr ..." result in a variable named "pid" to get the elapsed time, but the while loop is piped to it. Any idea? – Ali_T May 16 '16 at 10:14
  • @Ali_T If you store it in a variable called `pid`, you will have to run the while loop on that variable `pid`, not in the above format. – riteshtch May 16 '16 at 10:17
  • It was OK. I used $line as my variable, and it worked just fine. – Ali_T May 16 '16 at 10:19
1

grep with Perl regex:

grep -oP '^[\S]+\s+\K[\S]+' file
2609
2758
28811
32200
32252

Or,

grep -o '^\([^[:blank:]]*[[:blank:]]*\)\{2\}' file |grep -o '[0-9]\+'
2609
2758
28811
32200
32252
Jahid
0

cut splits on every single occurrence of the delimiter you pass it, so two consecutive spaces produce an empty field between them. That means with delimiter ' ', the first line is:

postgres, <empty>, 2609

And the last one is:

postgres, 32252

You can simplify this by running just awk '{print $2}' idle_log.txt
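
The difference can be checked on a single line. This is a small sketch using the start of the first line from the question; the variable names f2, f3 and awk2 are only illustrative.

```shell
# Start of the first line from the question; two spaces separate "postgres" and "2609".
line='postgres  2609 21030  0 12:49 ?'

f2=$(printf '%s\n' "$line" | cut -d' ' -f2)          # empty field between the two spaces
f3=$(printf '%s\n' "$line" | cut -d' ' -f3)          # the PID lands in field 3 on this line
awk2=$(printf '%s\n' "$line" | awk '{ print $2 }')   # awk splits on runs of whitespace

printf 'cut -f2: [%s]\ncut -f3: [%s]\nawk $2: [%s]\n' "$f2" "$f3" "$awk2"
```

The empty brackets printed for cut -f2 are the reason 2609 and 2758 were missing from the original output.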

viraptor
0

I'd go for the simplest solution:

pid=$(awk '{print $2}' idle_log.txt)
echo $pid

The regexes needed for sed and grep are much less readable in a script, while cut and tr may sometimes give unexpected results.
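
For the counting part of the question, a minimal sketch along the same lines: count the non-empty lines of the awk output instead of the spaces between the PIDs (counting separators gives one less than the number of PIDs). The temporary file stands in for idle_log.txt and holds three lines copied from the question.

```shell
# Sample data standing in for idle_log.txt (three lines copied from the question).
log=$(mktemp)
cat > "$log" <<'EOF'
postgres 28811 21030  0 09:26 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres 32200 21030  0 11:40 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres 32252 21030  0 11:41 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
EOF

pid=$(awk '{ print $2 }' "$log")
echo "$pid"                                 # quoting keeps one PID per line

# Count the non-empty lines, i.e. the PIDs themselves rather than the separators.
npid=$(printf '%s\n' "$pid" | grep -c .)
echo "number of PIDs: $npid"

rm -f "$log"
```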

louigi600
0

As has already been pointed out, you were not getting the expected results because you were not actually extracting the second column.

With cut -d" " -f2 you get the second space-delimited field of each line, and every single space counts as a delimiter. The first two lines have an additional space before the PID, so on those lines field 2 is empty and the PID ends up in field 3. You could use cut -d" " -f3 for those lines but, as discussed, this is not the right way to get the second column. Use awk '{print $2}' instead.