I have an Amazon EC2 instance (call it S1, 4 cores / 7 GB memory) running Ubuntu 12.04, which hosts my web app with PostgreSQL 9.1. All the Postgres data is stored on a separate SSD volume (not the root volume) of 100 GB. (Right now it is only 26% full.)
Since a day or two ago, some Postgres operations have suddenly started taking a long time: creating a database now takes 52 seconds, and restoring a database takes 9 minutes (previously 50 seconds at most).
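(For reference, these timings can be reproduced with the shell's time builtin; the database name and dump path below are just placeholders:)

time createdb testdb
time pg_restore -d testdb /path/to/dump.backup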
By running iostat while these Postgres commands execute, I can confirm that the EBS volume has hit its IOPS limit (3 IOPS/GB, i.e. 300 IOPS for a 100 GB volume). You can see it below in the output of iostat -d 5 -x -p xvdf:
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdf 0.35 2.28 1.20 298.99 19.65 13082.19 87.29 23.42 78.03 64.19 78.09 3.29 98.75
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdf 0.00 1.80 0.00 297.40 0.00 13067.20 87.88 126.47 420.75 0.00 420.75 3.35 99.76
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdf 0.00 1.80 0.00 297.40 0.00 13067.20 87.88 126.32 417.95 0.00 417.95 3.35 99.76
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdf 0.00 1.80 0.00 297.80 0.00 13093.60 87.94 131.70 440.82 0.00 440.82 3.36 100.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdf 0.00 0.00 0.00 301.00 0.00 13225.60 87.88 129.36 422.97 0.00 422.97 3.32 99.84
The AWS documentation on I/O characteristics says each I/O operation can be a request of 256 KiB or less, so is Postgres writing data back in smaller blocks and thereby generating a larger number of I/O requests?
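(As a sanity check on request size: avgrq-sz above is reported in 512-byte sectors, so ~88 sectors is roughly 44 KB per write, well under the 256 KiB per-I/O ceiling. Postgres's own page sizes can be confirmed like this:)

psql -c "SHOW block_size;"       # data page size, 8192 bytes (8 KB) by default
psql -c "SHOW wal_block_size;"   # WAL page size, also 8192 bytes by default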
Meanwhile, I have another EC2 instance (call it S2) with a 100 GB volume (95% full right now) where the Postgres data lives on the root volume, and it performs great. So I am fairly sure the size of the volume is not the issue here.
The affected volume on S1 stores only Postgres data, yet iostat still shows the stats below. I am not sure why the stats look like this, or how I could reduce the Postgres command times without increasing the size of the volume. (During all operations, 3 GB of memory is always free.)
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdf 0.34 2.29 1.23 298.93 20.10 13079.03 87.28 26.19 87.26 66.96 87.34 3.29 98.78
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdf 0.00 2.40 0.60 299.00 4.80 13020.80 86.95 132.22 434.48 108.00 435.14 3.34 100.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdf 0.00 3.20 4.40 295.20 43.20 12866.40 86.18 122.18 417.09 142.00 421.20 3.34 100.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdf 0.00 2.80 2.40 297.20 23.20 12940.00 86.54 122.70 401.11 124.00 403.34 3.34 99.92
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdf 0.00 3.40 4.80 294.80 46.40 12840.00 86.02 127.43 433.15 161.67 437.57 3.34 99.92
Note: the affected volume contains 100 different Postgres databases with an average size of 110 MB each (but honestly I don't think this is the problem).
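(The per-database sizes quoted above come from a query along these lines:)

psql -c "SELECT datname, pg_size_pretty(pg_database_size(datname)) FROM pg_database ORDER BY pg_database_size(datname) DESC;"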