6
OS=centos 6.7  [Dedicated server]
memory=15G      
cpu=Intel(R) Xeon(R) CPU E5-2403
mysql= V 5.1.73    

Here is a MyISAM table and contains about 5 million rows of data. In every 5-6 minutes data for about 3000 users is inserted (e.g. upload and download rate,session status etc) .

Table information : describe "radacct"

enter image description here

my.cnf

enter image description here

enter image description here

From mysql slow query log one of the query which takes most time is below

Query_time: 7.941773  Lock_time: 0.155912 Rows_sent: 1 Rows_examined: 5377
use freeradius;
SET timestamp=1461582118;
SELECT sum(acctinputoctets) as upload,
       sum(acctoutputoctets) as     download
    FROM radacct a 
    INNER JOIN (SELECT acctuniqueid, MIN( radacctid )     radacctid 
                   FROM radacct 
                   WHERE username='batman215'
                     and acctstarttime  between '2016-02-03 12:10:47'
                                            and '2016-04-25 16:46:01' 
                   GROUP BY   acctuniqueid) b 
        ON a.acctuniqueid = b.acctuniqueid 
       AND a.radacctid =    b.radacctid;

explain query output

enter image description here

when there are many users who try to see their consumed bandwidth the server cannot fulfill the requests due to high load and IO. Is there any thing I can do to further optimize the database ?

Indexes from table "radacct"

enter image description here

Explain query without using \G

enter image description here

Thankyou

Community
  • 1
  • 1
sherpaurgen
  • 3,028
  • 6
  • 32
  • 45
  • 1
    Do show what your indexes are and can you kindly post the explain output horizontally (easier to compare) – e4c5 May 04 '16 at 09:16
  • show your index also. – Dipanwita Kundu May 04 '16 at 09:33
  • You could try to add a multi column index on username and acctstarttime fields. – Shadow May 04 '16 at 09:36
  • @e4c5 updated my question for clarity – sherpaurgen May 04 '16 at 09:37
  • It's much easier if you can avoid posting screen dumps and instead post text. Nevertheless, your question has enough information to answer. – O. Jones May 04 '16 at 10:05
  • @ollie jones I tried to post text (did try to format with correct indentation levels) but couldnt get my question to be accepted by SO checker. – sherpaurgen May 04 '16 at 10:09
  • Reviewing your indices and their cardinality, I notice that the cardinailty for accstarttime is much higher then this of username (seems logic), If you force the use of the index accstarttime, the executionplan should improve. However since you use a between for this index, this effect could cancel out. to be tested – Bee157 May 04 '16 at 10:21

2 Answers2

8

Let's work this out starting with your inner query, which is:

        SELECT acctuniqueid, 
               MIN( radacctid ) radacctid 
          FROM radacct 
         WHERE username='batman215'
           and acctstarttime between '2016-02-03 12:10:47'
                                 and '2016-04-25 16:46:01' 
      GROUP BY   acctuniqueid

You're looking for an equality match on username and a range match on acctstarttime. You're then using acctuniqueid to group and pulling an extreme value (MIN()) from radacctid.

Therefore, to accelerate this subquery, you need the following compound index.

(username, acctstarttime, acctuniqueid, radacctid)

How does this work? Think of an index (these are BTREE indexes) as a sorted list of the values in it.

  1. The query engine random-accesses the list -- fast, O(log(n)) -- to find the first entry matching username and the low end of your BETWEEN range.
  2. It then sequentially scans the list, entry by entry, until it comes to the high end of the BETWEEN range. That's called an index range scan.
  3. As it scans, it looks for each new value of acctuniqueid, in order and then takes the lowest value -- the first one in order -- of radacctid, then skips ahead to the next value of accuniqueid. That's called a loose index scan and it's miraculously cheap.

So, add that compound index. That will probably make a big difference to your query performance.

Your outer query looks like this.

SELECT sum(acctinputoctets) as upload,
       sum(acctoutputoctets) as     download
  FROM radacct a  
 INNER JOIN (  /*an aggregate 
                * yielding acctuniqueid and raddactid
                * naturally ordered on those two columns
                */
           ) b ON a.acctuniqueid = b.acctuniqueid
              AND a.radacctid =     b.radacctid

For this you need the compound covering index

(acctuniqueid, radacctid, acctinputoctets, acctoutputoctets)

This part of the query is also satisfied with index magic.

  1. The first two columns in the index allow the lookup of each row you need, based on the inner query's result.
  2. The query engine can then scan the index adding up the other two columns' values.

(This is called a covering index because it contains some columns that are present just because we want their values, not because we want them indexed. Some other makes and models of DBMS allow extra columns to be included in indexes without making them searchable. That's a little cheaper, especially on INSERT operations. MySQL doesn't do that. )

So, your first action item: add these two compound indexes and retry your query.

It looks like, from your question, that you have placed a lot of single-column indexes on your table in the hopes they will speed things up. That's a notorious antipattern in database design. With respect, you should get rid of any indexes you don't know you need. They don't help queries and they slow down INSERTS. That's your second action item.

Third, go read this http://use-the-index-luke.com/ It's very helpful.

Pro tip: did you see how I formatted your query? Developing a personal formatting convention that clearly shows tables, columns, ON conditions, and other aspects of a query is tremendously important when you have to understand one.

O. Jones
  • 103,626
  • 17
  • 118
  • 172
  • Your server is provisioned more than adequately for this application. This is a case of algorithms (indexes) making a bigger difference than brute firepower. – O. Jones May 04 '16 at 10:37
  • The single column indexes are there so that queries like `select username,framedipaddress,servicetype FROM radacct WHERE username='batman215' ` can make use of it, correct me if not. Thank you I appreciate the time you gave and the valuable insights. – sherpaurgen May 05 '16 at 10:31
  • 1
    A compound index starting with `username` will also accelerate the query you mentioned above. – O. Jones May 05 '16 at 15:57
0
               WHERE username='batman215'
                 and acctstarttime  between ...

begs for INDEX(username, acctstarttime) in that order.

    ON a.acctuniqueid = b.acctuniqueid 
   AND a.radacctid =    b.radacctid;

begs for INDEX(acctuniqueid, radacctid) (in either order) (Or Ollie's covering index).

"In every 5-6 minutes data for about 3000 users is inserted" begs for InnoDB instead of MyISAM. MyISAM does table locks, thereby having the 'insert' interfere with other queries. Conversion tips.

Rick James
  • 135,179
  • 13
  • 127
  • 222