7

I have nginx receiving POST requests and a tiny PHP script that writes the request body to MySQL. The problem is very high MySQL CPU usage at 300 POSTs per second. I expected MySQL to be fast enough to handle much more than 300 inserts per second. I'm on an Amazon EC2 small instance running Amazon Linux.

top - 18:27:06 up 3 days,  1:43,  2 users,  load average: 4.40, 5.39, 5.76
Tasks: 178 total,   4 running, 174 sleeping,   0 stopped,   0 zombie
Cpu(s): 24.6%us, 13.4%sy,  0.0%ni,  0.0%id,  1.1%wa,  0.0%hi,  4.9%si, 56.0%st
Mem:   1717480k total,  1640912k used,    76568k free,   193364k buffers
Swap:   917500k total,     5928k used,   911572k free,   824136k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 7677 mysql     20   0  313m 153m 6124 S 39.0  9.2 393:49.11 mysqld
16529 nginx     20   0  157m 151m  820 R 15.2  9.0  28:36.50 nginx
29793 php       20   0 36780 3240 1896 S  2.5  0.2   0:00.34 php-fpm
29441 php       20   0 36780 3204 1892 S  2.2  0.2   0:00.78 php-fpm
29540 php       20   0 36780 3204 1900 S  2.2  0.2   0:00.82 php-fpm
29603 php       20   0 36780 3220 1892 S  2.2  0.2   0:00.61 php-fpm
29578 php       20   0 36780 3200 1900 S  1.9  0.2   0:00.42 php-fpm
29950 php       20   0 36780 3192 1900 S  1.9  0.2   0:00.48 php-fpm
30030 php       20   0 36780 3180 1888 S  1.9  0.2   0:00.08 php-fpm
30025 php       20   0 36780 3200 1888 S  1.6  0.2   0:00.11 php-fpm
29623 php       20   0 36780 3184 1892 S  1.3  0.2   0:00.49 php-fpm
29625 php       20   0 36780 3236 1900 S  1.3  0.2   0:00.46 php-fpm
29686 php       20   0 36780 3364 1900 R  1.3  0.2   0:00.51 php-fpm
29863 php       20   0 36780 3184 1892 S  1.3  0.2   0:00.23 php-fpm
30018 php       20   0 36780 3192 1892 S  1.3  0.2   0:00.19 php-fpm
29607 php       20   0 36780 3224 1900 S  1.0  0.2   0:00.42 php-fpm
29729 php       20   0 36780 3180 1888 R  1.0  0.2   0:00.41 php-fpm

Here is my PHP code:

<?php
    $mysqli=new mysqli("localhost", "root", "", "errorreportsraw");
    $project_id=$_REQUEST["project_id"];
    $data=$_REQUEST["data"];
    $date=date("Y-m-d H-i-s");
    $mysqli->query("insert into rawreports(date, data, project_id) values ('$date', '$data', '$project_id')");
?>

I tried mysql_connect, mysql_pconnect, mysqli("localhost",...), mysqli("p:localhost", ...) - still the same. No queries are running against the database except these inserts.

Here's my table:

CREATE TABLE `rawreports` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `date` datetime NOT NULL,
  `data` mediumtext NOT NULL,
  `project_id` varchar(100) NOT NULL,
  PRIMARY KEY (`id`)
);

It's pretty simple, no indexes, just a place to store the POST data for later processing. The 'data' field is about 3 kilobytes in most cases. I tried InnoDB and MyISAM - still the same.

Here's my SHOW PROCESSLIST, nothing except multiple inserts:

mysql> show processlist;
+---------+----------------------+-----------+-----------------+---------+------+------------------+------------------------------------------------------------------------------------------------------+
| Id      | User                 | Host      | db              | Command | Time | State            | Info                                                                                                 |
+---------+----------------------+-----------+-----------------+---------+------+------------------+------------------------------------------------------------------------------------------------------+
| 3872248 | root                 | localhost | NULL            | Query   |    0 | NULL             | show processlist                                                                                     |
| 3901991 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902003 | root                 | localhost | errorreportsraw | Sleep   |    0 |                  | NULL                                                                                                 |
| 3902052 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902053 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902054 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902055 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902056 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902057 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902058 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902059 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902060 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"protocol_version":" |
| 3902061 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902062 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902063 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902064 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902065 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902066 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902067 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902068 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902069 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902070 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902071 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902072 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902073 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902074 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902075 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902076 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902077 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902078 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902079 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902080 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902081 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902082 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902083 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902084 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902085 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902086 | root                 | localhost | errorreportsraw | Query   |    0 | update           | insert into rawreports(date, data, project_id) values ('2012-05-05 17-01-37', '{"exceptions":[{"stac |
| 3902087 | unauthenticated user | localhost | NULL            | Connect | NULL | Reading from net | NULL                                                                                                 |
+---------+----------------------+-----------+-----------------+---------+------+------------------+------------------------------------------------------------------------------------------------------+
39 rows in set (0.00 sec)

Here's the PROFILE when I run the same insert manually while the server is still under load:

set profiling=1;
insert into rawreports(date, data, project_id) values('2012-05-04 00:58:08','[3000-chars-data-here]','5');
show profile ALL for query 1;

Status                          Duration    CPU_user    CPU_system  Context_voluntary    Context_involuntary    Block_ops_in    Block_ops_out   Messages_sent   Messages_received   Page_faults_major   Page_faults_minor   Swaps   Source_function    Source_file    Source_line
starting                        0.000231    0.000000    0.000000    0                    0                      0               0               0               0                   0                   0                   0       NULL    NULL    NULL
checking permissions            0.000030    0.000000    0.000000    0                    0                      0               0               0               0                   0                   0                   0       check_access    sql_parse.cc    4745
Opening tables                  0.000057    0.001000    0.000000    0                    0                      0               0               0               0                   0                   0                   0       open_tables     sql_base.cc     4836
System lock                     0.000030    0.000000    0.000000    0                    0                      0               0               0               0                   0                   0                   0       mysql_lock_tables       lock.cc 299
init                            0.000037    0.000000    0.000000    0                    0                      0               0               0               0                   0                   0                   0       mysql_insert    sql_insert.cc   721
update                          0.075716    0.001999    0.011998    166                  2                      0               0               0               0                   0                   0                   0       mysql_insert    sql_insert.cc   806
Waiting for query cache lock    0.000087    0.000000    0.000000    0                    0                      0               0               0               0                   0                   0                   0       lock    sql_cache.cc    552
update                          0.000037    0.000000    0.000000    0                    0                      0               0               0               0                   0                   0                   0       NULL    NULL    NULL
end                             0.000024    0.000000    0.000000    0                    0                      0               0               0               0                   0                   0                   0       mysql_insert    sql_insert.cc   1049
query end                       0.000042    0.000000    0.000000    0                    0                      0               0               0               0                   0                   0                   0       mysql_execute_command   sql_parse.cc    4434
closing tables                  0.000031    0.000000    0.001000    0                    0                      0               0               0               0                   0                   0                   0       mysql_execute_command   sql_parse.cc    4486
freeing items                   0.000126    0.000000    0.000000    0                    1                      0               0               0               0                   0                   0                   0       mysql_parse     sql_parse.cc    5634
logging slow query              0.000030    0.000000    0.000000    0                    0                      0               0               0               0                   0                   0                   0       log_slow_statement      sql_parse.cc    1460
cleaning up                     0.000024    0.000000    0.000000    0                    0                      0               0               0               0                   0                   0                   0       dispatch_command        sql_parse.cc    1416

I use MySQL 5.5.20 and have tried both InnoDB and MyISAM - both behave the same.
Here's my iostat output:

# iostat -x
Linux 3.2.12-3.2.4.amzn1.i686 05/15/2012      _i686_  (1 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          23.67    0.03   18.39    4.09   52.87    0.95

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
xvdap1            0.00     1.74    0.03    0.78     1.50    25.12    32.85     0.01   14.03   5.92   0.48
xvdap3            0.00     0.00    0.01    0.01     0.05     0.08    10.47     0.00    5.72   1.09   0.00
xvdf              0.40    18.59   23.14  117.55   753.12  3465.25    29.98     0.53    3.74   2.38  33.46

The most obvious fix is to batch the inserts and commit them all together instead of one by one. But I can't do that easily, because every insert comes from a separate POST request, i.e. a separate PHP script execution. The requests all run simultaneously and don't interfere with each other.

This seems like a pretty simple task, so what is the CPU actually working so hard on? I don't have much experience with MySQL, PHP, or Linux, so I'm probably just missing something. Thanks for any ideas!

Fedor
  • 3
    We recently switched away from MYSQL for these tasks—it just isn't efficient at that kind of volume. We're now using Splunk to parse Apache logs, but considered CouchDB, Cassandra, and other key/value solutions as alternatives. – Wandering Digital May 04 '12 at 18:59
  • I agree a NoSQL approach could be faster - no transaction overhead, etc. Another option is just to store requests in a plain log file for later processing, like http://stackoverflow.com/questions/4939382. I'm considering all these options, but I'm still curious whether my current architecture can survive for a while. I would prefer a quick fix now and to implement a new architecture later, when I have more time for it. – Fedor May 04 '12 at 19:10
  • 1
    The "quick fix" can be switching to a larger EC2 instance. – BluesRockAddict May 04 '12 at 19:18
  • 3
    You don't need a blob field for 3K. If you have MySQL 5.0.3 or higher, you can use VARCHAR(5000) or something like that. It should be some improvement, because MySQL can then store the data in the row with the rest of the data. – Marcus Adams May 04 '12 at 19:26
  • I said most are 3K but not all. Some may be 9K, some may be more. I need to be flexible to store them all. Can't limit to 5000. – Fedor May 04 '12 at 19:31
  • 2
    Set it to VARCHAR(9000) then. The limit for VARCHAR is 65,535, but you'll reach the row limit (65,535) before then. MEDIUMTEXT has a limit of 65,535 also. – Marcus Adams May 04 '12 at 19:34
  • 2
    Also please fix your SQL injection vulnerability: http://stackoverflow.com/questions/60174/best-way-to-stop-sql-injection-in-php – John Carter May 05 '12 at 11:13
  • 1
    Can you please try the codes I posted in my answer? I am eager to see the results – Your Common Sense May 05 '12 at 11:25
  • I tried VARCHAR(21000) instead of MEDIUMTEXT, still the same. I used MEDIUMTEXT initially because I have international characters, so I need UTF-8. VARCHAR allows only about 21,000 UTF-8 chars (65,000 chars only for ASCII). That's not enough for me; I have 25K-char items and probably more. – Fedor May 05 '12 at 12:05
  • 1
    SHOW PROCESSLIST is okay - no delayed queries. Well, I have to admit that I can't say anything useful. Just one more guess - high CPU load is caused by high disk I/O. – Your Common Sense May 05 '12 at 17:43
  • 5
    Half your CPU time is being eaten up by nginx and PHP. All those threads, and MySQL, are competing for a single core of a single CPU. Get a faster server, or separate the web and the database servers! Look at the sales/budget servers at dedicated hosts, pretty much any old Xeon/Opteron is going to be faster than one virtual CPU at Amazon. – Dan Grossman May 06 '12 at 07:27
  • 1
    Out of curiosity, what is your target? 300 Posts per second doesn't seem unreasonable to me. – Conrad Frix May 11 '12 at 15:05
  • @Conrad I get 500 per second now, but my server is only able to accept 300; everything else is lost. I want it to accept at least 500. I have made some tests and can already see that it easily accepts 500 if I don't use PHP+MySQL but write to a file directly from nginx. So I'm probably going to redesign my architecture and implement a new approach. – Fedor May 11 '12 at 17:16
  • What version of MySQL and what storage engine are you using? Also, what does "iostat" report? – Eric Cope May 15 '12 at 05:37
  • The storage engine plays a very big role in write speed. Also check whether your POST bodies are bigger than post_max_size; if they are, you might have to increase post_max_size. And you might consider a threaded approach to this (PHP isn't threaded, Java is); it might help. – Flakron Bytyqi May 15 '12 at 08:51
  • @EricCope I've added answers to the question itself. – Fedor May 15 '12 at 10:04

11 Answers

6

By "later" processing, do you possibly mean 1 hour or 1 day later? If that is the case, then I would write the information to a CSV file which you rotate out once an hour or so, and then when you need to do your "later" processing you can load the files into MySQL using LOAD DATA INFILE:

http://dev.mysql.com/doc/refman/5.1/en/load-data.html

I have had LOAD DATA INFILE load hundreds of MB of information in less than a minute; this method would be a great way to speed up your web response.
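
A rough sketch of the two halves, assuming a tab-separated spool file whose name, location and cleanup are my own choices rather than anything from the question:

<?php
    // Hypothetical logging side: append one tab-separated line per POST.
    // FILE_APPEND | LOCK_EX serializes concurrent appends from many PHP processes.
    $file = '/var/spool/rawreports/' . date('Y-m-d-H') . '.csv';   // a new file every hour
    $line = date('Y-m-d H:i:s') . "\t"
          . str_replace(array("\t", "\r", "\n"), ' ', $_REQUEST['data']) . "\t"
          . $_REQUEST['project_id'] . "\n";
    file_put_contents($file, $line, FILE_APPEND | LOCK_EX);
?>

and the hourly import, run from cron against a file that is no longer being written to:

LOAD DATA INFILE '/var/spool/rawreports/2012-05-15-17.csv'
INTO TABLE rawreports
FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n'
(date, data, project_id);

Flattening tabs and newlines in the payload is a crude way to keep one record per line; depending on the data you may need proper CSV quoting, and LOAD DATA INFILE needs the file to be readable by the MySQL server (or use LOAD DATA LOCAL INFILE).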

Michael Blood
  • So, as I was not able to make MySQL work faster, I'm going to use the raw-file approach. That's why I'm awarding this answer. – Fedor May 16 '12 at 12:15
2

Are you running that PHP code in a loop? You can insert more than one row at a time in a single query, which may help lower the CPU load. All you need to do is provide a comma-separated list of value tuples, something like this:

insert into rawreports(date, data, project_id) values ('$date1', '$data1', '$project_id1'), ('$date2', '$data2', '$project_id2'), ...

Also, if you are running in a loop, you don't need to repeat new mysqli() on every iteration.
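
If you do end up with a batch of rows collected somewhere (a queue, a spool file), a minimal sketch of building such a multi-row statement could look like this; $rows and its layout are purely illustrative:

<?php
    // $rows is assumed to be an array of array($date, $data, $project_id) triples
    // gathered by some other mechanism (queue, spool file, ...).
    $mysqli = new mysqli("localhost", "root", "", "errorreportsraw");
    $values = array();
    foreach ($rows as $r) {
        $values[] = "('" . $mysqli->real_escape_string($r[0]) . "', '"
                         . $mysqli->real_escape_string($r[1]) . "', '"
                         . $mysqli->real_escape_string($r[2]) . "')";
    }
    if ($values) {
        // one round trip and one statement instead of hundreds
        $mysqli->query("insert into rawreports(date, data, project_id) values " . implode(', ', $values));
    }
?>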

Brian Glaz
  • I'm not in a loop. 300 POST requests means 300 separate PHP script executions. A bulk insert would definitely be faster, but I don't know how I can do it: all requests are executed separately, which means I have to accumulate the data somewhere for a while and then insert it in a batch. Where can it be accumulated? – Fedor May 04 '12 at 19:14
  • is there a reason for doing 300 POST requests? You can't do one POST request with a comma separated list of id's, for example? – Brian Glaz May 04 '12 at 19:16
  • They all come from thousands of different IPs - they are crash reports. So you mean I should batch the inserts. I need some in-memory cache, probably a message queue or something like that. It means MySQL can't just do the job; some architecture changes are still required. I still can't believe that! – Fedor May 04 '12 at 19:20
  • 1
    Try storing each request in memcache / APC / xcache, and every minute (or so) process all stored requests and insert them into the database. – Sempa May 10 '12 at 08:02
2

no indexes,

That is not quite true: your PRIMARY KEY is an index.
I'd remove that index in the first place.
In a log-type table it doesn't make any sense.

By the way, you can use textual logs as well for the later processing.

To get some detailed info, you can run these commands from the mysql console while your server is under its usual load:

> SET profiling = 1;
> INSERT an example query
> SHOW PROFILE ALL

Also try the humble

SHOW PROCESSLIST;

which probably won't reveal anything useful, but is at least worth trying.

Your Common Sense
  • Remove the primary key? Hmm... I won't be able to query specific items from the database, e.g. SELECT * FROM rawreports WHERE id>@last_processed_id. I probably won't be able to use InnoDB and will have to use MyISAM. Am I correct? – Fedor May 05 '12 at 12:11
  • 1
    Heck, you're using bigint, so this is one huge table storing all the requests for all time, right? I think that's the source of the whole problem. Why don't you use a temporary table that is switched every day and cleaned after all records have been processed? Anyway, please profile your queries first. – Your Common Sense May 05 '12 at 12:17
  • Being temporary storage for the delayed processing, this table is not intended to be queried for specific items. If you still want such a general storage (and the processed one doesn't fit), you can have an archive table as well, filled by the processing script *in batches* – Your Common Sense May 05 '12 at 12:19
  • I have just tried a MyISAM table without a primary key. Still the same, which means the primary key is not the reason for the high CPU load. Table size doesn't seem to be the reason either: if I create a new empty table and start using it, nothing changes; the CPU is overloaded immediately. I also tried profiling before and saw nothing suspicious there. Will try again and post the results. – Fedor May 05 '12 at 12:33
  • What about SHOW PROCESSLIST? In my experience it should list many processes in case of heavy CPU load. – Your Common Sense May 05 '12 at 14:10
2

While I agree that using a plain text log file or a NoSQL database would be a better option, if you're going to use MySQL, I think your bottleneck is PHP. Are you using persistent connections and prepared statements? If not, that's a big time waster.

You might try connecting nginx directly to MySQL using the hsock module.

Also note that an EC2 small instance does not have good I/O performance. You need to upgrade to a large instance to get the kind of I/O performance needed to support 300 POSTs per second.
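
For reference, a minimal sketch of the persistent-connection plus prepared-statement variant with mysqli (the "p:" prefix needs PHP 5.3+); as a side effect it also closes the SQL injection hole mentioned in the comments:

<?php
    $mysqli = new mysqli("p:localhost", "root", "", "errorreportsraw"); // persistent connection
    $stmt = $mysqli->prepare("insert into rawreports(date, data, project_id) values (?, ?, ?)");
    $date = date("Y-m-d H:i:s");
    $data = $_REQUEST["data"];
    $project_id = $_REQUEST["project_id"];
    $stmt->bind_param("sss", $date, $data, $project_id); // values travel over the binary protocol, no quoting needed
    $stmt->execute();
    $stmt->close();
?>

Since every POST is a fresh PHP process, the statement is still prepared once per insert, so expect the win to come mostly from the persistent connection and the removed escaping/parsing work rather than a dramatic CPU drop.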

Old Pro
  • Yes, he/she tried mysql_pconnect, which means a persistent connection, but I'm not seeing prepared statements. +1 for that. – Sliq May 10 '12 at 18:07
  • If you look at the MySQL profile you can see it's MySQL, not any other process. – Eric Cope May 15 '12 at 13:57
2

Don't insert the data directly into MySQL. Instead, create two CSV files, say even_rawreports.csv and odd_rawreports.csv. During even hours (say, between 2:00 and 2:59) log every POST request to even_rawreports.csv, and during odd hours log to odd_rawreports.csv.

Write a cron job that runs hourly and reads even_rawreports.csv during odd hours and odd_rawreports.csv during even hours.

In the cron job, use the following statement to load the data into MySQL directly from the CSV file in one query:

LOAD DATA INFILE 'even_rawreports.csv' INTO TABLE rawreports (col1,col2,...)
    FIELDS TERMINATED BY '|' OPTIONALLY ENCLOSED BY '"';
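
The logging side could be as small as the sketch below; LOCK_EX addresses the concurrent-append concern, and the '|' / '"' formatting matches the LOAD DATA statement above (file names and escaping are illustrative only):

<?php
    $hour = (int) date('G');
    $file = ($hour % 2 === 0) ? 'even_rawreports.csv' : 'odd_rawreports.csv';
    $line = date('Y-m-d H:i:s')
          . '|"' . str_replace('"', '""', $_REQUEST['data']) . '"'
          . '|"' . str_replace('"', '""', $_REQUEST['project_id']) . '"' . "\n";
    file_put_contents($file, $line, FILE_APPEND | LOCK_EX);  // append atomically under an exclusive lock
?>

Backslashes and embedded newlines in the JSON payload would need a bit more care (ESCAPED BY / ENCLOSED BY handling) than this sketch shows.
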
piyush
  • Yes, that's very close to what I want to do. But I'm afraid I can't just write to a file, because hundreds of requests will be writing to the same file simultaneously. Don't I need some kind of synchronization between them all? – Fedor May 11 '12 at 06:48
  • I will try to find a solution for concurrent writes to a file. Meanwhile you can have a look at http://dev.mysql.com/doc/refman/5.1/en/insert-speed.html to optimize your insert queries. – piyush May 11 '12 at 06:58
2

From all the answers I can see that you have the following choices for dealing with the problem:

1) Amazon EC2, or any cloud server, will have lower write speed than a regular dedicated server, because disk IOPS are shared between all the virtual servers hosted on a physical node. So having a separate machine for the DB is a good idea, and you could even give Amazon SimpleDB a try. It should be worth it.

2) You can try using the simplest document-based NoSQL DB, such as MongoDB, which is much faster at write operations than MySQL. I have benchmarked that on my local machine and found surprising results. So instead of the CSV approach you could really store your data in a NoSQL DB and, if required, later push it to MySQL for any relational queries using a batch job (a minimal sketch follows at the end of this answer).

3) Use a PHP opcode cache/accelerator if you're not already using one.

But I think MySQL bogs down, and drives up CPU usage, because of disk write speed...

See if below article can be of some help to you: http://kevin.vanzonneveld.net/techblog/article/improve_mysql_insert_performance/
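
If you want to experiment with option 2, here is a minimal sketch using the legacy PHP "mongo" extension (the driver that was current at the time); database and field names are just carried over from the question:

<?php
    $m = new Mongo();                                // connects to localhost:27017 by default
    $collection = $m->errorreportsraw->rawreports;   // database->collection, created on first insert
    $collection->insert(array(
        'date'       => new MongoDate(),
        'data'       => $_REQUEST['data'],
        'project_id' => $_REQUEST['project_id'],
    ));
?>

Keep in mind that older driver versions did not wait for write acknowledgement by default, which is part of why raw insert benchmarks against MongoDB tend to look so good.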

deej
1

Did you try INSERT DELAYED?

I read that you have 300 POSTs/sec from different clients, but you can still bundle up the data and insert several rows at a time.

Separate the logic that collects posts from the logic that inserts rows. So you need two scripts: the first one just collects all the POSTs and saves the data to a file, for example; the second one runs periodically, once per second, and inserts all of the accumulated data into the table.

BTW: you can use NOW() to insert current date -> insert into rawreports(date, data, project_id) values (NOW(),
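
For what it's worth, the DELAYED form would look like the statement below (the data value is just a placeholder). Note that in MySQL 5.5 INSERT DELAYED works only with MyISAM, MEMORY, ARCHIVE and BLACKHOLE tables, not InnoDB:

INSERT DELAYED INTO rawreports (date, data, project_id)
VALUES (NOW(), '[3000-chars-data-here]', '5');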

Index
1

Introducing another database / binary file to support MySQL seems like a last resort option. Before you get there, I would like you to consider this:

Create a table using the MEMORY storage engine, which you flush into your main table at regular intervals.
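
A minimal sketch of that idea, with table name, column size and flush interval of my own choosing. The MEMORY engine cannot store TEXT/BLOB columns and keeps VARCHARs at their full declared length in RAM, so the data column has to be shrunk to something that actually fits:

CREATE TABLE rawreports_buffer (
  `date` datetime NOT NULL,
  `data` varchar(9000) NOT NULL,
  `project_id` varchar(100) NOT NULL
) ENGINE=MEMORY;

-- flush job, run every minute or so from cron:
LOCK TABLES rawreports WRITE, rawreports_buffer WRITE;
INSERT INTO rawreports (date, data, project_id)
  SELECT date, data, project_id FROM rawreports_buffer;
DELETE FROM rawreports_buffer;
UNLOCK TABLES;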

Ja͢ck
  • You're right about the last resort, and that does seem to be the case. I have doubts about the MEMORY storage engine, though: memory will fill up too fast with my data volumes, so I'd have to flush often, and if I don't flush at the right moment for some reason I will lose a significant amount of data. MEMORY would probably be much faster, but it seems too limiting for me. Anyway, thanks for the fresh idea! – Fedor May 16 '12 at 11:53
0

You could try PDO with a prepared statement to take advantage of binary communication between client and server.

Simply change the parameters each time and execute it.

Besides the lower overhead in transmitting to the server, this might help with CPU contention because MySQL wouldn't have to parse the query each time.

You would have to batch (queue) the data to take advantage of this.
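
A per-request sketch of the PDO variant (connection details are assumptions; with emulation turned off, MySQL receives the values through the binary protocol instead of parsing them out of the SQL text):

<?php
    $pdo = new PDO('mysql:host=localhost;dbname=errorreportsraw', 'root', '', array(
        PDO::ATTR_EMULATE_PREPARES => false,   // ask for a real server-side prepare
        PDO::ATTR_PERSISTENT       => true,    // reuse the connection between requests
    ));
    $stmt = $pdo->prepare('insert into rawreports(date, data, project_id) values (?, ?, ?)');
    $stmt->execute(array(date('Y-m-d H:i:s'), $_REQUEST['data'], $_REQUEST['project_id']));
?>

As the answer says, the real gain appears when many rows are executed against one prepared statement inside a single process (a queue worker), not one statement per web request.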

Marcus Adams
  • It's not a loop. Every request is executed separately. I don't know how I can store a prepared statement to be reused by hundreds of requests executed simultaneously. – Fedor May 04 '12 at 19:16
  • If you can't reduce the work load, then you'll have to improve the hardware. If CPU were your only concern, you could throttle the number of concurrent users, but then they'd have to wait. – Marcus Adams May 04 '12 at 19:32
  • So you mean 300 per second is a real "workload"? If it is, then you're right: we need different hardware or a different architecture. – Fedor May 04 '12 at 19:35
  • 1
    Unfortunately, I don't see any improvement to be made in your code or your query (besides moving to VARCHAR). It would have to be an architecture change (bulk inserts or prepared statements and a queue) or a hardware one. MySQL connection pooling would help too. – Marcus Adams May 04 '12 at 19:39
  • I see. What is the simplest way to implement such a queue that you would recommend? Can it be done inside PHP, or do I need another component for that? – Fedor May 04 '12 at 19:47
  • 1
    You could log the posts to timestamped files using PHP's binary file commands. Then regularly run a php script from cron to process these. – Marcus Adams May 04 '12 at 19:58
0

Based on iostat and the fact that you're on 5.5, I think you are limited by the CPU and disk I/O of your Amazon instance. Can you temporarily shift to a faster instance and measure the results? There is not much room for improvement otherwise. The 75 ms update phase in the MySQL profile shows it's MySQL killing you, not PHP. I'm surprised 300 is the limit too. I'm not sure of your data integrity requirements, but MongoDB may work well for you, and it even has a supported PHP library.

Eric Cope
0

Have you tried using the ARCHIVE storage engine? If you do not need UPDATE/DELETE/REPLACE, you should try that option. Please see http://dev.mysql.com/doc/refman/5.0/en/archive-storage-engine.html for more details.
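
A sketch of what that would look like for this table (the name is my own; AUTO_INCREMENT support in ARCHIVE needs MySQL 5.1 or later). ARCHIVE only supports INSERT and SELECT, and the only index it allows is on the AUTO_INCREMENT column:

CREATE TABLE `rawreports_archive` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `date` datetime NOT NULL,
  `data` mediumtext NOT NULL,
  `project_id` varchar(100) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=ARCHIVE;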

krishnakumarp
  • It uses zlib compression, which means additional processing on live data - not good for my already choking CPU. Anyway, thanks for the idea! – Fedor May 16 '12 at 12:03