
I was doing an INSERT IGNORE INTO db1.Table (~140GB) SELECT * FROM db2.Table (~32GB). The process took so long that I killed the query, then restarted the server. On restart, mysqld starts first and does a recovery process.

My tables are fine, and the particular tables are not critical.

How do I stop the recovery process? It's been running for 1.5 days now. Some error.log messages:

How do I kill/flush page_cleaner, or whatever it is that is doing the recovery? I looked up what page_cleaner does but didn't quite understand it. I don't have any standing issues with the way the server has been running, so ideally I'd rather not change any settings.

Thanks for your help!

@Wilson Hauck The HDD is a traditional spinning disk. MySQL version is 5.7.27-0ubuntu0.16.04.1.

SHOW FULL PROCESSLIST shows no running queries.

mysql> SHOW GLOBAL VARIABLES LIKE '%size%';

mysql> SHOW GLOBAL VARIABLES like '%capacity%' ;
+------------------------+-------+
| Variable_name          | Value |
+------------------------+-------+
| innodb_io_capacity     | 200   |
| innodb_io_capacity_max | 2000  |
+------------------------+-------+
2 rows in set (0.00 sec)

I can't find it in the current error.log, but previously I had seen the size to be ~1045MB.

Results of top:

Edit: cleaned up the question for readability, with help from pastebin. Nice site.

A bit of background on what I am doing:

It's not just lack of RAM; my processor is an i3 at 3.7GHz x2. I can't go for something better at this point.

I record financial market derivative data. My recorded stream data is approximately 7GB raw by the end of the day.

I write the data to blank tables and move the tables to another DB at end of day, so I can have blank tables in the morning. I manage my queries with UNION when I need to. My advantage is that my data is most relevant closer to the current date. My algos can't wait for MySQL queries and need to have the info already there, so when I need historical data, e.g. 30/60/90-day averages, etc., I have summary tables for that. When I need continuous data, which is usually within 3 days, I keep 5 days of data separated.

My current setup works for me. By the end of the market day, my MySQL RAM consumption is about 6GB if I haven't had to look at historical data much, which is usually the case. Sometimes there is a lag, but it is later in the day and the market has already taken a shape, sort of.

The problem is page_cleaner. It still hasn't finished.

Additional info requested:

SHOW GLOBAL STATUS;

SHOW GLOBAL VARIABLES;

SHOW FULL PROCESSLIST; Since it is currently market time, there is no point showing the full processlist, as it will have tons of long INSERT SQLs.

Complete MySQLTuner report: I don't have MySQLTuner.

SHOW ENGINE INNODB STATUS;

ulimit -a

iostat

I came across this answer while looking for 'how to change innodb_log_file_size'. Will the suggested removal of the /var/lib/mysql/ib_logfile files work for me?

Update

The server crashed.

I hadn't heard back from you guys; I saw an update for MySQL and installed it. I had rebooted and stopped/started before, and didn't think applying the update would hurt anything. I was actually hoping it would solve the problem.

I was up all night trying to fix it, to no avail. Whenever I tried to reinstall MySQL, mysql-server would exit with an error, but mysqld was working; the error.log would grow to 16-20GB. That alone brings my machine down to a crawl. I think if I had waited long enough, it may have been able to repair itself, but I couldn't wait any longer. My dev work is at a standstill.

Once I realized that the tables weren't going to magically reappear, I installed MariaDB; I'd been meaning to try it for a while.

I wish there was a way to isolate such problems. One table brought the server down.

Thanks a ton for your time and help. I did learn a lot.

csaw
  • Your system appears to be stable at this time. Additional information request. Post on pastebin.com and share the links. From SSH login root, Text results of: B) SHOW GLOBAL STATUS; after minimum 24 hours UPTIME C) SHOW GLOBAL VARIABLES; D) SHOW FULL PROCESSLIST; complete, please E) complete MySQLTuner report F) SHOW ENGINE INNODB STATUS; AND Optional very helpful information, if available includes - ulimit -a for a linux/unix list of limits, iostat -xm 5 3 for IOPS by device and core/cpu count, for server workload tuning analysis to provide suggestions. – Wilson Hauck Nov 17 '19 at 18:32
  • Thank you for posting most of the requested info. Page cleaners will continue to show up in the error log until we get some additional adjustments made in your config. MySQLTuner.com has a Perl script available for download that would give you/us another tuning/monitoring tool. The report will include engine types, number of tables and size of data storage used. Your analysis is in process with available data. – Wilson Hauck Nov 18 '19 at 14:37
  • Please post to pastebin.com TEXT results of mpstat -P ALL 5 3 so we can see how many processors are used. Thanks – Wilson Hauck Nov 18 '19 at 15:54
  • What country and time zone is your location? – Wilson Hauck Nov 19 '19 at 13:51
  • @WilsonHauck am in Indian TZ. been restoring data, getting my scripts up. Finding errors, updating them. union qeries are giving problems as some tables haven't been loaded yet. found my raw sql data file size is almost the same as table size, which made me look at the data types i had choosen. since i learned C first i had a propensity to use floats and doubles. Have to update them. – csaw Nov 21 '19 at 09:06
  • I would really really appreciate your help in all points you have mentioned, hyper-thread and my.cnf setting. These setting you have mentioned need extensive knowledge and experience. Whenever I looked them up, everything was 'it depends' so I never messed with them. should have my tables sorted end of day friday. at which point i'm hoping to get intouch with you. Thanx! – csaw Nov 21 '19 at 09:08
  • Looking forward to your use of our website to get in touch. Our home page has a 4-minute video overview of services. Scroll down to 'Contact Us' page. I am looking forward to your contact. – Wilson Hauck Nov 21 '19 at 13:32
  • Let me know when you have time available, – Wilson Hauck Nov 22 '19 at 12:46
  • What is your Skype ID, please. – Wilson Hauck Nov 26 '19 at 17:16
  • csaw, I will be back in the office on Monday, Dec 9, 2019. – Wilson Hauck Dec 06 '19 at 08:14

3 Answers


The recovery was already completed when you saw this line in the log:

2019-11-16T03:07:35.985926Z 0 [Note] InnoDB: Apply batch completed

Then 10 seconds later, you see this in the log:

2019-11-16T03:07:45.962644Z 0 [Note] /usr/sbin/mysqld: ready for connections.

That means all the startup work was complete, and MySQL was ready for normal operations.

The lines about the page-cleaner are not part of recovery. The page-cleaner is going on all the time in InnoDB as normal everyday operation.

2019-11-16T03:15:39.226555Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4545ms. The settings might not be optimal. (flushed=147 and evicted=0, during the time.)

I posted a bunch of details about the page-cleaner in my answers here:

Your innodb_buffer_pool_size of 10GB is probably too small. Your db1.Table and db2.Table alone are 172GB, and I don't know how many other tables you have or the total size of all data + indexes.

At my job, we try to maintain a ratio of 1:10 with respect to buffer pool size and total storage size of data + indexes. The ratio of 1:10 is not any kind of magic formula, it's just the default we use to see if the ratio of buffer to storage is in a reasonable range for typical OLTP workload. You're way off from that ratio with just the first two tables you mentioned.

I also see your innodb_log_file_size is 48MB, which is the default size for this variable in MySQL 5.7. This means you probably never tuned this variable for your workload. The default of 48MB for the InnoDB log is terribly undersized for production databases that need to handle a high volume of writes. At my job, we raise this to 1GB on default instances.
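For illustration, both changes would live in the [mysqld] section of my.cnf. These values are a sketch based on the ratios above, not a prescription, and the log file size takes effect only after a clean shutdown and restart:

```ini
[mysqld]
# Sketch only -- size to your own data and physical RAM.
innodb_log_file_size    = 1G    # from the 48M default, so flushing happens less often
innodb_buffer_pool_size = 10G   # raise toward ~1/10 of data+index size once RAM is upgraded
```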

What's the consequence of the log file size being so small? InnoDB must flush pages from the buffer pool to disk if there are more dirty pages in RAM than can be recovered by the log file in cases of a crash (or a deliberate kill, as you did). This flushing is done by the page cleaner.

I think you just have a terribly slow HDD, and it's unable to keep up with the rate of writes you are trying to do.

This explains why your 32GB INSERT took a long time. It would do a small part of the INSERT, creating enough dirty pages in the buffer pool to match the size of the 48MB log file, then InnoDB would pause further INSERTs while it flushed dirty pages.

Every 1 second, the page_cleaner tries to pick some number of dirty pages to flush, enough to complete in 1 second (that's the 1000ms mentioned in the log file line about the page_cleaner). But it couldn't complete that work in 1 second, it took 4.5 seconds (4545ms).
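To put rough numbers on it (assuming the default 16 KiB InnoDB page size), even flushing at the full innodb_io_capacity of 200 pages per second is only about 3 MiB/s of random writes, and the log line above shows the cleaner falling far short of even that:

```python
# Back-of-envelope flush throughput for the page_cleaner.
# Assumes the default 16 KiB InnoDB page size.
PAGE_SIZE_KIB = 16
io_capacity = 200          # innodb_io_capacity: target page flushes per second

# Sustained flush bandwidth the page cleaner aims for:
target_mib_per_s = io_capacity * PAGE_SIZE_KIB / 1024
print(f"target flush rate: {target_mib_per_s:.1f} MiB/s")  # 3.1 MiB/s

# From the log line: flushed=147 pages took 4545 ms instead of the intended 1000 ms.
flushed, elapsed_ms = 147, 4545
actual_pages_per_s = flushed / (elapsed_ms / 1000)
print(f"actual flush rate: {actual_pages_per_s:.0f} pages/s")  # ~32 pages/s, far below 200
```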

It's doing what it can within the I/O limits of your slow HDD and the low innodb_io_capacity (which is actually sized appropriately for an HDD). Basically, you're trying to force a barrel of water through a 1/2" pipe, and you're wondering why it's taking so long!

If you have a database that's so large, and you're doing operations moving 10's of GB around at a time, you should expect that 10+ year old storage technology is going to take some time to do it.

Here's what I would do:

  1. Upgrade to direct-attached SSD or NVMe storage. Don't use HDD for a database server.

  2. Increase the innodb_buffer_pool_size to be roughly 1/10th the size of your data. I'm making an assumption that your workload is like many other apps, but of course you might need more or less buffer pool depending on your usage of data. But from top I see you only have 16GB of physical RAM on your server, so you should upgrade the RAM first (in other words, don't overallocate the buffer pool relative to your physical RAM).

  3. Increase the innodb_log_file_size to 1GB, so InnoDB doesn't have to do flushing so often.

  4. Use a tool like pt-archiver to copy large chunks of data like you were doing. Instead of copying 32GB of data in a single SQL statement, the tool I mentioned will copy small subsets of data from db2.Table, copying it to db1.Table until it's complete. That way each "chunk" of data gets committed promptly, and doesn't need to be recovered if there's a crash. InnoDB is happier when you keep transactions small and commit them promptly. The pt-archiver tool also outputs progress as it's copying, so you have an ETA for when it will complete. This tool is free, it's part of the Percona Toolkit package.

  5. Stop killing things! That doesn't make anything complete any faster.
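The chunked-copy idea in point 4 can be sketched in Python. This is a minimal illustration of what pt-archiver does far more robustly; the helper `chunk_ranges` is hypothetical, and it assumes an integer primary key `id` on db2.Table:

```python
# Sketch of chunked copying, as pt-archiver does it: instead of one 32GB
# INSERT...SELECT, copy bounded key ranges and commit each one promptly,
# so a crash loses at most one small chunk of work.

def chunk_ranges(min_id, max_id, chunk):
    """Yield (low, high] id ranges covering [min_id, max_id]."""
    low = min_id - 1
    while low < max_id:
        high = min(low + chunk, max_id)
        yield low, high
        low = high

# Each range becomes one small, promptly committed transaction:
for low, high in chunk_ranges(1, 25_000, 10_000):
    sql = (f"INSERT IGNORE INTO db1.Table "
           f"SELECT * FROM db2.Table WHERE id > {low} AND id <= {high}")
    print(sql)  # run via your client, COMMIT, then move to the next chunk
```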


Re your updated information.

You can't change the innodb log file size on a running MySQL Server, that's why it's a read-only value. You have to change it in the my.cnf file, then restart mysqld. Modern versions of MySQL are smart enough to resize the log files and then restart themselves again, so you don't need to rm or mv the files yourself as in the old answer you found.

WARNING: The log file can be resized only after a clean shutdown. If you try to remove the log files after killing mysqld, you will almost certainly corrupt your database!

You don't need to wait for the page cleaner to finish. The page cleaner is always running — that's what I was trying to tell you above. Normally it is silently running in the background. You can read about it here: https://dev.mysql.com/doc/refman/5.7/en/innodb-buffer-pool-flushing.html

The reason the page cleaner is reporting in the error log is that it can't finish its work each second. You posted iostat results, and I see you're waiting for I/O for a significant amount of time. This is probably your problem.

The rate of flushing can be reduced, though. You have already reduced innodb_lru_scan_depth as low as it will go. You can also reduce innodb_io_capacity; you have it set to 200, and its minimum value is 100. See my answer to How to solve mysql warning: "InnoDB: page_cleaner: 1000ms intended loop took XXX ms. The settings might not be optimal "? for an explanation.
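For example, easing the flush pacing at runtime might look like this (a sketch; persist it in my.cnf only if it actually helps):

```sql
-- Reduce background flush pacing at runtime (100 is the documented minimum).
SET GLOBAL innodb_io_capacity = 100;
-- To make it survive a restart, also set it under [mysqld] in my.cnf.
```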

Really, you just need a more powerful server if you are working on this much data.

Bill Karwin
  • thanks for taking to time to help. appreciate it. i tried setting innodb_log_file_size to 200mb but it says, read only value. i added some more details to the question too. – csaw Nov 18 '19 at 08:12
  • @BillKarwin Please look at csaw's iostat device names. It was my understanding sda and sdb indicated SSD storage device type. Thanks – Wilson Hauck Nov 18 '19 at 17:40
  • The device names have nothing to do with SSD. https://superuser.com/questions/558156/what-does-dev-sda-for-linux-mean – Bill Karwin Nov 18 '19 at 18:52

There were 1,053M rows involved in recovery, and they were being cleaned up. Let it run; it will get done in time.

The notes about the page cleaners taking time can be minimized in your configuration [mysqld] section with

innodb_lru_scan_depth=100  # from 1024 to conserve 90% of CPU time used by function

This could also be set at runtime with

SET GLOBAL innodb_lru_scan_depth=100;    -- after logging in as root in your MySQL client
Wilson Hauck
  • the weekend is almost over and the process is still running. i changed the scan depth as suggested, but i would really like it to be over. tomorrow again i will need to add data from a continuous stream and i was reading somewhere that concurrent read/writes plus recovery creates even more dirty pages, hence more stuff to clean up... – csaw Nov 17 '19 at 08:21
  • @csaw Is your data storage to HDD or SSD or NVME? What is result of SELECT @@version; In 20 minutes will be out of my office for 3 hours. – Wilson Hauck Nov 17 '19 at 13:47
  • @csaw Could you post results of A) SHOW GLOBAL VARIABLES LIKE '%size%'; ? and B) SHOW GLOBAL VARIABLES like '%capacity%' ? – Wilson Hauck Nov 17 '19 at 13:50
  • @csaw You would do well to not KILL anything to avoid further complications and more redo processing. You are dealing not with 1,053 rows but 1,053 Million rows and it does take a while. – Wilson Hauck Nov 17 '19 at 13:55
  • @csaw Could you post TEXT results of A) top (first page) and B) SHOW FULL PROCESSLIST; ? – Wilson Hauck Nov 17 '19 at 13:57
  • thanks for trying to help. Really appreciate your time. Posted the requested info above. Didn't kill anything, figured mysql really doesn't like kill/hard reboot, the hard way. – csaw Nov 17 '19 at 15:39
  • @csaw You are welcome, I have requested additional information (after your original question) for more thorough analysis. Bill Karwin's answer is great and thorough. More RAM will be confirmed as needed with my analysis - his other points should also be adopted when the budget permits for ongoing high performance of your app. – Wilson Hauck Nov 17 '19 at 18:37
  • i posted the info you requested. also pastebin is neat. thanx. – csaw Nov 18 '19 at 08:14
  • @csaw Thank you. – Wilson Hauck Nov 18 '19 at 15:53

Rate Per Second=RPS - Suggestions to consider for your my.cnf [mysqld] section

innodb_fast_shutdown=0  # from 1 for CLEAN stop to avoid RECOVERY on start
max_connections=32  # from 151 because max_used_connections=7 to conserve RAM
read_rnd_buffer_size=128K  # from 256K to reduce handler_read_rnd_next RPS of 565
read_buffer_size=256K  # from 128K to reduce handler_read_next RPS of 478
innodb_flushing_avg_loops=5  # from 30 to reduce loop delay and innodb_buffer_pool_pages_dirty of 4,939

Verify Hyper-Threading is OFF, including in your BIOS settings. This URL may be of value to you: https://askubuntu.com/questions/942728/disable-hyper-threading-in-ubuntu

Keep us posted, please.

Wilson Hauck