0

I have tried to dump a database db1 of about 40gb into sql file using mysqldump from system A with innodb default storage engine and tried to restore it on another system B. Both have the default storage engine as innodb and same mysql version . I have checked for any table corruptions on system A using check table status and was not able to find any table corruptions on it. I have used the below query to calculate the table size and no of rows per table on both databases (db1) over system A and system B and found that there was about 6GB data loss on db1 of system B.

SELECT table_schema,
    -> SUM(data_length+index_length)/1024/1024 AS total_mb,
    -> SUM(data_length)/1024/1024 AS data_mb,
    -> SUM(index_length)/1024/1024 AS index_mb,
    -> COUNT(*) AS tables,
    -> CURDATE() AS today
    -> FROM information_schema.tables
    -> GROUP BY table_schema
    -> ORDER BY 2 DESC

Can we rely on information schema for calculating the exact no of rows, exact tablesize (datalength + indexlength) when Innodb is default storage engine ? Why a dump using mysql dump has resulted in significant data loss on restoration over system B ?

Sathish D
  • 4,854
  • 31
  • 44

2 Answers2

1

InnoDB isn't able to give a exact count (using a SELECT COUNT() query) of records found in a table. When you request a record count on a table with the InnoDB engine, you will notice that the count will flucturate.

For more information I would like to refer you to the MySQL developer page for InnoDB http://dev.mysql.com/doc/refman/5.0/en/innodb-restrictions.html

Restrictions on InnoDB Tables

ANALYZE TABLE determines index cardinality (as displayed in the Cardinality column of SHOW INDEX output) by doing eight random dives to each of the index trees and updating index cardinality estimates accordingly. Because these are only estimates, repeated runs of ANALYZE TABLE may produce different numbers. This makes ANALYZE TABLE fast on InnoDB tables but not 100% accurate because it does not take all rows into account.

MySQL uses index cardinality estimates only in join optimization. If some join is not optimized in the right way, you can try using ANALYZE TABLE. In the few cases that ANALYZE TABLE does not produce values good enough for your particular tables, you can use FORCE INDEX with your queries to force the use of a particular index, or set the max_seeks_for_key system variable to ensure that MySQL prefers index lookups over table scans. See Section 5.1.4, “Server System Variables”, and Section C.5.6, “Optimizer-Related Issues”.

SHOW TABLE STATUS does not give accurate statistics on InnoDB tables, except for the physical size reserved by the table. The row count is only a rough estimate used in SQL optimization.

InnoDB does not keep an internal count of rows in a table because concurrent transactions might “see” different numbers of rows at the same time. To process a SELECT COUNT(*) FROM t statement, InnoDB scans an index of the table, which takes some time if the index is not entirely in the buffer pool. If your table does not change often, using the MySQL query cache is a good solution. To get a fast count, you have to use a counter table you create yourself and let your application update it according to the inserts and deletes it does. If an approximate row count is sufficient, SHOW TABLE STATUS can be used. See Section 14.2.12.1, “InnoDB Performance Tuning Tips”.

Community
  • 1
  • 1
Bearwulf
  • 353
  • 1
  • 8
  • Your statement "InnoDB isn't able to give a exact count of records found in a table" is wrong. It is just the wrong way to get the number of rows by looking at the `information_schema` – stollr Apr 01 '14 at 14:13
  • Your right at that, but I refered to the count used in the example. It's is also the reason why I provided the additional information about the limitations of InnoDB. – Bearwulf Apr 01 '14 at 14:18
  • @naitsirch I have updated my answer to clarify the count I am referring to. – Bearwulf Apr 01 '14 at 14:21
  • @Bearwulf Thanks a lot Can we rely on the data_length,index_length reported from the information schema for Innodb ? Is it an exact figure ? if not how can i get the exact table size for every table ? – user3393007 Apr 01 '14 at 14:24
  • @user3393007 When SHOW TABLE STATUS is used, InnoDB will provide the fysical space used by the table. You could use this to determine you overall database size. – Bearwulf Apr 01 '14 at 15:08
0

The best solution to check if you have any data loss, is to compare the contents of your database.

mysqldump --skip-comments --skip-extended-insert -u root -p dbName1 > file1.sql
mysqldump --skip-comments --skip-extended-insert -u root -p dbName2 > file2.sql
diff file1.sql file2.sql

See this topic for more information.

Another advantage of this solution is that you can see where you have the differences.

Community
  • 1
  • 1
stollr
  • 6,534
  • 4
  • 43
  • 59