10

I have a table in MySql server and the table contains around 1M rows. Only because of one column table is taking more disk space day by day. The datatype of this column is Mediumblob. Table size is around 90 GB.

After each row insertion, I do some processing then after I don't really require this column.

So for this column, if I set the value to NULL after processing the row, does MySql utilizes this empty space for next row insertion or not?

MySql Server details

Server version: 5.7

Engine: InnoDB

Hosting: Google Cloud Sql

EDIT 1: I deleted 90% of rows from table then I ran OPTIMIZE TABLE table_name but it has reduced only 4GB of disk space and it is not reclaiming the free disk space.

EDIT 2 I even deleted my database and created new DB and table but MySql server still showing 80GB disk space. Sizes of all databases of MySQL server

SELECT table_schema "database name",
         sum( data_length + index_length ) / 1024 / 1024 "database size in MB",
         sum( data_free )/ 1024 / 1024 "free space in MB"
     FROM information_schema.TABLES
     GROUP BY table_schema; 

+--------------------+---------------------+------------------+
| database name      | database size in MB | free space in MB |
+--------------------+---------------------+------------------+
| information_schema |          0.15625000 |      80.00000000 |
| app_service        |         15.54687500 |       4.00000000 |
| mysql              |          6.76713467 |       2.00000000 |
| performance_schema |          0.00000000 |       0.00000000 |
| sys                |          0.01562500 |       0.00000000 |
+--------------------+---------------------+------------------+

Thanks

Rams
  • 2,141
  • 5
  • 33
  • 59

1 Answers1

5

Edit: It turns out from comments below that the user's binary logs are the culprit. It makes sense that the binary logs would be large after a lot of DELETEs, and assuming that the MySQL instance is using row-based replication.


The answer is complex.

You can save space by using NULL instead of real values. InnoDB uses only 1 bit per column per row to indicate that the value is NULL (see my old answer to https://stackoverflow.com/a/230923/20860) for details.

But this will just make space in the page where that row was stored. Each page must store only rows from the same table. So if you set a bunch of them NULL, you make space in that page, which can be used for subsequent inserts for that table only. It won't use the gaps for rows that belong to other tables.

And it still may not be reused for any rows of your mediumblob table, because InnoDB stores rows in primary key order. The pages for a given table don't have to be consecutive, but I would guess the rows within a page may be. In other words, you might not be able to insert rows in primary key random order within a page.

I don't know this detail for certain, you'd have to read Jeremey Cole's research on InnoDB storage to know the answer. Here's an excerpt:

The actual on-disk format of user records will be described in a future post, as it is fairly complex and will require a lengthy explanation itself.

User records are added to the page body in the order they are inserted (and may take existing free space from previously deleted records), and are singly-linked in ascending order by key using the “next record” pointers in each record header.

It's still not quite clear whether rows can be inserted out of order, and reuse space on a page.

So it's possible you'll only accomplish fragmenting your pages badly, and new rows with high primary key values will be added to other pages anyway.

You can do a better effort of reclaiming the space if you use OPTIMIZE TABLE from time to time, which will effectively rewrite the whole table into new pages. This might re-pack the rows, fitting more rows into each page if you've changed values to NULL.

It would be more effective to DELETE rows you don't need, and then OPTIMIZE TABLE. This will eliminate whole pages, instead of leaving them fragmented.

Community
  • 1
  • 1
Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • 2
    I did optimize table but it still not reclaiming the free disk space. I have updated my question please check. – Rams Jan 09 '18 at 09:35
  • 1
    Please check `SELECT @@innodb_file_per_table;` If it is 0, you won't be able to reduce disk usage. But you will be able to reuse the space you free up by doing delete and drop, before it grows the space any further. – Bill Karwin Jan 09 '18 at 16:48
  • file_per_table is 1 , Looks like the problem is with binary logs. SHOW BINARY LOGS returned 712 rows, cumulatively it contains more than 75 GB. – Rams Jan 09 '18 at 17:16
  • Oh yeah, that would do it. You should be able to use [PURGE BINARY LOGS](https://dev.mysql.com/doc/refman/5.7/en/purge-binary-logs.html) to get rid of any you don't need. Though I haven't used Google Cloud, so I don't know if they give you privilege to run that command, or if you have to do it through the GUI. – Bill Karwin Jan 09 '18 at 17:41
  • I will check and update you. Is it safe to apply OPTIMIZE TABLE command on production table? (5.7 MySql), for more details can you check this https://stackoverflow.com/questions/48165511/optimize-mysql-5-7-table-without-downtime – Rams Jan 09 '18 at 17:45
  • I have posted an answer on that question. – Bill Karwin Jan 09 '18 at 17:52
  • Thank you for the answer – Rams Jan 10 '18 at 09:42