Saturday, November 14, 2015

Changes in InnoDB OPTIMIZE TABLE behavior in MySQL 5.7

The OPTIMIZE TABLE statement provides allows MySQL DBAs to reorganize physical table storage in order to achieve two main goals:
  1. Improve IO efficiency for reads/writes against the table,
  2. Reclaim unused storage space.
For InnoDB tables, OPTIMIZE is implemented via ALTER TABLE operation. This is true in all MySQL versions, however MySQL 5.7.4 introduced a change that may catch many users off guard and is therefore worth highlighting.

Theory

The InnoDB storage engine does not directly support OPTIMIZE. Instead, when a DBA invokes OPTIMIZE TABLE x, InnoDB executes ALTER TABE x FORCE. In MySQL versions prior to 5.7.4, OPTIMIZE is not an online operation i.e. the table is locked and DML operations against the table are not allowed until OPTIMIZE finishes.

MySQL 5.7.4 introduces a significant change around OPTIMIZE, namely the ALTER TABLE operation now uses online DDL i.e. we can use algorithm INPLACE instead of COPY.

This is certainly great news for DBAs who wouldn't otherwise afford to stop DMLs for as long as the OPTIMIZE may take. Online DDL is now used by default and you cannot force it to use the old-fashioned locking implementation. But why would you even want to do that?

Practice

Online OPTIMIZE is a nice thing to have but it does come at a certain price, particularly when we use it solely as a way to reclaim unused disk space. Let's have a look at some examples.

Note: InnoDB tablespace analysis was performed using tools from "innodb_ruby" project by Jeremy Cole.

Testing methodology

OPTIMIZE tests were done on MySQL 5.6.27 and 5.7.9, using a schema named "repro" and an InnoDB table of the following structure:

CREATE TABLE repro.table1 (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `i1` int(11) DEFAULT NULL,
  `i2` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `i1` (`i1`)
);

The table was filled by random-ish data (NULL for the first and rand() for the other two columns) using batch inserts (INSERT INTO table1 SELECT ... FROM table1).

Results

Results are presented below and discussed at the end of the article.

Table metadata: empty table, MySQL 5.6.27 and 5.7.9

mysql@sandbox:$ ls -l
total 128
-rw-rw---- 1 mysql mysql     65 Nov  1 21:12 db.opt
-rw-rw---- 1 mysql mysql   8608 Nov 11 21:50 table1.frm
-rw-rw---- 1 mysql mysql 114688 Nov 11 21:50 table1.ibd

mysql@sandbox:$ innodb_space -f table1.ibd space-page-type-summary
type                count       percent     description
ALLOCATED           2           28.57       Freshly allocated
INDEX               2           28.57       B+Tree index
INODE               1           14.29       File segment inode
IBUF_BITMAP         1           14.29       Insert buffer bitmap
FSP_HDR             1           14.29       File space header

Notes:
  • Minimal state of InnDB tablespace with 5 pages(112KB) total.
  • Nothing really exciting yet, both MysQL versions yield identical results.

Table metadata: 1M rows, MySQL 5.6.27 and 5.7.9 (before OPTIMIZE)

mysql@sandbox:$ ls -l
total 61460
-rw-rw---- 1 mysql mysql       65 Nov  1 21:12 db.opt
-rw-rw---- 1 mysql mysql     8608 Nov 11 21:50 table1.frm
-rw-rw---- 1 mysql mysql 62914560 Nov 11 21:53 table1.ibd

mysql@sandbox:$ innodb_space -f table1.ibd space-page-type-summary
type                count       percent     description
INDEX               3117        81.17       B+Tree index
ALLOCATED           720         18.75       Freshly allocated
INODE               1           0.03        File segment inode
IBUF_BITMAP         1           0.03        Insert buffer bitmap
FSP_HDR             1           0.03        File space header

Notes:
  • Tablespace size is 60MB,
  • Still not much to talk about, we have some data in the table. Both MySQL versions allocate pages in the same way and give us tablespaces of the same size.

Table metadata: 1M rows, MySQL 5.6.27 after OPTIMIZE

mysql@sandbox:$ ls -l
total 53268
-rw-rw---- 1 mysql mysql       65 Nov  1 21:12 db.opt
-rw-rw---- 1 mysql mysql     8608 Nov 11 21:55 table1.frm
-rw-rw---- 1 mysql mysql 54525952 Nov 11 21:55 table1.ibd

mysql@sandbox:$ innodb_space -f table1.ibd space-page-type-summary
type                count       percent     description
INDEX               3116        93.63       B+Tree index
ALLOCATED           209         6.28        Freshly allocated
INODE               1           0.03        File segment inode
IBUF_BITMAP         1           0.03        Insert buffer bitmap
FSP_HDR             1           0.03        File space header

Notes:
  • Tablespace size is 52MB,
  • Even though the data set was created using batch inserts, OPTIMIZE managed to reduce the storage footprint of our table.

Table metadata: 1M rows, MySQL 5.7.9 after OPTIMIZE

mysql@sandbox:$ ls -l
total 65556
-rw-r----- 1 mysql mysql       65 Nov  1 21:12 db.opt
-rw-r----- 1 mysql mysql     8608 Nov 11 21:55 table1.frm
-rw-r----- 1 mysql mysql 67108864 Nov 11 21:55 table1.ibd

mysql@sandbox:$ innodb_space -f table1.ibd space-page-type-summary
type                count       percent     description
INDEX               3116        76.07       B+Tree index
ALLOCATED           977         23.85       Freshly allocated
INODE               1           0.02        File segment inode
IBUF_BITMAP         1           0.02        Insert buffer bitmap
FSP_HDR             1           0.02        File space header

Notes:
  • Tablespace size is 64MB, which is 4MB more than before optimization,
  • The difference comes from the fact that OPTIMIZE resulted in additional 257 (720 before vs 977 after) free pages allocated for future use.

Table metadata: 4M rows, MySQL 5.6.27 after OPTIMIZE

mysql@sandbox:$ ls -l
total 204820
-rw-rw---- 1 mysql mysql        65 Nov  1 21:12 db.opt
-rw-rw---- 1 mysql mysql      8608 Nov 11 22:00 table1.frm
-rw-rw---- 1 mysql mysql 209715200 Nov 11 22:00 table1.ibd

mysql@sandbox:$ innodb_space -f table1.ibd space-page-type-summary
type                count       percent     description
INDEX               12447       97.24       B+Tree index
ALLOCATED           350         2.73        Freshly allocated
INODE               1           0.01        File segment inode
IBUF_BITMAP         1           0.01        Insert buffer bitmap
FSP_HDR             1           0.01        File space header

Notes:
  • Tablespace size is 200MB,
  • The number of pages allocated for future use is relatively low. Note that above certain threshold MySQL allocates space in whole extents (not individual pages) so having hundreds of free pages is all by design.

Table metadata: 4M rows, MySQL 5.7.9 after OPTIMIZE

mysql@sandbox:$ ls -l
total 237588
-rw-r----- 1 mysql mysql        65 Nov  1 21:12 db.opt
-rw-r----- 1 mysql mysql      8608 Nov 11 21:59 table1.frm
-rw-r----- 1 mysql mysql 243269632 Nov 11 22:00 table1.ibd

mysql@sandbox:$ innodb_space -f table1.ibd space-page-type-summary
type                count       percent     description
INDEX               12446       83.82       B+Tree index
ALLOCATED           2399        16.16       Freshly allocated
INODE               1           0.01        File segment inode
IBUF_BITMAP         1           0.01        Insert buffer bitmap
FSP_HDR             1           0.01        File space header

Notes:
  • Tablespace size is 232MB,
  • The number of pages actually used for storing data ("B+Tree index" above) is pretty much the same as in MySQL 5.6,
  • We again have tons of free pages allocated for future use. This time our table in 5.7 is 32MB (~16%) bigger than in 5.6.

Table metadata: 16M rows, MySQL 5.6.27 after OPTIMIZE

mysql@sandbox:$ ls -l
total 802836
-rw-rw---- 1 mysql mysql        65 Nov  1 21:12 db.opt
-rw-rw---- 1 mysql mysql      8608 Nov 11 22:06 table1.frm
-rw-rw---- 1 mysql mysql 822083584 Nov 11 22:08 table1.ibd

mysql@sandbox:$ innodb_space -f table1.ibd space-page-type-summary
type                count       percent     description
INDEX               49769       99.19       B+Tree index
ALLOCATED           398         0.79        Freshly allocated
IBUF_BITMAP         4           0.01        Insert buffer bitmap
XDES                3           0.01        Extent descriptor
INODE               1           0.00        File segment inode
FSP_HDR             1           0.00        File space header

Notes:
  • Tablespace size is 784MB,
  • MySQL is keeping the number of free pages in check.

Table metadata: 16M rows, MySQL 5.7.9 after OPTIMIZE

mysql@sandbox:$ ls -l
total 929812
-rw-r----- 1 mysql mysql        65 Nov  1 21:12 db.opt
-rw-r----- 1 mysql mysql      8608 Nov 11 22:06 table1.frm
-rw-r----- 1 mysql mysql 952107008 Nov 11 22:07 table1.ibd

mysql@sandbox:$ innodb_space -f table1.ibd space-page-type-summary
type                count       percent     description
INDEX               49771       85.65       B+Tree index
ALLOCATED           8332        14.34       Freshly allocated
IBUF_BITMAP         4           0.01        Insert buffer bitmap
XDES                3           0.01        Extent descriptor
INODE               1           0.00        File segment inode
FSP_HDR             1           0.00        File space header

Notes:
  • Tablespace size is 908MB (+124MB compared to 5.6),
  • Again the entire "overhead" comes from freshly allocated pages.

Discussion of results

As stated before, online DDL comes at a certain price. InnoDB allocates extra free pages, which are required if we want to handle concurrent DML operations against the table that is being altered.

While this is not a problem for active tables (pages will be used for new/modified rows), the negative effects of online DDL may be noticeable for tables that are not expected to receive any DML activity. If you're optimizing an older, unused table to reduce its storage footprint, you most probably don't want to waste space for fresh pages.

Workarounds

If your goal is to reduce table size to absolute minimum and you do not need the benefits of online DDL, you can ALTER the table manually using the old COPY algorithm. Let's see how this changes the size of our table (still with 16M rows):

alter table repro.table1 force, algorithm=copy;

This gives us very good results, consistent with what MySQL 5.6 does:
mysql@sandbox:$ ls -l
total 819220
-rw-r----- 1 mysql mysql        65 Nov  1 21:12 db.opt
-rw-r----- 1 mysql mysql      8608 Nov 11 22:09 table1.frm
-rw-r----- 1 mysql mysql 838860800 Nov 11 22:11 table1.ibd

mysql@sandbox:$ innodb_space -f table1.ibd space-page-type-summary
type                count       percent     description
INDEX               49777       97.22       B+Tree index
ALLOCATED           1414        2.76        Freshly allocated
IBUF_BITMAP         4           0.01        Insert buffer bitmap
XDES                3           0.01        Extent descriptor
INODE               1           0.00        File segment inode
FSP_HDR             1           0.00        File space header


Tablespace size is now 800MB, still 16MB more than in 5.6 but 108MB less than after OPTIMIZE we ran previously. The number of free pages in 5.7 is still slightly larger but again, it's nothing like what we saw before.

Closing notes

I'm a big fan of online DDL in InnoDB and there's no doubt that online OPTIMIZE is a cool feature to have. That being said, I believe MySQL documentation could have done a better job explaining the feature. Hopefully the MySQL team will share my view on this.

No comments:

Post a Comment