Monday, November 23, 2015

Memory consumption in heavily partitioned databases: MySQL 5.7 vs 5.6

MySQL 5.7 introduces several improvements related to partitioning. Most importantly, work has been done to move InnoDB towards native partitioning support, which not only "paves the way for better overall partitioning" (source) in the future, but already comes with measurable performance benefits.

This article focuses on memory usage in heavily partitioned InnoDB databases.


Over the last few weeks I've been (un)lucky enough to work with several MySQL databases suffering from stability issues due to extreme memory pressure. All of these databases belonged to the same DBA team, who tend to apply complex partitioning schemes even to very small tables. Whether or not this approach is correct is beyond the scope of this article, but the situation encouraged me to spend some quality time with MySQL 5.7 and see what level of improvement I could expect.

MySQL 5.7's "native partitioning" means, among other things, that InnoDB moves away from using a separate handler for every single partition it opens. Instead, there is a single handler entity (per table) that encapsulates all partitioning functionality.

The pre-5.7 partitioning model involves two types of handlers:

  • The "ha_partition" handler acting as an overlay/proxy structure for partitioned tables. 
  • "ha_innobase" handlers for each individual table partition. This is where the read/write requests are forwarded by the proxy handler.
It goes without saying that such approach results in memory usage skyrocketing in heavily partitioned databases. In other words, memory is wasted on handler structures instead of being used where it can pay for itself (buffers, caches).

MySQL 5.7 encapsulates and consolidates partition handling into a single entity: "ha_innopart". If this still sounds abstract, try the following analogy:
  • MySQL 5.7: a single book with a table of contents and 128 chapters.
  • MySQL 5.6: a table of contents plus 128 separate books (one per chapter).
Now let's assume that these are all hardcover books. You should already see what kind of overhead you're looking at.

The test

The data set consists of 500 tables with 128 partitions each (64,000 partitions total):
create table partitions.table_N (id int primary key auto_increment) partition by hash (id) partitions 128;
Each table was populated with 256 rows (random integers).
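For reproducibility, here is a minimal sketch of how the DDL and data for such a set could be generated. The `partitions` schema name is from the article; the helper function, its defaults, and the use of auto_increment ids (rather than the article's random integers) are my own simplifications:

```python
def build_test_schema(n_tables: int = 500, n_partitions: int = 128,
                      rows_per_table: int = 256) -> list[str]:
    """Generate the SQL statements for the test data set: n_tables
    hash-partitioned tables, each populated with rows_per_table rows."""
    stmts = []
    for i in range(1, n_tables + 1):
        stmts.append(
            f"create table partitions.table_{i} "
            f"(id int primary key auto_increment) "
            f"partition by hash (id) partitions {n_partitions};"
        )
        # one multi-row insert per table; null lets auto_increment fill the ids
        values = ", ".join(["(null)"] * rows_per_table)
        stmts.append(f"insert into partitions.table_{i} (id) values {values};")
    return stmts
```

The resulting statements can then be piped to the mysql client or executed through any connector.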

The test itself involves selecting data from all partitions in all tables:
select sum(id) from partitions.table_1;
select sum(id) from partitions.table_2;
...
select sum(id) from partitions.table_N;
Other details about the test:
  • The test was executed in multi-threaded fashion: one statement per thread.
  • Each multi-threaded test was executed two times in a row.
  • Memory consumption was measured as MySQL's resident set size (RSS) as reported in /proc/{pid}/stat.
  • Memory consumption was checked twice per second while the tests were running. The final result is the maximum memory consumption recorded during both test runs.
Selecting from all tables in the same order in all threads may sound silly, but according to my tests this is actually close to the worst-case scenario. I also tested a scenario in which tables were selected from in random order, which resulted in lower peak memory utilization.
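The measurement loop described above can be sketched as follows, assuming Linux's /proc layout. The function names, the 4 KiB page-size default, and the polling structure are my own; a real run would use mysqld's pid and query the page size via sysconf:

```python
import threading

def rss_bytes(stat_line: str, page_size: int = 4096) -> int:
    """Parse the resident set size, in bytes, out of a /proc/<pid>/stat
    line. Field 24 is rss in pages; we split after the closing ')'
    because the process name (field 2) may itself contain spaces."""
    fields = stat_line.rsplit(")", 1)[1].split()
    return int(fields[21]) * page_size  # fields[0] is field 3 ("state")

def sample_peak_rss(pid: int, stop: threading.Event,
                    interval: float = 0.5) -> int:
    """Poll /proc/<pid>/stat every `interval` seconds (twice per second
    by default, as in the article) until `stop` is set, and return the
    peak RSS observed across all samples."""
    peak = 0
    while not stop.is_set():
        with open(f"/proc/{pid}/stat") as f:
            peak = max(peak, rss_bytes(f.read()))
        stop.wait(interval)
    return peak
```

During a test, this sampler would run in its own thread alongside the worker threads issuing one select statement each.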


Test with default table_open_cache (2K)

Test with table_open_cache = 100K


The improvements are obvious. The new approach to partition handling in MySQL 5.7 results in much lower overall memory consumption in heavily partitioned databases.
