MySQL introduces several improvements related to partitioning. Most importantly, work has been done to move InnoDB towards native partitioning support, which not only "paves the way for better overall partitioning" (source) in the future but already comes with measurable performance benefits.
This article focuses on memory usage in heavily partitioned InnoDB databases.
Background
Over the last few weeks I've been (un)lucky enough to work with several MySQL databases suffering from stability issues due to extreme memory pressure. All databases belonged to the same DBA team, who tend to apply complex partitioning schemes even to very small tables. Discussion as to whether or not this approach is correct is beyond the scope of this article but the situation encouraged me to spend more quality time with MySQL 5.7 and see what level of improvement I can expect.
MySQL 5.7's "native partitioning" means, among other things, that InnoDB moves away from using a separate handler for every single partition it opens. Instead, there is a single handler entity (per table) that encapsulates all partitioning functionality. This can be visualized as follows:
As shown above, the pre-5.7 partitioning model involves two types of handlers:
As shown above, the pre-5.7 partitioning model involves two types of handlers:
- The "ha_partition" handler acting as an overlay/proxy structure for partitioned tables.
- "ha_innobase" handlers for each individual table partition. This is where the read/write requests are forwarded by the proxy handler.
MySQL 5.7 encapsulates and consolidates partition handling into a single entity: the "ha_innopart". If all this makes no sense, let's try the following analogy:
- MySQL 5.7: a single book with a table of contents and 128 chapters.
- MySQL 5.6: a single book containing a table of contents plus 128 separate books (one per chapter).
The test
The data set consists of 500 tables with 128 partitions each (64K partitions total):
create table partitions.table_N (id int primary key auto_increment) partition by hash (id) partitions = 128;
Each table was populated with 256 rows (random integers).
The test itself involves selecting data from all partitions in all tables:
The test itself involves selecting data from all partitions in all tables:
select sum(id) from partitions.table_1; select sum(id) from partitions.table_2; ... select sum(id) from partitions.table_N;Other details about the test:
- The test was executed in multi-threaded fashion: one statement per thread,
- Each multi-threaded test was executed two times in a row,
- Memory consumption measured as MySQL's resident set size (RSS) reported /proc/{pid}/stat,
- Memory consumption checked twice per second while running tests. The final result is the maximum memory consumption recorded during both test runs.
Results
Test with default table_open_cache (2K)
Test with table_open_cache = 100K
Summary
The improvements are obvious. New approach to partition handling in MySQL 5.7 results in much lower overall memory consumption in heavily partitioned databases.
No comments:
Post a Comment