Friday, September 16, 2016

Investigating InnoDB FULLTEXT cache performance issues in MySQL <5.6.30

I'm not a huge fan of the InnoDB FULLTEXT feature and I admit I wasn't too keen to play with it in the past. Apparently, the feeling is mutual and so FULLTEXT issues haven't popped up in too many projects I worked on... until last week.

This post describes the troubleshooting process of a FULLTEXT cache performance issue. Quite inconspicuous at the beginning, it proved to be a lot of fun in the end.

Ready, Set, GDB!

Background

It all started with a generic performance issue report unrelated to FULLTEXT indexing. The issue, as described in the report, was:
  1. The server performs acceptably for a few days.
  2. After a certain amount of time, DML performance suddenly drops and remains low until the server is rebooted.
  3. Reboot restores original performance, which again only lasts for a few days.
The situation resulted in DML latency degradation during normal OLTP processing, but more importantly, it completely destroyed the performance of regular data import processes (schema provisioning from a SQL dump). An import that would ordinarily take 2 minutes slowed down to >30 minutes.

Troubleshooting 

The server didn't have any existing diagnostic information on it, so I set out to gather some basic data during the next performance degradation cycle; the queries involved are sketched after the list. What we got was:

  • Not much from GLOBAL STATUS. No obvious workload spikes, no significant traffic changes during the slow periods.
  • A bit more from PROCESSLIST. I observed DMLs waiting for dozens or hundreds of seconds against a single table. One table. That's a hint. 
  • Even more from transaction metadata (INFORMATION_SCHEMA.INNODB_TRX). I again observed DMLs waiting, but I also noticed internal MySQL threads in "doing SYNC commit" or "doing SYNC index" state showing up quite regularly. Another hint right there.
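
A rough sketch of those queries, for the record (nothing exotic, column lists trimmed):

-- Workload counters, sampled periodically and compared:
SHOW GLOBAL STATUS;

-- Long-running DML statements:
SHOW FULL PROCESSLIST;

-- Transaction metadata, including internal threads in
-- "doing SYNC commit" / "doing SYNC index" state:
SELECT trx_id, trx_state, trx_operation_state, trx_query
FROM INFORMATION_SCHEMA.INNODB_TRX
ORDER BY trx_started;
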
Following the first hint, I determined that the table in question had a FULLTEXT index on it. I also checked that the customer had over a hundred schemas with over a thousand FULLTEXT-enabled tables.

Following the second hint, I went through the code and found out that "SYNC commit" and "SYNC index" states are part of internal FULLTEXT processing:

(MySQL 5.6.27, at storage/innobase/fts/fts0fts.cc:4348)
/*********************************************************************//**
Commit the SYNC, change state of processed doc ids etc.
@return DB_SUCCESS if all OK */
static  __attribute__((nonnull, warn_unused_result))
dberr_t
fts_sync_commit(
/*============*/
 fts_sync_t* sync)   /*!< in: sync state */
{
 dberr_t  error;
 trx_t*  trx = sync->trx;
 fts_cache_t* cache = sync->table->fts->cache;
 doc_id_t last_doc_id;

 trx->op_info = "doing SYNC commit";

(MySQL 5.6.27, at storage/innobase/fts/fts0fts.cc:4309)
/*********************************************************************//**
Run SYNC on the table, i.e., write out data from the index specific
cache to the FTS aux INDEX table and FTS aux doc id stats table.
@return DB_SUCCESS if all OK */
static __attribute__((nonnull, warn_unused_result))
dberr_t
fts_sync_index(
/*===========*/
 fts_sync_t*  sync,  /*!< in: sync state */
 fts_index_cache_t* index_cache) /*!< in: index cache */
{
 trx_t*  trx = sync->trx;
 dberr_t  error = DB_SUCCESS;

 trx->op_info = "doing SYNC index";

So it's FULLTEXT alright... but what's wrong with it?

At that point I only had a very basic understanding of how FTS caching works:
  • The cache can grow up to "innodb_ft_cache_size" bytes per table before it has to sync/merge to disk. The default per-table cache limit is 8 million bytes.
  • There's also a global limit of "innodb_ft_total_cache_size" bytes. By default it's 640 million bytes.
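
Both limits are exposed as regular global variables, so they are easy to check on a running server:

SHOW GLOBAL VARIABLES LIKE 'innodb_ft%cache_size';

-- or, equivalently:
SELECT @@innodb_ft_cache_size, @@innodb_ft_total_cache_size;
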
There are existing MySQL bug reports (such as https://bugs.mysql.com/bug.php?id=73816) that describe situations where DML statements against FULLTEXT-enabled tables may stall due to locking. That said, the situation here was a little different: initial FULLTEXT performance was very good and it would only degrade after days of normal processing.

Further investigation into FTS cache behavior

After skimming through the FTS cache management code, I figured out that the FTS cache is synced in two situations: when a table's own limit is hit ("innodb_ft_cache_size") or when the global limit is hit ("innodb_ft_total_cache_size").

In either case, syncing basically merges cached data onto stable storage and then clears the cache. What's interesting is that InnoDB only synchronizes the cache for the particular table being modified when the limit is hit. In other words, if you already have tables "t1" through "t50" in the cache and you hit a limit while processing table "t51", only the cache for "t51" will be synced and cleared. Now that's a good lead!

Using all the hints gathered so far, I arrived at the following hypothesis:
  1. When the FTS cache is fresh and empty, flushing happens only due to per-table limits. In other words, each table's cache is synced every 8 million bytes, which is relatively infrequent and fast.
  2. Each FTS-enabled table present in the cache develops a tail of unsynchronized changes. As long as a table's tail is smaller than 8 million bytes, syncing does not occur and the unsynchronized changes remain in the cache.
  3. With a large enough number of FTS-enabled tables in the system, the unsynchronized tails eventually add up to a total size slightly less than the global cache limit. In other words, the cache is left with the last, small portion of free memory i.e. "last_portion_size = innodb_ft_total_cache_size - sum_of_unsynchronized_tail_sizes".
  4. Once that happens, syncing no longer occurs due to table limits but due to the global limit. The cache must be synced every "last_portion_size" bytes instead of every "innodb_ft_cache_size" bytes, as illustrated below.
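
To put hypothetical numbers on it: with 80 tables each holding a ~7.99-million-byte unsynced tail, the free portion shrinks to 800 thousand bytes, so the cache now has to sync ten times more often than in the healthy state:

-- Hypothetical tail sizes, default global limit:
SELECT 640000000 - 80 * 7990000 AS last_portion_size;  -- 800000 bytes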

Got repro?

With the default configuration, "innodb_ft_total_cache_size" divided by "innodb_ft_cache_size" gives you 80 cache "slots", so the theory is that you need more than 80 tables before you start seeing the negative effects of the tails adding up.
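
The same arithmetic can be pulled straight from the server; with the defaults (640000000 / 8000000), this returns 80:

SELECT @@innodb_ft_total_cache_size DIV @@innodb_ft_cache_size AS cache_slots;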

I was able to reproduce the issue by inserting around 100MB of data into N+M InnoDB tables with FULLTEXT indexes, where N > 80.
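
For reference, the repro boils down to something like the sketch below; the table layout and the generated text are made up for illustration, any reasonably sized documents will do:

-- Repeat for t_1 .. t_100 or so (N + M tables, N > 80):
CREATE TABLE t_1 (
  id   BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  body TEXT,
  FULLTEXT KEY ftx (body)
) ENGINE=InnoDB;

-- Load the data in batches, e.g.:
INSERT INTO t_1 (body)
VALUES (CONCAT('token', FLOOR(RAND() * 1000000), ' ', REPEAT('filler ', 100)));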

Anyhow, what I observed was:
  • Inserts worked consistently fast for the first N tables.
  • Inserts into the subsequent M tables were painfully slow.
I pulled call stacks showing that the slowness was due to FTS cache syncing, but obviously that wasn't a good enough explanation and I wanted to verify whether the original hypothesis was correct. To that end, I re-ran the test and used GDB to check a few things around FTS code execution. I was looking to answer the following questions:
  • How often does cache syncing occur?
  • Why does it occur? Is it because of the table cache limit or the global cache limit?
  • What's the size of the table's FTS cache when syncing occurs?
  • Is it true that the cache holds unsynced tails of FTS index changes? This would verify the theory about the tails adding up and leading to the "last_portion_size" problem.

FTS cache sizes and syncing frequency during fast period

The first stage of the test was executed during the "fast period", i.e. before the issue occurred. This was to analyze system behavior in a healthy state.

(gdb) b /home/mysql/source/mysql-5.6.27/storage/innobase/fts/fts0fts.cc:3563

(gdb) c
Continuing.

[Switching to Thread ...]
Breakpoint 1, fts_add_doc_by_id (doc_id=doc_id@entry=117456, fts_indexes=..., ftt=...)
    at /home/mysql/source/mysql-5.6.27/storage/innobase/fts/fts0fts.cc:3563
3563    if (cache->total_size > fts_max_cache_size
3564        || fts_need_sync) {
3565     fts_sync(cache->sync);
(gdb) p cache->total_size
$1 = 8000209
(gdb) p fts_max_cache_size
$2 = 8000000
(gdb) p fts_need_sync
$3 = false

Here's what we know:

  • In this case, the cache size for the particular table ("cache->total_size") was 209 bytes above the limit ("fts_max_cache_size"), which triggered a sync.
  • The flag "fts_need_sync" is "false", which means that the global limit was not exceeded. If you want to know more, search the code for "fts_is_sync_needed".
I therefore concluded that the sync occurred due to the table limit, not the global limit. That is good news: it means we're syncing every 8 million bytes, which (in this particular test) lets us sync "lazily" once every few thousand documents.

I could double-check that by looking at the "doc_id" argument of the "fts_add_doc_by_id" function referenced in the breakpoint above. Each time the breakpoint was hit, "doc_id" increased by around 2000:

(gdb) c
Continuing.

Breakpoint 1, fts_add_doc_by_id (doc_id=doc_id@entry=117456, fts_indexes=..., ftt=...)

(gdb) c
Continuing.

Breakpoint 1, fts_add_doc_by_id (doc_id=doc_id@entry=119412, fts_indexes=..., ftt=...)

(gdb) c
Continuing.

Breakpoint 1, fts_add_doc_by_id (doc_id=doc_id@entry=121370, fts_indexes=..., ftt=...)

FTS cache sizes and syncing frequency during slow period

The second stage of the test was executed during the "slow period", i.e. when the issue was present.

Breakpoint 1, fts_add_doc_by_id (doc_id=doc_id@entry=185668, fts_indexes=..., ftt=...)
    at /home/mysql/source/mysql-5.6.27/storage/innobase/fts/fts0fts.cc:3563
3563    if (cache->total_size > fts_max_cache_size
3564        || fts_need_sync) {
3565     fts_sync(cache->sync);
(gdb) p cache->total_size
$1 = 409
(gdb) p fts_max_cache_size
$2 = 8000000
(gdb) p fts_need_sync
$3 = true

Now that's completely different! The "fts_need_sync" flag is "true", which indicates we're running into the global cache limit. As a result, InnoDB decides to sync even though the table cache size is just 409 bytes, way below the 8-million-byte limit.

Due to the cache hitting the global limit, each table's cache started getting synced for every single document instead of every ~2000 documents as observed previously:

(gdb) c
Continuing.

Breakpoint 1, fts_add_doc_by_id (doc_id=doc_id@entry=185668, fts_indexes=..., ftt=...)

(gdb) c
Continuing.

Breakpoint 1, fts_add_doc_by_id (doc_id=doc_id@entry=185669, fts_indexes=..., ftt=...)

(gdb) c
Continuing.

Breakpoint 1, fts_add_doc_by_id (doc_id=doc_id@entry=185670, fts_indexes=..., ftt=...)

Not good at all!

At this point, I was curious to see whether the unsynchronized tails were indeed the reason why we were running into the global cache limit. Fortunately, this was easy to check: reading the code, I learned that the "fts_is_sync_needed" function is the one that decides the value of the "fts_need_sync" flag. The function works by iterating over all table caches and adding up their respective sizes. If the total size exceeds the global limit, syncing will be triggered.

I therefore set a breakpoint on the function and checked table cache sizes as the function was iterating over them:

(gdb) b fts_is_sync_needed(ib_vector_t const*)
Breakpoint 1 at 0xd8cd70: file /home/mysql/source/mysql-5.6.27/storage/innobase/fts/fts0opt.cc, line 2855.
(gdb) c
Continuing.
[Switching to Thread ...]

Breakpoint 1, fts_is_sync_needed (tables=0x205d86e8) at /home/mysql/source/mysql-5.6.27/storage/innobase/fts/fts0opt.cc:2855
2855  ulint  total_memory = 0;
(gdb) n
2856  double  time_diff = difftime(ut_time(), last_check_sync_time);
(gdb)
2858  if (fts_need_sync || time_diff < 5) {
(gdb)
2862  last_check_sync_time = ut_time();
(gdb)
2864  for (ulint i = 0; i < ib_vector_size(tables); ++i) {
(gdb)
2868    ib_vector_get_const(tables, i));
(gdb)
2870   if (slot->state != FTS_STATE_EMPTY && slot->table
(gdb)
2872    total_memory += slot->table->fts->cache->total_size;
(gdb) p slot->table->name
$11 = 0x212c6570 "test_1/t"
(gdb) p slot->table->fts->cache->total_size
$12 = 6333437
(gdb) n
2875   if (total_memory > fts_max_total_cache_size) {
(gdb)
2864  for (ulint i = 0; i < ib_vector_size(tables); ++i) {
(gdb)
2868    ib_vector_get_const(tables, i));
(gdb)
2870   if (slot->state != FTS_STATE_EMPTY && slot->table
(gdb)
2872    total_memory += slot->table->fts->cache->total_size;
(gdb) p slot->table->name
$13 = 0x210dbf40 "test_2/t"
(gdb) p slot->table->fts->cache->total_size
$14 = 2231864
(gdb) s
2875   if (total_memory > fts_max_total_cache_size) {
(gdb) p total_memory
$15 = 8565301

Comments:

  • Note that tables "test_1/t" and "test_2/t" are the first two tables I imported during my test. They were no longer under load when I pulled this output.
  • The unsynced "tail" for table "test_1/t" is 6333437 bytes.
  • The unsynced "tail" for table "test_2/t" is 2231864 bytes.
  • Both tails add up to 8565301 bytes.
Neat! Well, not really, but it does confirm the original suspicion. The unsynchronized tails do stay in the cache and they add up to eventually hit the global cache limit. Once that happens, InnoDB starts synchronizing furiously and the performance drops dramatically.
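
As a side note, GDB isn't strictly required to see a table's unsynced tail. Assuming MySQL 5.6+, the cache contents for a given table are also exposed through INFORMATION_SCHEMA:

SET GLOBAL innodb_ft_aux_table = 'test_1/t';

-- One row per word currently held in the table's FTS cache:
SELECT COUNT(*) FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE;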

This concluded the troubleshooting exercise.

Solutions?

Just so you know, I could not reproduce this issue in recent MySQL versions. The patch for Bug #73816 seems to have been effective, which is good news! If you're here because you ran into this issue yourself, make sure you're running MySQL 5.6.30+ or 5.7.12+.

Now, even though the issue no longer reproduces in MySQL 5.6.30+, I decided to share the troubleshooting experience as I found it to be quite interesting. Hope you liked it!

What if I can't upgrade?

You really should. If you can't, there are some tricks you can use to avoid the problem; a configuration sketch follows the list:

  • Reduce the per-table cache limit to make more "slots" in the cache. Decreasing "innodb_ft_cache_size" from 8 million bytes to 1.6 million bytes gives you 5x the number of slots with only minor performance degradation (6-8% in my particular synthetic test).
  • Increase the global cache limit, which again can help you create more cache "slots".
  • Set "innodb_optimize_fulltext_only" parameter to "1" and OPTIMIZE your FULLTEXT-enabled tables. This will optimize FULLTEXT indexes and will also synchronize and clear the cache. Now, I'm only mentioning this option because it does technically work, though it's probably too resource-intensive to be a viable option in production.
