INNODB LRU Cache

 ·  ☕ 4 

The LRU (Least Recently Used) list is a common data structure.

In simple terms, it arranges each piece of data in a sequence, placing the most recently used data at the front and the least used data at the back.typically implemented using a linked list.

The LRU list has a wide range of applications, and its main purpose is to reduce the operational cost of data retrieval. The specific approach is to place data retrieved from the disk into a cache, where the cached data is ordered in an LRU manner, with frequently used data at the front for easy access.

example

Ranking list(in the game or music chart e.t.c) can apply the LRU concept. Ranked that were popular a few hours ago may still be popular in the next hour, meaning those are likely to be repeatedly accessed and displayed on the page.
By retaining these commonly used words in the cache, the program can read them multiple times easily, reducing the cost of data retrieval.

LRU in InnoDB

InnoDB is MySQL’s default storage engine, and it incorporates many software design concepts worth learning. These designs are detailed and can inspire us to think more deeply and create better software.

As following as Below, I will share how InnoDB how to use LRU list optimizes reduce the operational cost of data retrieval.

LRU in Classic

In InnoDB, the smallest unit of data access is called a page. Each node in the linked list stores a page. The cache memory space is called the Buffer Pool, as shown below.

Traditional LRU algorithms define a maximum number of nodes in the linked list. When a data page is read, it is considered most used and inserted at the front of the list.

If the node count exceeds the limit, the last node is evicted. If an existing data page in the list is read, it is removed from its original position and inserted at the front, keeping the front nodes as recently used.

Data pages are likely to be read only once and not accessed again when a full table scan or index scan is triggered.
these one-time read pages would flood the front part of the linked list, being mistakenly identified as most used.

cache space is occupied by these infrequently used pages, pushing out the genuinely frequently used pages, which then have to be read from the disk again, thereby defeating the purpose of caching.

LRU Optimized

To address this, InnoDB optimized the LRU algorithm, adjusting the definition of most used based on two key points:

  1. Frequency Recognition
    Recently used pages, when read from the disk, are not inserted directly at the front of the list but in the middle.
    Only when they are read again from the cache are they considered most used and moved to the front. This insertion method is called the Midpoint Insertion Strategy.

    A real-life example:
    something you want to use is not placed directly on your desk because you might just a whim through it out of interest, not intending to use it thoroughly.

    The stuff, after being used from the cabinet, is placed in the middle of the desk (middle of the list). Only when you pick it up again to use is it moved to the front of the desk (front of the list).

    as shown above, the New Sub List and the Old Sub List, with a boundary at approximately the last 3/8 of the list. Data read from the disk is first placed at the head of the Old Sub List (3/8 position). When read again in the Buffer Pool, it is inserted at the head of the New Sub List.
    When nodes exceed the limit, the last node in the Old Sub List is evicted. The parameter innodb_old_blocks_pct sets the proportion of the Old Sub List, with a default value of 37% in MySQL 8.0, roughly 3/8.

  2. Time Recognition
    InnoDB provides the innodb_old_blocks_time parameter, setting the minimum time a page must stay in the Old Sub List before moving to the New Sub List.

    This parameter is in milliseconds, with a default value of 1000 in MySQL 8.0 witch prevents frequent list refreshing.

Through these two point as mentioned above, InnoDB ensures that truly necessary cached content is placed at the front, effectively enhancing read performance.

Summary

The hit rate is a critical consideration when designing caches.

InnoDB’s design philosophy, we see that it improves the LRU algorithm by adjusting frequency and time, redefining what most used data means, thereby enhancing the cache hit rate.

ref: MySQL 8.0 Reference Manual


Meng Ze Li
Meng Ze Li
Kubernetes / DevOps / Backend