ARC: A Self-Tuning, Low Overhead Replacement Cache
notes date: 2018-10-28
source links:
source date: 2017-10-17
- “dynamically, adaptively, and continually balances between the recency and frequency components in an online and self-tuning fashion”
- paper starts by quoting from Burks, Goldstine, and von Neumann, Preliminary Discussion of the Logical Design of an Electronic Computing Instrument
- Trying to achieve a high hit ratio while staying space- and time-efficient. Given the sloppiness and non-uniformity of real-life workloads, no static, a priori fixed replacement policy will work well.
- Review of prior work
- MIN, the optimal policy: “suppose we could predict the future and choose to cache what will be needed”
- LRU: evict least recently used data
- degenerates in the case of big scans that evict hot data (sketch below)
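A minimal sketch (mine, not the paper's) of the scan problem: a single pass over cold data flushes the hot working set out of a pure LRU cache.

```python
from collections import OrderedDict

class LRUCache:
    """Tiny LRU: the most recently used key sits at the end of the OrderedDict."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def access(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)        # hit: mark as most recently used
            return True
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)     # miss: evict the least recently used key
        self.entries[key] = True
        return False

cache = LRUCache(capacity=4)
for key in ["hot1", "hot2", "hot1", "hot2"]:     # small, frequently reused working set
    cache.access(key)
for key in ["scan-%d" % i for i in range(100)]:  # one big sequential scan
    cache.access(key)
print(list(cache.entries))  # only scan pages remain; the hot pages were evicted
```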
- LFU: evict least frequently used data
- requires logarithmic implementation complexity in cache size
- opposite problem from LRU: it over-rewards frequency, ignoring changes in the usage pattern
- tangent: a database with its own caching running on ZFS
- ZFS’s ARC only sees the accesses that miss in the database’s own cache
- LRU-2
- tracks the two most recent references to each page; when eviction is required, evict the page with the oldest penultimate reference (sketch below)
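A rough sketch of the LRU-2 eviction rule as summarized above; the bookkeeping and names here are my own, not taken from the LRU-K paper.

```python
import itertools

clock = itertools.count()   # stand-in for a global reference timestamp
history = {}                # page -> (penultimate_ref_time, last_ref_time)

def reference(page):
    now = next(clock)
    _, last = history.get(page, (None, None))
    history[page] = (last, now)

def choose_victim(resident_pages):
    # Evict the page whose penultimate reference is oldest; a page referenced
    # only once has no penultimate reference and is treated as oldest of all.
    def penultimate(page):
        prev, _ = history[page]
        return -1 if prev is None else prev
    return min(resident_pages, key=penultimate)
```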
- LRFU: Least Recently/Frequently Used
- maintains a per-page value that decays exponentially over time and is bumped on each reference
- bad news, because the kernel is not the place for exponentiation (expensive) or floating-point math/values in general (sketch below)
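To make the objection concrete, here is a rough sketch of the kind of exponentially decayed score LRFU keeps per page (a weighting of the form (1/2)^(λ·age); simplified, and the parameter name is mine). Every reference and every eviction decision needs exponentiation on floats, which is exactly what you don't want in a kernel.

```python
LAMBDA = 0.1   # λ near 0 behaves like LFU; λ near 1 behaves like LRU
scores = {}    # page -> (decayed score, time of last reference)

def on_reference(page, now):
    score, last = scores.get(page, (0.0, now))
    decayed = score * 0.5 ** (LAMBDA * (now - last))  # decay the old score...
    scores[page] = (decayed + 1.0, now)               # ...and credit this reference

def choose_victim(resident_pages, now):
    # Evict the page with the smallest decayed score as of "now".
    def current_score(page):
        score, last = scores[page]
        return score * 0.5 ** (LAMBDA * (now - last))
    return min(resident_pages, key=current_score)
```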
- Ghost caches
- in addition to actually caching data, cache metadata about what’s been evicted. You can use that to improve your future performance.
- Fixed replacement cache (FRC)
- Tunable value `p`, 0 <= p <= c, with `c` = the storage you plan to use
- Keep `p` pages in list `T1` (pages that have been seen at least once recently), and `c - p` pages in `T2` (pages that have been seen at least twice recently). Pages are expired from `T1` and `T2` on an LRU basis (minimal sketch below).
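A minimal sketch of the fixed split, under my own simplifications (the paper defines FRC_p more carefully): the only policy decision is which list gives up its LRU page, and that decision is driven entirely by the fixed target `p`. ARC keeps this same skeleton but moves `p` on every ghost hit; see the sketch after the case list below.

```python
def frc_evict(T1, T2, p):
    """Evict one page. T1 and T2 are lists in LRU order (index 0 = least recent).

    T1 targets p resident pages (seen once recently); T2 targets c - p pages
    (seen at least twice recently).
    """
    if T1 and (len(T1) >= p or not T2):
        return T1.pop(0)   # T1 is at or over its target (or T2 is empty): shrink T1
    return T2.pop(0)       # otherwise shrink T2 and leave T1 at its target
```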
- ARC is like FRC but tunes `p` dynamically
- 4 LRU lists: `T1`, the MRU list; `T2`, the MFU list; `B1`, the MRU ghost list; `B2`, the MFU ghost list
- case 1: page is in `T1` or `T2` (cache hit!). Move it to the MRU position in `T2` (whether it was already in `T2` or `T1`)
- case 2: page is in `B1` (cache miss, but it's in the MRU ghost list). "If only you had more MRU", so increase `p`: by 1 if the MRU ghost list is at least as large as the MFU ghost list (`|B1| >= |B2|`), or by `|B2|/|B1|` if the MFU ghost list is larger (`|B2| > |B1|`)
- case 3: like case 2, but in reverse
- `replace(page, p)`: if the MRU list is over its target size, or the page is in the MFU ghost list and the MRU list is exactly at its target size, then delete the LRU page of the MRU list and move it to the MRU position of the MRU ghost list; otherwise, delete the LRU page of the MFU list and move it to the MFU ghost list
- case 4: the page is not in any of our lists. Do a bunch of bookkeeping to ensure ARC has space for this entry, then put it at the MRU position of `T1` (see the sketch after this list).
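A compact Python sketch of the four cases and `replace()` above, following the paper's pseudocode but keeping only key bookkeeping (no actual data; the class layout and names are mine):

```python
from collections import OrderedDict

class ARC:
    """Sketch of the ARC algorithm: tracks keys only, in insertion (LRU) order."""

    def __init__(self, c):
        self.c = c                 # cache size in pages
        self.p = 0                 # adaptive target size for T1 (the MRU side)
        self.T1 = OrderedDict()    # resident, seen once recently
        self.T2 = OrderedDict()    # resident, seen at least twice recently
        self.B1 = OrderedDict()    # ghost list for pages evicted from T1
        self.B2 = OrderedDict()    # ghost list for pages evicted from T2

    def _replace(self, x_in_b2):
        # Decide which resident list gives up its LRU page, and remember the
        # evicted page in the corresponding ghost list.
        if self.T1 and (len(self.T1) > self.p or (x_in_b2 and len(self.T1) == self.p)):
            old, _ = self.T1.popitem(last=False)
            self.B1[old] = True
        else:
            old, _ = self.T2.popitem(last=False)
            self.B2[old] = True

    def request(self, x):
        # Case 1: hit in T1 or T2 -> move to MRU of T2
        if x in self.T1 or x in self.T2:
            self.T1.pop(x, None)
            self.T2.pop(x, None)
            self.T2[x] = True
            return "hit"

        # Case 2: miss, but in the MRU ghost list B1 -> grow the MRU target p
        if x in self.B1:
            delta = 1 if len(self.B1) >= len(self.B2) else len(self.B2) / len(self.B1)
            self.p = min(self.p + delta, self.c)
            self._replace(x_in_b2=False)
            del self.B1[x]
            self.T2[x] = True
            return "miss (ghost B1)"

        # Case 3: miss, but in the MFU ghost list B2 -> shrink the MRU target p
        if x in self.B2:
            delta = 1 if len(self.B2) >= len(self.B1) else len(self.B1) / len(self.B2)
            self.p = max(self.p - delta, 0)
            self._replace(x_in_b2=True)
            del self.B2[x]
            self.T2[x] = True
            return "miss (ghost B2)"

        # Case 4: completely new page -> make room, then insert at MRU of T1
        if len(self.T1) + len(self.B1) == self.c:
            if len(self.T1) < self.c:
                self.B1.popitem(last=False)
                self._replace(x_in_b2=False)
            else:
                self.T1.popitem(last=False)
        elif len(self.T1) + len(self.B1) < self.c:
            total = len(self.T1) + len(self.T2) + len(self.B1) + len(self.B2)
            if total >= self.c:
                if total == 2 * self.c:
                    self.B2.popitem(last=False)
                self._replace(x_in_b2=False)
        self.T1[x] = True
        return "miss"

arc = ARC(c=4)
for page in ["a", "b", "a", "c", "d", "e", "a", "b"]:
    print(page, arc.request(page))
```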
- Compared ARC against FRC (where FRC had been run on the benchmarked workload offline in order to calculate the optimal `p`), and found that ARC performs as well as (and sometimes better than) FRC for any single parameter `p`. This implies that ARC's dynamic adjustment of `p` makes it better suited for workloads that change over time.
- ARC in ZFS
- significant work required to make it work for variable-sized buffers
- dynamically increasing the size `c` of the ARC