Cache Initialization
Initialization starts with an instance of Store
reading the storage
configuration file, by default storage.config
. For each valid element in
the file an instance of Span
is created. These are of basically
four types:
File
Directory
Disk
Raw device
After creating all the Span
instances, they are grouped by device
ID to internal linked lists attached to the Store::disk
array [1]. Spans that refer to the same directory, disk, or raw
device are coalesced in to a single span. Spans that refer to the same file
with overlapping offsets are also coalesced [2]. This is all done in
ink_cache_init()
called during startup.
Note
The span logic is also used by the HostDB and more than one otherwise inexplicable feature is provided by the span logic for that module.
After configuration initialization, the cache processor is started by calling
CacheProcessor::start()
. This does a number of things:
For each valid span, an instance of CacheDisk
is created. This
class is a continuation and so can be used to perform potentially
blocking operations on the span. The primary use of these is to be passed to
the AIO threads as the callback when an I/O operation completes. These are then
dispatched to AIO threads to perform storage unit initialization. After
all of those have completed, the resulting storage is distributed across the
volumes in cplist_reconfigure()
. The
CacheVol
instances are created at this time.
Stripe Assignment
Every object that is stored in stored in a single, specific stripe. Stripe assignment is determining for an object what stripe that is. The logic described here sets up the stripe assignment table maps from the cache key to a specific stripe.
Cache stripe assignment setup is done once all stripes have initialized (that
is, all stripe header read operations have completed). There is an instance of
CacheHostRecord
for each line in hosting.config
and one generic instance
(Cache::hosttable
) for all cache volumes that are not explicitly assigned. If
hosting.config
is empty then all cache volumes will be in the generic record.
build_vol_hash_table()
in iocore/cache/Cache.cc does the work of setting up the
stripe assignment and is called for each CacheHostRecord
and the generic host record. The
stripes to be assigned are in CacheHostRecord::vols
.
An indirect index mapping is created to account for stripes that are not available. The total size
of the stripes is computed at the same time. The forvol
and getvol
arrays are used
for debugging, they are not essential to the assignment setup. rtable_entries
is filled with
stripe size divided by VOL_HASH_ALLOC_SIZE
. These values are used to determine the number of
assignment slots given to each stripe. For each stripe a seed for a 32 bit pseudo random number
generator is created based on stripe properties. Another array of pairs of value and stripe index is
filled using these. For each VOL_HASH_ALLOC_SIZE
amount of space in a stripe, a pair is
generated containing the stripe index and the next random number from that stripe’s generator. This
array is then sorted in ascending order.
The result is sampled in sections, the size of the sections selected to yield
VOL_HASH_TABLE_SIZE
sections. For each section the sample value is the midpoint of the
section.For the example, the number of sections is set to 17 (because the number of sections should
be a prime number). This yields 17 sections each of width 15 with a sample value equal to 7 more
than the initial value. The results of applying this to the rtable
is
This process can be viewed as dividing a number line in to sampling sections, each section corresponding to a stripe assignment slot.
Each random value for a stripe in the rtable
array can be considered a node in this space.
For one stripe this might look like
The full array contains random value nodes for all the stripes.
The stripe for that section (assignment slot) is the first node at or past the sample value. This can be seen as the arrow color in the previous figure. This provides a reasonably proportioned to size assignment of slots to stripes. It is also a consistent hash in that if a stripe is removed, the recomputation will tend to distribute assignments for the missing stripe across the other stripes in proportion to their sizes while not changing the assignment of any slot not assigned to the missing stripe. In essence for each sample point (assignment slot) a permutation of the stripes is implied by the order of the random value nodes past that sample point. The randomization serves to spread re-assigned slots to various stripes instead of a single one.
If the stripe is restored the assignments will be the same as before the stripe was removed. The assignment is very sensitive to the properties of each stripe - changing a stripe size or location will effectively be as if it were a new stripe. In the figure the two blue stripe assignments are changed to purple and green respectively. If the blue stripe were added back those assignments and only those would revert to blue. This is because for each stripe the node sequence as generated by the pseudo random number generator depends only the properties of the stripes.
At runtime stripe selection is done by Cache::key_to_vol()
which selects the
CacheHostRecord
instance then picks the stripe assignment slot in the array which
determines the stripe for the object.
Footnotes