Header Heap
Memory for HTTP header data is kept in header heaps.
Classes
-
class HdrHeapObjImpl
This is the abstract base class for objects allocated in a
HdrHeap
. This allows updating objects in a heap in a generic way, without having to locate all of the pointers to the objects.The type of an instance stored in a heap must be one of the following values.
-
enumerator HDR_HEAP_OBJ_EMPTY = 0
Used to mark invalid objects, ones not yet constructed or ones that have been destroyed.
-
enumerator HDR_HEAP_OBJ_RAW = 1
Some sort of raw object, I have no idea.
-
enumerator HDR_HEAP_OBJ_URL = 2
A URL object.
-
enumerator HDR_HEAP_OBJ_HTTP_HEADER = 3
The header for an HTTP request or response.
-
enumerator HDR_HEAP_OBJ_MIME_HEADER = 4
A MIME header, containing MIME style fields with names and values.
-
enumerator HDR_HEAP_OBJ_FIELD_BLOCK = 5
Who the heck knows?
-
enumerator HDR_HEAP_OBJ_EMPTY = 0
-
class HdrStrHeap
This is a variable sized class, therefore new instance must be created by
new_HdrStrHeap()
and deallocated by thedestroy
method.
-
HdrStrHeap *new_HdrStrHeap(int n)
Create and return a new instance of
HdrStrHeap
. If n is less thanHDR_STR_HEAP_DEFAULT_SIZE
it is increased to that value.If the allocated size is
HDR_STR_HEAP_DEFAULT_SIZE
(or smaller and upsized to that value) then the instance is allocated from a thread local pool viastrHeapAllocator
. If larger it is allocated from global memory viaats_malloc
.
-
class HdrHeap
This is a variable sized class and therefore new instances must be created by
new_HdrHeap()
and deallocated by thedestroy
method.HdrHeap
manages memory for heap objects directly and memory for strings via ancillary heaps (which are instances ofHdrStrHeap
). For the string heaps there is at most one writeable heap, and up toHDR_BUF_RONLY_HEAPS
read only heaps.All objects in the internal heap must be subclasses of
HdrHeapObjImpl
.-
void evacuate_from_str_heaps(HdrStrHeap *new_heap)
Copy all live strings from the heap objects in this to new_heap.
-
void coalesce_str_heaps(int incoming_size)
This garbage collects the string heaps in a half space style, by creating a new string space (string heap), copying all of the strings there, and then discarding the existing string heaps.
The total amount of live string space is calculated by
HdrHeap::required_space_for_evacuation()
and a new string heap is created of a size at least as large as the live string space plus incoming_size bytes.All of the live strings are moved to the new string heap by
HdrHeap::evacuate_from_str_heaps()
, the existing string heaps are deallocated, and the new string heap becomes the writeable string heap for the header heap. The end result is a single writeable string heap and no read only string heaps, with all live strings resident in that writeable string heap.
-
char *allocate_str(int bytes)
Allocate nbytes of space for a string in the writeable string heap. A pointer to the first byte is returned, or
nullptr
if the space could not be allocated.
-
HdrHeapObjImpl *allocate_obj(int nbytes, int type)
Allocate a type object that is nbytes in size in the heap and return a pointer to it, or
nullptr
if the object could not be allocated.nbytes must be at most
HDR_MAX_ALLOC_SIZE
.The members of
HdrHeapObjImpl
are initialized. Further initialization is the responsibility of the caller.type must be one of the values specified in
HdrHeapObjImpl
.
-
int marshal_length()
Compute and return the size of the buffer needed to serialize this.
-
int marshal(char *buffer, int length)
Serialize this to buffer of size length. It is required that length be at least the value returned by
HdrHeap::marshal_length()
.
-
void evacuate_from_str_heaps(HdrStrHeap *new_heap)
-
HdrHeap *new_HdrHeap(int n)
Create and return a new instance of
HdrHeap
. If n is less thanHdrHeap::DEFAULT_SIZE
it is increased to that value.If the allocated size is
HdrHeap::DEFAULT_SIZE
(or smaller and upsized to that value) then the instance is allocated from a thread local pool viahdrHeapAllocator
. If larger it is allocated from global memory viaats_malloc
.
Implementation
String Coalescence
String heaps do not maintain lists of internal free space. Strings that are released are left in
place, creating dead space in the heap. For this reason it can become necessary to do a garbage
collection operation on the writeable string heap in the header heap by calling
HdrHeap::coalesce_str_heaps()
. This is done when
The amount of dead space in the writable string heap exceeds
MAX_LOST_STR_SPACE
.An external string heap is being added and all current read only string heap slots are used.
The mechanism is simple in design - the size of the live string data in the current string heaps is
calculated and a new heap is allocated sufficient to contain all existing strings, with additional
space for new string data. Each heap object is required to provide a strings_length
method
which returns the size of the live string data for that object (recursively as needed). The strings
are copied to the new string heap, all of the previous string heaps are discarded, and the new heap
becomes the writable string heap for the header heap.
Each heap object is responsible for providing a move_strings
method which copies its strings
to a new string heap, passed as an argument. This is a source of pointer invalidation for other
parts of the core and the plugin API. For the latter, insulating from such string movement is the
point of the TSMLoc
type.
String Allocation
Storage for a string is allocated by HdrHeap::allocate_str()
. If the current amount of dead
space is too large, this is treated as an initial allocation failure. If there is no current
writeable string heap, one is created that is a least as large as the space requested and the size
of the previous writeable string heap. Space for the string is then allocated out of the writeable
string heap. If this fails due to lack of space the current writeable string heap is “demoted” to a
read only string heap and allocation retried (which will cause a new writeable string heap). If the
writeable string heap cannot be demoted due to lack of read only slots, the strings heaps are
coalesced with an additional size request of the requested string size. This will result in a single
writeable string heap and not read only heaps, the former containing all of the existing strings plus
sufficient space to allocate the new string.
Object Allocation
Objects are allocated on the header heap by HdrHeap::allocate_obj()
. Such objects must be one
of a compile time determined set of types [1]. This method first tries to allocate the object in
existing free space. If that doesn’t work then the allocator walks a list of HdrHeap
instances looking for space. If no space is found anywhere, a new HdrHeap
instance is
created with twice the space of the last HdrHeap
in the list and added to the list to
try.
Once space is found for the object, the base members of HdrHeapObjImpl
are initialized with
the object type and size, with the m_obj_flags set to 0.
Serialization
Because heaps store the HTTP request / response data, a header heap needs to be serialized to be put
in to the cache. For performance reasons, it is desirable to be able to unserialize the serialized
data in place, rather than copying it again. That is, the data is read from disk into a block of
memory and then that memory is converted to a live data structure. In this case the memory used by
the heap is owned by some other object and the header heap must not do any clean up. This is
signaled by the m_writeable flag. In an unserialized header heap this is set to false
and such
a header heap is not allowed to allocate any additional objects or strings - it is immutable.
The primary mechanism to do this is to use swizzling on the pointers in the structure. During
serialization pointers are converted to offsets and during unserialization these offsets are
converted back to pointers. To make this simpler, unserialized header heaps are marked read only so
that updating does not have to be supported. Additionally, HdrHeap
is a POD and therefore
has no virtual function table pointer to be stored or restored [2].
To serialize, first HdrHeap::marshal_length()
is called to get a buffer size. The
serialization buffer is created with sufficient space for the header heap and that space is passed
to HdrHeap::marshal()
to perform the actual serialization. The object heaps are serialized
followed by the string heaps. No coalescence is done, on the presumption that because the amount
of dead space is limited by coalescence (as needed) on every string creation.
When serializing strings, each object is responsible for swizzling its own pointers. Because the
object heaps have already been serialized and all of the header heap object types are also PODs,
these serialized objects can have the pointer swizzling method, marshal
, called directly
on them. This method is provided with a set of “translations” which indicate the base offset for
each range of object and string heap memory. The object marshalling can then compute the correct
offset to store for each live string pointer.
Inheriting Strings
The string heaps are designed to be reference counted so that they can be shared as read only objects between heaps. This enables copying heap objects between heaps less expensive as the strings pointers in them can be preserved in the new heap by sharing the string heaps in which those strings reside.
This can still be a bit complex as it is possible that the combined number of string heaps is more than the limit. In this case, the target header heap does string coalescence so that it is reduced to having a single writeable string heap with enough free space to hold all of the strings in the source header heap. As a result, it is required that all heap objects already be present in the target header heap before the strings are inherited. This means that the string coalescence will properly copy the strings of and update the strings pointers in the copied heap objects.
Footnotes.