BufferWriter

Synopsis

#include <ts/BufferWriterForward.h> // Custom formatter support only.
#include <ts/BufferWriter.h> // General usage.

Description

BufferWriter is intended to increase code reliability and reduce complexity in the common circumstance of generating formatted output strings in fixed buffers. Current usage is a mixture of snprintf and memcpy which provides a large scope for errors and verbose code to check for buffer overruns. The goal is to provide a wrapper over buffer size tracking to make such code simpler and less vulnerable to implementation error.

BufferWriter itself is an abstract class to describe the base interface to wrappers for various types of output buffers. As a common example, FixedBufferWriter is a subclass designed to wrap a fixed size buffer. FixedBufferWriter is constructed by passing it a buffer and a size, which it then tracks as data is written. Writing past the end of the buffer is clipped to prevent overruns.

Consider current code that looks like this.

char buff[1024];
char * ptr = buff;
size_t len = sizeof(buff);
//...
if (len > 0) {
  auto n = std::min(len, thing1_len);
  memcpy(ptr, thing1, n);
  ptr += n;
  len -= n;
}
if (len > 0) {
  auto n = std::min(len, thing2_len);
  memcpy(ptr, thing2, n);
  ptr += n;
  len -= n;
}
if (len > 0) {
  auto n = std::min(len, thing3_len);
  memcpy(ptr, thing3, n);
  ptr += n;
  len -= n;
}

This is changed to

char buff[1024];
ts::FixedBufferWriter bw(buff, sizeof(buff));
//...
bw.write(thing1, thing1_len);
bw.write(thing2, thing2_len);
bw.write(thing3, thing3_len);

The remaining length is updated every time and checked every time. A series of checks, calls to memcpy, and size updates become a simple series of calls to BufferWriter::write().
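To make the clipping behavior concrete, the following is a minimal sketch of the bookkeeping a fixed buffer writer has to do. SketchWriter is a hypothetical stand-in written for this illustration, not the actual ts::FixedBufferWriter; only the method names (write, size, remaining, error, view) are borrowed from this document.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical stand-in for illustration only; not the real ts::FixedBufferWriter.
class SketchWriter {
public:
  SketchWriter(char *buf, std::size_t cap) : _buf(buf), _cap(cap) {
    assert(buf != nullptr || cap == 0);
  }

  // Copy as much of [data, data + n) as fits and remember how much was attempted.
  SketchWriter &write(const void *data, std::size_t n) {
    std::size_t used = size();
    std::size_t k    = std::min(n, _cap - used);
    if (k > 0) std::memcpy(_buf + used, data, k);
    _attempted += n;
    return *this;
  }
  SketchWriter &write(std::string_view sv) { return write(sv.data(), sv.size()); }

  std::size_t size() const { return std::min(_attempted, _cap); } // bytes actually stored
  std::size_t remaining() const { return _cap - size(); }         // unused capacity
  bool error() const { return _attempted > _cap; }                // true if output was clipped
  std::string_view view() const { return {_buf, size()}; }

private:
  char *_buf;
  std::size_t _cap;
  std::size_t _attempted = 0; // total bytes callers tried to write
};
```

Because every write goes through the same size check, the caller never has to test the remaining length before a write.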

For other types of interaction, FixedBufferWriter provides access to the unused buffer via BufferWriter::auxBuffer() and BufferWriter::remaining(). This makes it possible to easily use snprintf, given that snprintf returns the number of bytes written (or that would have been written, had there been enough space). BufferWriter::fill() is used to indicate how much of the unused buffer was used. Therefore something like (riffing off the previous example):

if (len > 0) {
   len -= snprintf(ptr, len, "format string", args...);
}

becomes:

bw.fill(snprintf(bw.auxBuffer(), bw.remaining(),
        "format string", args...));

By hiding the length tracking and checking, the result is a simple linear sequence of output chunks, making the logic much easier to follow.
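The interplay of auxBuffer(), remaining(), and fill() can be sketched with a small stand-in. SketchWriter and print_count are hypothetical names for this illustration and the class is not the real FixedBufferWriter; it only mimics the three methods described above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string_view>

// Hypothetical stand-in; only the method names come from this document.
class SketchWriter {
public:
  SketchWriter(char *buf, std::size_t cap) : _buf(buf), _cap(cap) { assert(buf != nullptr); }

  char *auxBuffer() { return _buf + size(); }              // first unused byte
  std::size_t remaining() const { return _cap - size(); }  // unused capacity
  std::size_t size() const { return std::min(_attempted, _cap); }
  bool error() const { return _attempted > _cap; }
  // Record that n bytes were (or would have been) placed in the unused area.
  SketchWriter &fill(std::size_t n) {
    _attempted += n;
    return *this;
  }
  std::string_view view() const { return {_buf, size()}; }

private:
  char *_buf;
  std::size_t _cap;
  std::size_t _attempted = 0;
};

// Format a counter directly into the unused part of the buffer.
inline void print_count(SketchWriter &bw, int count) {
  int n = std::snprintf(bw.auxBuffer(), bw.remaining(), "count %d", count);
  if (n > 0) bw.fill(static_cast<std::size_t>(n)); // a negative result signals an encoding error
}
```

One caveat with this pattern: snprintf always reserves one byte of the space it is given for a terminating NUL, so output that exactly fills the remaining space loses its last character.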

Usage

The header files are divided into two variants. include/tscore/BufferWriter.h provides the basic capabilities of buffer output control. include/tscore/BufferWriterFormat.h provides the basic formatted output mechanisms, primarily the implementation and ancillary classes for BWFSpec which is used to build formatters.

BufferWriter is an abstract base class, in the style of std::ostream. There are several subclasses for various use cases. This is the common type to use when passing a writer between functions.

FixedBufferWriter writes to an externally provided buffer of a fixed length. The buffer must be provided to the constructor. This will generally be used in a function where the target buffer is external to the function or already exists.

LocalBufferWriter is a templated class whose template argument is the size of an internal buffer. This is useful when the buffer is local to a function and the results will be transferred from the buffer to other storage after the output is assembled. Rather than having code like:

char buff[1024];
ts::FixedBufferWriter bw(buff, sizeof(buff));

it can be written more compactly as:

ts::LocalBufferWriter<1024> bw;

In many cases, when using LocalBufferWriter this is the only place the size of the buffer needs to be specified and therefore can simply be a constant without the overhead of defining a size to maintain consistency. The choice between LocalBufferWriter and FixedBufferWriter comes down to the owner of the buffer - the former has its own buffer while the latter operates on a buffer owned by some other object. Therefore if the buffer is declared locally, use LocalBufferWriter and if the buffer is received from an external source (such as via a function parameter) use FixedBufferWriter.

Writing

The basic mechanism for writing to a BufferWriter is BufferWriter::write(). This is an overloaded method for a character (char), a buffer (void *, size_t) and a string view (std::string_view). Because there is a constructor for std::string_view that takes a const char* as a C string, passing a literal string works as expected.

There are also stream operators in the style of C++ stream I/O. The basic template is

template < typename T > ts::BufferWriter& operator << (ts::BufferWriter& w, T const& t);

The stream operators are provided as a convenience; the primary mechanism for formatted output is overloading the bwformat() function. Except for a limited set of cases the stream operators are implemented by calling bwformat() with the BufferWriter, the argument, and a default format specification.

Reading

Data in the buffer can be extracted using BufferWriter::data(). This and BufferWriter::size() return a pointer to the start of the buffer and the amount of data written to the buffer. This is effectively the same as BufferWriter::view(), which returns a std::string_view that covers the output data. Calling BufferWriter::error() will indicate if more data than space available was written (i.e. the buffer would have been overrun). BufferWriter::extent() returns the amount of data that was passed to the BufferWriter, including any that was dropped for lack of space. This can be used in a two pass style with a null / size 0 buffer to determine the buffer size required for the full output.
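The two pass style can be sketched as follows. SketchWriter is a hypothetical stand-in that only mimics write(), size(), and extent() as described above, and render and render_exact are names invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical stand-in; not the real BufferWriter hierarchy.
class SketchWriter {
public:
  SketchWriter(char *buf, std::size_t cap) : _buf(buf), _cap(cap) {
    assert(buf != nullptr || cap == 0);
  }
  SketchWriter &write(std::string_view sv) {
    std::size_t used = size();
    std::size_t k    = std::min(sv.size(), _cap - used);
    if (k > 0) std::memcpy(_buf + used, sv.data(), k);
    _attempted += sv.size();
    return *this;
  }
  std::size_t size() const { return std::min(_attempted, _cap); } // bytes stored
  std::size_t extent() const { return _attempted; }               // bytes attempted

private:
  char *_buf;
  std::size_t _cap;
  std::size_t _attempted = 0;
};

// Emits the same output regardless of how much room the writer has.
inline void render(SketchWriter &w, std::string_view host, std::string_view path) {
  w.write("http://").write(host).write(path);
}

inline std::string render_exact(std::string_view host, std::string_view path) {
  SketchWriter probe(nullptr, 0); // pass 1: nothing is stored, only counted
  render(probe, host, path);
  std::string out(probe.extent(), '\0');
  SketchWriter w(out.data(), out.size()); // pass 2: the buffer is exactly large enough
  render(w, host, path);
  return out;
}
```

The cost is running the output logic twice, which is usually acceptable when the alternative is guessing a buffer size.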

Advanced

The BufferWriter::clip() and BufferWriter::extend() methods can be used to reserve space in the buffer. A common use case for this is to guarantee matching delimiters in output if buffer space is exhausted. BufferWriter::clip() can be used to temporarily reduce the buffer size by an amount large enough to hold the terminal delimiter. After writing the contained output, BufferWriter::extend() can be used to restore the capacity and then output the terminal delimiter.

Warning

Never call BufferWriter::extend() without previously calling BufferWriter::clip() and always pass the same argument value.
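As a rough model of what clip() and extend() do to the capacity, consider the sketch below. SketchWriter and bracketed are hypothetical and written only for this illustration. In particular, discarding the overflow count inside extend() is a choice made for this sketch, not a statement about the real implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical stand-in for illustration; not the real ts::FixedBufferWriter.
class SketchWriter {
public:
  SketchWriter(char *buf, std::size_t cap) : _buf(buf), _cap(cap) { assert(buf != nullptr); }

  SketchWriter &write(std::string_view sv) {
    std::size_t used = size();
    std::size_t k    = std::min(sv.size(), _cap - used);
    if (k > 0) std::memcpy(_buf + used, sv.data(), k);
    _attempted += sv.size();
    return *this;
  }
  // Hide the last n bytes of capacity from subsequent writes.
  SketchWriter &clip(std::size_t n) {
    assert(n <= remaining());
    _cap -= n;
    return *this;
  }
  // Give back n bytes previously hidden by clip(). Output that overflowed while
  // clipped is forgotten so the restored bytes count as unused.
  SketchWriter &extend(std::size_t n) {
    _attempted = size();
    _cap += n;
    return *this;
  }
  std::size_t size() const { return std::min(_attempted, _cap); }
  std::size_t remaining() const { return _cap - size(); }
  std::string_view view() const { return {_buf, size()}; }

private:
  char *_buf;
  std::size_t _cap;
  std::size_t _attempted = 0;
};

// Writes " [text]" and guarantees the closing bracket even if text is truncated.
inline void bracketed(SketchWriter &w, std::string_view text) {
  if (w.remaining() >= 3) {
    w.clip(1).write(" [");
    w.write(text);
    w.extend(1).write("]");
  }
}
```

With an 8 byte buffer and a 10 character text, this yields " [abcde]": the text is truncated but the delimiters still match.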

BufferWriter::remaining() returns the amount of buffer space not yet consumed.

BufferWriter::auxBuffer() returns a pointer to the first byte of the buffer not yet used. This is useful to do speculative output, or do bounded output in a manner similar to using BufferWriter::clip() and BufferWriter::extend(). A new BufferWriter instance can be constructed with

ts::FixedBufferWriter subw(w.auxBuffer(), w.remaining());

or as a convenience

ts::FixedBufferWriter subw{w.auxWriter()};

Output can be written to subw. If successful, then w.fill(subw.size()) will add that output to the main buffer. Depending on the purpose, w.fill(subw.extent()) can be used - this will track the attempted output if sizing is important. Note that space for any terminal markers can be reserved by bumping down the size from BufferWriter::remaining(). Be careful of underrun as the argument is an unsigned type.

If there is an error then subw can be ignored and some suitable error output written to w instead. A common use case is to verify there is sufficient space in the buffer and create a “not enough space” message if not. E.g.

ts::FixedBufferWriter subw{w.auxWriter()};
this->write_some_output(subw);
if (!subw.error()) w.fill(subw.size());
else w.write("Insufficient space"sv);

Examples

For example, error prone code that looks like

char new_via_string[1024]; // 512-bytes for hostname+via string, 512-bytes for the debug info
char * via_string = new_via_string;
char * via_limit  = via_string + sizeof(new_via_string);

// ...

* via_string++ = ' ';
* via_string++ = '[';

// incoming_via can be max MAX_VIA_INDICES+1 long (i.e. around 25 or so)
if (s->txn_conf->insert_request_via_string > 2) { // Highest verbosity
   via_string += nstrcpy(via_string, incoming_via);
} else {
   memcpy(via_string, incoming_via + VIA_CLIENT, VIA_SERVER - VIA_CLIENT);
   via_string += VIA_SERVER - VIA_CLIENT;
}
*via_string++ = ']';

becomes

ts::LocalBufferWriter<1024> w; // 1K internal buffer.

// ...

w.write(" ["sv);
if (s->txn_conf->insert_request_via_string > 2) { // Highest verbosity
   w.write(incoming_via);
} else {
   w.write(std::string_view{incoming_via + VIA_CLIENT, VIA_SERVER - VIA_CLIENT});
}
w.write(']');

There will be no overrun on the memory buffer in w, in strong contrast to the original code. This can be done better, as

if (w.remaining() >= 3) {
   w.clip(1).write(" ["sv);
   if (s->txn_conf->insert_request_via_string > 2) { // Highest verbosity
      w.write(incoming_via);
   } else {
      w.write(std::string_view{incoming_via + VIA_CLIENT, VIA_SERVER - VIA_CLIENT});
   }
   w.extend(1).write(']');
}

This has the result that the terminal bracket will always be present which is very much appreciated by code that parses the resulting log file.

Formatted Output

The base BufferWriter was made to provide memory safety for formatted output. Support for formatted output was made to provide type safety. The implementation deduces the types of the arguments to be formatted and handles them in a type specific and safe way.

The formatting style is of the “prefix” or “printf” style - the format is specified first and then all the arguments. This contrasts to the “infix” or “streaming” style where formatting, literals, and arguments are intermixed in the order of output. There are various arguments for both styles but conversations within the Traffic Server community indicated a clear preference for the prefix style. Therefore formatted output consists of a format string, containing formats, which are replaced during output with the values of arguments to the print function.

The primary use case for formatting is formatted output to fixed buffers. This is by far the dominant style of output in Traffic Server and during the design phase I was told any performance loss must be minimal. While work has and will be done to extend BufferWriter to operate on non-fixed buffers, such use is secondary to operating directly on memory.

Important

The overriding design goal is to provide the type specific formatting and flexibility of C++ stream operators with the performance of snprintf and memcpy.

This will preserve the general style of output in Traffic Server while still reaping the benefits of type safe formatting with little to no performance cost.

Type safe formatting has two major benefits -

  • No mismatch between the format specifier and the argument. Although some modern compilers do better at catching this at compile time, there is still risk (especially with non-constant format strings) and divergence between operating systems such that there is no universally correct choice. In addition the number of arguments can be verified to be correct which is often useful.

  • Formatting can be customized per type or even per partial type (e.g. T* for generic T). This enables embedding common formatting work in the format system once, rather than duplicating it in many places (e.g. converting enum values to names). This makes it easier for developers to make useful error messages. See the Enum Example section for more detail.

As a result of these benefits there has been other work on similar projects, to replace printf with a better mechanism. Unfortunately most of these are rather project specific and don’t suit the use case in Traffic Server. The two best options, Boost.Format and fmt, while good, are also not quite close enough to outweigh the benefits of a version specifically tuned for Traffic Server. Boost.Format is not acceptable because of the Boost footprint. fmt has the problem of depending on C++ stream operators and therefore not having the required level of performance or memory characteristics. Its main benefit, of reusing stream operators, doesn’t apply to Traffic Server because of the nigh non-existence of such operators. The possibility of using C++ stream operators was investigated but changing those to use pre-existing buffers not allocated internally was very difficult, judged worse than building a relatively simple implementation from scratch. The actual core implementation of formatted output for BufferWriter is not very large - most of the overall work will be writing formatters, work which would need to be done in any case but in contrast to current practice, only done once.

BufferWriter supports formatting output in a style similar to Python formatting via BufferWriter::print(). Looking at the other versions of work in this area, almost all of them have gone with this style. Boost.Format also takes basically this same approach, just using different paired delimiters. Traffic Server contains increasing amounts of native Python code which means many Traffic Server developers will already be familiar (or should become familiar) with this style of formatting. While not exactly the same as the Python version, BWF (BufferWriter Formatting) tries to be as similar as language and internal needs allow.

As noted previously and in the Python and even printf way, a format string consists of literal text in which formats are embedded. Each format marks a place where formatted data of an argument will be placed, along with argument specific formatting. The format is divided into three parts, separated by colons.

While this seems a bit complex, all of it is optional. If default output is acceptable, then BWF will work with just the format {}. In a sense, {} serves the same function for output as auto does for programming - the compiler knows the type, it should be able to do something reasonable without the programmer needing to be explicit.

format    ::=  "{" [name] [":" [specifier] [":" extension]] "}"
name      ::=  index | ICHAR+
index     ::=  non-negative integer
extension ::=  ICHAR*
ICHAR     ::=  a printable ASCII character except for '{', '}', ':'

name

The name of the argument to use. This can be a non-negative integer in which case it is the zero based index of the argument to the method call. E.g. {0} means the first argument and {2} is the third argument after the format.

bw.print("{0} {1}", 'a', 'b') => a b

bw.print("{1} {0}", 'a', 'b') => b a

The name can be omitted in which case it is treated as an index in parallel to the position in the format string. Only the position in the format string matters, not what names other format elements may have used.

bw.print("{0} {2} {}", 'a', 'b', 'c') => a c c

bw.print("{0} {2} {2}", 'a', 'b', 'c') => a c c

Note that an argument can be printed more than once if the name is used more than once.

bw.print("{0} {} {0}", 'a', 'b') => a b a

bw.print("{0} {1} {0}", 'a', 'b') => a b a

Alphanumeric names refer to values in a global table, described in the Global Names section. Such names, however, do not count in terms of default argument indexing.
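The indexing rules can be summarized in a short sketch that maps each format to the argument index it selects. resolve_indices is a hypothetical helper written for this illustration and is not part of BufferWriter. It assumes well-formed input with no escaped braces, and it reads the rules above as: an empty name takes its position among the argument-consuming formats, while an alphanumeric name consumes neither an argument nor a position.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// For each format that refers to an argument, return the argument index it resolves to.
// Hypothetical helper; assumes every '{' has a matching '}'.
inline std::vector<std::size_t> resolve_indices(std::string_view fmt) {
  std::vector<std::size_t> out;
  std::size_t position = 0; // number of argument-consuming formats seen so far
  std::size_t i        = 0;
  while ((i = fmt.find('{', i)) != std::string_view::npos) {
    std::size_t close = fmt.find('}', i);
    assert(close != std::string_view::npos);
    std::string_view field = fmt.substr(i + 1, close - i - 1);
    std::string_view name  = field.substr(0, field.find(':'));
    if (name.empty()) {
      out.push_back(position++); // implicit index: position in the format string
    } else if (std::isdigit(static_cast<unsigned char>(name.front()))) {
      out.push_back(std::stoul(std::string(name))); // explicit index
      ++position;
    } // otherwise a global name: no argument, no position consumed
    i = close + 1;
  }
  return out;
}
```

For instance, "{0} {2} {}" resolves to the indices 0, 2, 2, matching the first example above.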

specifier

Basic formatting control.

specifier ::=  [[fill]align][sign]["#"]["0"][[min][.precision][,max][type]]
fill      ::=  fill-char | URI-char
URI-char  ::=  "%" hex-digit hex-digit
fill-char ::=  printable character except "{", "}", ":", "%"
align     ::=  "<" | ">" | "=" | "^"
sign      ::=  "+" | "-" | " "
min       ::=  non-negative integer
precision ::=  positive integer
max       ::=  non-negative integer
type      ::=  "g" | "s" | "S" | "x" | "X" | "d" | "o" | "b" | "B" | "p" | "P"
hex-digit ::=  "0" .. "9" | "a" .. "f" | "A" .. "F"

The output is placed in a field that is at least min wide and no more than max wide. If the output is less than min then

  • The fill character is used for the extra space required. This can be an explicit character or a URI encoded one (to allow otherwise reserved characters).

  • The output is shifted according to the align.

    <

    Align to the left, fill to the right.

    >

    Align to the right, fill to the left.

    ^

    Align in the middle, fill to left and right.

    =

    Numerically align, putting the fill between the sign character and the value.

The output is clipped by max width characters and by the end of the buffer. precision is used by floating point values to specify the number of places of precision.
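The width and alignment rules can be sketched on plain strings. align_field is a hypothetical helper, not the BWF implementation. Two details are assumptions of this sketch because the text does not specify them: the default fill is a space, and for centered output any odd leftover fill goes on the right.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Pad text to at least min_width using fill, then clip to max_width.
// Hypothetical helper for illustration only.
inline std::string align_field(std::string_view text, std::size_t min_width, char align,
                               char fill = ' ', std::size_t max_width = std::string::npos) {
  assert(align == '<' || align == '>' || align == '^' || align == '=');
  std::string out;
  std::size_t pad = text.size() < min_width ? min_width - text.size() : 0;
  switch (align) {
  case '<': // align left, fill to the right
    out.append(text).append(pad, fill);
    break;
  case '>': // align right, fill to the left
    out.append(pad, fill).append(text);
    break;
  case '^': // center; assumption: odd leftover fill goes on the right
    out.append(pad / 2, fill).append(text).append(pad - pad / 2, fill);
    break;
  case '=': { // numeric: fill goes between a leading sign and the digits
    std::size_t sign = (!text.empty() && (text[0] == '+' || text[0] == '-')) ? 1 : 0;
    out.append(text.substr(0, sign)).append(pad, fill).append(text.substr(sign));
    break;
  }
  }
  if (out.size() > max_width) out.resize(max_width); // clip to the maximum width
  return out;
}
```

Numeric alignment is the one case where the fill is inserted inside the rendered value rather than around it, which is why a formatter for integers has to handle it itself.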

type is used to indicate type specific formatting. For integers it indicates the output radix and, if # is present, the radix prefix is generated (one of 0b, 0, 0x). Format types of the same letter are equivalent, varying only in the character case used for output. Most commonly ‘x’ prints values in lower cased hexadecimal (0x1337beef) while ‘X’ prints in upper case hexadecimal (0X1337BEEF). Note there is no upper case decimal or octal type because case is irrelevant for those.

g

generic, default.

b

binary

B

Binary

d

decimal

o

octal

x

hexadecimal

X

Hexadecimal

p

pointer (hexadecimal address)

P

Pointer (Hexadecimal address)

s

string

S

String (upper case)

For several specializations the hexadecimal format is taken to indicate printing the raw bytes of the value in hexadecimal, in effect providing a hex dump of the value. This is the case for std::string_view and therefore a hex dump of an object can be done by creating a std::string_view covering the data and then printing it with {:x}.

The string type (‘s’ or ‘S’) is generally used to cause alphanumeric output for a value that would normally use numeric output. For instance, a bool is normally 0 or 1. Using the type ‘s’ yields true or false. The upper case form, ‘S’, applies only in these cases where the formatter generates the text, it does not apply to normally text based values unless specifically noted.

extension

Text (excluding braces) that is passed to the type specific formatter function. This can be used to provide extensions for specific argument types (e.g., IP addresses). The base logic ignores it but passes it on to the formatting function which can then behave differently based on the extension.
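The division of a format into its three colon separated parts can be sketched as below. FieldParts and split_field are hypothetical names for this illustration; the real parser also interprets the specifier, which this sketch leaves as raw text.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// The three parts of a format, as raw text. Hypothetical type for illustration.
struct FieldParts {
  std::string_view name;
  std::string_view specifier;
  std::string_view extension;
};

// field is the text between '{' and '}', e.g. "0:>20:=a".
inline FieldParts split_field(std::string_view field) {
  assert(field.find_first_of("{}") == std::string_view::npos); // braces are not part of the field
  FieldParts p;
  std::size_t c1 = field.find(':');
  p.name         = field.substr(0, c1);
  if (c1 == std::string_view::npos) return p; // name only, or empty
  std::string_view rest = field.substr(c1 + 1);
  std::size_t c2        = rest.find(':');
  p.specifier           = rest.substr(0, c2);
  if (c2 != std::string_view::npos) p.extension = rest.substr(c2 + 1);
  return p;
}
```

Splitting on the first two colons is safe because the grammar excludes a literal colon from names and fill characters; a colon used as fill has to be URI encoded.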

Usage Examples

Some examples, comparing snprintf and BufferWriter::print().

if (len > 0) {
   auto n = snprintf(buff, len, "count %d", count);
   len -= n;
   buff += n;
}

bw.print("count {}", count);

// --

if (len > 0) {
   auto n = snprintf(buff, len, "Size %" PRId64 " bytes", sizeof(thing));
   len -= n;
   buff += n;
}

bw.print("Size {} bytes", sizeof(thing));

// --

if (len > 0) {
   auto n = snprintf(buff, len, "Number of items %ld", thing->count());
   len -= n;
   buff += n;
}

bw.print("Number of items {}", thing->count());

Enumerations become easier. Note in this case argument indices are used in order to print both a name and a value for the enumeration. A key benefit here is the lack of need for a developer to know the specific free function or method needed to do the name lookup. In this case, HttpDebugNames::get_server_state_name. Rather than every developer having to memorize the association between the type and the name lookup function, or grub through the code hoping for an example, the compiler is told once and henceforth does the lookup. The implementation of this is shown in the Enum Example section.

if (len > 0) {
   auto n = snprintf(buff, len, "Unexpected event %d in state %s[%d] for %.*s",
      event,
      HttpDebugNames::get_server_state_name(t_state.current.state),
      t_state.current.state,
      static_cast<int>(host_len), host);
   buff += n;
   len -= n;
}

bw.print("Unexpected event {0} in state {1}[{1:d}] for {2}",
   event, t_state.current.state, std::string_view{host, host_len});

Using std::string illustrates the advantage of a formatter overload knowing how to get the size from the object and not having to deal with restrictions on the numeric type (e.g., that %.*s requires an int, not a size_t).

if (len > 0) {
   len -= snprintf(buff, len, "%.*s", static_cast<int>(s.size()), s.data());
}

bw.print("{}", s);

IP addresses are much easier. There are two big advantages here. One is not having to know the conversion function name. The other is the lack of having to declare local variables and having to remember what the appropriate size is. Beyond that, this code is more performant because the output is rendered directly in the output buffer, not rendered to a temporary and then copied over. This lack of local variables can be particularly nice in the context of a switch statement where local variables for a case mean having to add extra braces, or declare the temporaries at an outer scope.

char ip_buff1[INET6_ADDRPORTSTRLEN];
char ip_buff2[INET6_ADDRPORTSTRLEN];
ats_ip_nptop(ip_buff1, sizeof(ip_buff1), addr1);
ats_ip_nptop(ip_buff2, sizeof(ip_buff2), addr2);
if (len > 0) {
   snprintf(buff, len, "Connecting to %s from %s", ip_buff1, ip_buff2);
}

bw.print("Connecting to {} from {}", addr1, addr2);

User Defined Formatting

To get the full benefit of type safe formatting it is necessary to provide type specific formatting functions which are called when a value of that type is formatted. This is how type specific knowledge such as the names of enumeration values are encoded in a single location. Additional type specific formatting can be provided via the extension field. Without this, special formatting requires extra functions and additional work at the call site, rather than a single consolidated formatting function.

To provide a formatter for a type V the function bwformat is overloaded. The signature would look like this:

BufferWriter& ts::bwformat(BufferWriter& w, BWFSpec const& spec, V const& v)

w is the output and spec the parsed specifier, including the extension (if any). The calling framework will handle basic alignment as per spec therefore the overload does not need to unless the alignment requirements are more detailed (e.g. integer alignment operations) or performance is critical. In the latter case the formatter should make sure to use at least the minimum width in order to disable any additional alignment operation.

It is important to note that a formatter can call another formatter. For example, the formatter for pointers looks like:

// Pointers that are not specialized.
inline BufferWriter &
bwformat(BufferWriter &w, BWFSpec const &spec, const void * ptr)
{
   BWFSpec ptr_spec{spec};
   ptr_spec._radix_lead_p = true;
   if (ptr_spec._type == BWFSpec::DEFAULT_TYPE || ptr_spec._type == 'p') {
      // if default or specifically 'p', switch to lower case hex.
      ptr_spec._type = 'x';
   } else if (ptr_spec._type == 'P') {
      // Incoming 'P' means upper case hex.
      ptr_spec._type = 'X';
   }
   return bw_fmt::Format_Integer(w, ptr_spec,
      reinterpret_cast<intptr_t>(ptr), false);
}

The code checks if the type p or P was used in order to select the appropriate case, then delegates the actual rendering to the integer formatter with a type of x or X as appropriate. In turn other formatters, if given the type p or P can cast the value to const void* and call bwformat on that to output the value as a pointer.

To help reduce duplication, the output stream operator operator<< is defined to call this function with a default constructed BWFSpec instance so that absent a specific overload a BWF formatter will also provide a C++ stream output operator.

Enum Example

For a specific example of using BufferWriter formatting to make debug messages easier, consider the case of HttpDebugNames. This is a class that serves as a namespace to provide various methods that convert state machine related data into descriptive strings. Currently this is undocumented (and even uncommented) and is therefore used infrequently, as that requires either blind cut and paste, or tracing through header files to understand the code. This can be greatly simplified by adding formatters to proxy/http/HttpDebugNames.h

inline ts::BufferWriter &
bwformat(ts::BufferWriter &w, ts::BWFSpec const &spec, HttpTransact::ServerState_t state)
{
   if (spec.has_numeric_type()) {
      // allow the user to force numeric output with '{:d}' or other numeric type.
      return bwformat(w, spec, static_cast<uintmax_t>(state));
   } else {
      return bwformat(w, spec, HttpDebugNames::get_server_state_name(state));
   }
}

With this in place, any one wanting to print the name of the server state enumeration can do

bw.print("state {}", t_state.current.state);

There is no need to remember names like HttpDebugNames nor which method in it does the conversion. The developer making the HttpDebugNames class or equivalent can take care of that in the same header file that provides the type.
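The overload dispatch this relies on can be shown without any Traffic Server headers. Everything in this sketch is a hypothetical stand-in: SketchSpec plays the role of BWFSpec, a std::string plays the role of the BufferWriter, and ServerState with its values is invented for the example. The numeric branch always prints decimal, which is a simplification.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical stand-in for ts::BWFSpec.
struct SketchSpec {
  char type = 'g'; // 'g' is the generic default
  bool has_numeric_type() const {
    return type == 'd' || type == 'x' || type == 'X' || type == 'o' || type == 'b' || type == 'B';
  }
};

// Hypothetical enumeration; the names and values are made up for this example.
enum class ServerState { UNDEFINED = 0, CONNECTION_ALIVE = 2, CONNECTION_ERROR = 4 };

inline std::string_view state_name(ServerState s) {
  switch (s) {
  case ServerState::UNDEFINED: return "UNDEFINED";
  case ServerState::CONNECTION_ALIVE: return "CONNECTION_ALIVE";
  case ServerState::CONNECTION_ERROR: return "CONNECTION_ERROR";
  }
  assert(false && "unhandled ServerState");
  return "UNKNOWN";
}

// Base formatters: the compiler selects one from the argument type.
inline std::string &sketch_format(std::string &w, SketchSpec const &, std::string_view v) {
  return w.append(v);
}
inline std::string &sketch_format(std::string &w, SketchSpec const &, std::uintmax_t v) {
  return w.append(std::to_string(v)); // radix is ignored in this sketch
}

// Enum formatter: delegates to one of the base formatters.
inline std::string &sketch_format(std::string &w, SketchSpec const &spec, ServerState s) {
  if (spec.has_numeric_type()) {
    return sketch_format(w, spec, static_cast<std::uintmax_t>(s));
  }
  return sketch_format(w, spec, state_name(s));
}
```

The call site only names the value. Which lookup function turns it into text is decided once, next to the type.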

Note

In actual practice, due to this method being so obscure it’s not actually used as far as I can determine.

Argument Forwarding

It will frequently be useful for other libraries to allow local formatting (such as Errata). For such cases the class methods will need to take variable arguments and then forward them on to the formatter. BufferWriter provides BufferWriter::printv() for this purpose. Instead of taking variable arguments, it takes a std::tuple of arguments. Such a tuple is easily created with std::forward_as_tuple. A standard implementation that uses the std::string overload for bwprint() would look like

template < typename ... Args >
std::string message(string_view fmt, Args &&... args) {
   std::string zret;
   return ts::bwprint(zret, fmt, std::forward_as_tuple(args...));
}

This gathers the arguments (generally references to the arguments) into a single tuple which is then passed by reference, to avoid restacking the arguments for every nested function call. In essence the arguments are put on the stack (inside the tuple) once and a reference to that stack is passed to nested functions.
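The pack once, pass by reference pattern can be demonstrated with only the standard library. message, message_v, and render_v are hypothetical names for this sketch, and the innermost function streams the arguments instead of applying a format string.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <tuple>
#include <utility>

// Innermost layer: consumes the tuple. Here it streams each element, separated by spaces.
template <typename... Args>
std::string render_v(std::tuple<Args...> const &args) {
  std::ostringstream out;
  std::apply(
    [&out](auto const &... a) {
      std::size_t n = 0;
      ((out << (n++ ? " " : "") << a), ...);
    },
    args);
  return out.str();
}

// Intermediate layer: passes the same tuple along by reference; nothing is re-packed.
template <typename... Args>
std::string message_v(std::string const &prefix, std::tuple<Args...> const &args) {
  return prefix + render_v(args);
}

// Variadic entry point: packs references to the arguments exactly once.
template <typename... Args>
std::string message(std::string const &prefix, Args &&... args) {
  return message_v(prefix, std::forward_as_tuple(std::forward<Args>(args)...));
}
```

The tuple holds references, so it must not outlive the full expression in which it was created. That holds here because the nested calls all finish before message returns.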

Specialized Types

These are types for which there exists a type specific BWF formatter.

std::string_view

Generally the contents of the view.

‘x’ or ‘X’

A hexadecimal dump of the contents of the view in lower (‘x’) or upper (‘X’) case.

‘p’ or ‘P’

The pointer and length value of the view in lower (‘p’) or upper (‘P’) case.

The precision is interpreted specially for this type to mean “skip precision initial characters”. When combined with max this allows a mechanism for printing substrings of the std::string_view. For instance, to print the 10th through 20th characters the format {:.10,20} would suffice. Given the method substr for std::string_view is cheap, it’s unclear how useful this is.

sockaddr const*

The IP address is printed. Fill is used to fill in address segments if provided, not to the minimum width if specified. IpEndpoint and IpAddr are supported with the same formatting. The formatting support in this case is extensive because of the commonality and importance of IP address data.

Type overrides

‘p’ or ‘P’

The pointer address is printed as hexadecimal lower (‘p’) or upper (‘P’) case.

The extension can be used to control which parts of the address are printed. These can be in any order, the output is always address, port, family. The default is the equivalent of “ap”. In addition, the character ‘=’ (“numeric align”) can be used to internally right justify the elements.

‘a’

The address.

‘p’

The port (host order).

‘f’

The IP address family.

‘=’

Internally justify the numeric values. This must be the first or second character. If it is the second the first character is treated as the internal fill character. If omitted ‘0’ (zero) is used.

E.g.

void func(sockaddr const* addr) {
  bw.print("To {}", addr); // -> "To 172.19.3.105:4951"
  bw.print("To {0::a} on port {0::p}", addr); // -> "To 172.19.3.105 on port 4951"
  bw.print("To {::=}", addr); // -> "To 172.019.003.105:04951"
  bw.print("Using address family {::f}", addr);
  bw.print("{::a}",addr);      // -> "172.19.3.105"
  bw.print("{::=a}",addr);     // -> "172.019.003.105"
  bw.print("{::0=a}",addr);    // -> "172.019.003.105"
  bw.print("{:: =a}",addr);    // -> "172. 19.  3.105"
  bw.print("{:>20:a}",addr);   // -> "        172.19.3.105"
  bw.print("{:>20:=a}",addr);  // -> "     172.019.003.105"
  bw.print("{:>20: =a}",addr); // -> "     172. 19.  3.105"
}

Format Classes

Although the extension for a format can be overloaded to provide additional features, this can become too confusing and complex to use if it is used for fundamentally different semantics on the same base type. In that case it is better to provide a format wrapper class that holds the base type but can be overloaded to produce different (wrapper class based) output. The classic example is errno which is an integral type but frequently should be formatted with additional information such as the descriptive string for the value. To do this the format wrapper class ts::bwf::Errno is provided. Using it is simple:

w.print("File not open - {}", ts::bwf::Errno(errno));

which will produce output that looks like

“File not open - EACCES: Permission denied [13]”

For errno this is handy in another way as ts::bwf::Errno will preserve the value of errno across other calls that might change it. E.g.:

ts::bwf::Errno last_err(errno);
// some other code generating diagnostics that might tweak errno.
w.print("File not open - {}", last_err);
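The capture behavior can be sketched with a small wrapper. SavedErrno and describe are hypothetical names and this is not the ts::bwf::Errno implementation. The short name (such as EACCES) is omitted because standard C++ has no portable lookup for it, and the description text comes from strerror, so it varies by platform.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical stand-in for a format wrapper: captures the value at construction.
struct SavedErrno {
  int value;
  explicit SavedErrno(int e) : value(e) { assert(e >= 0); }
};

// Renders "description [number]" for the captured value.
inline std::string describe(SavedErrno const &e) {
  return std::string(std::strerror(e.value)) + " [" + std::to_string(e.value) + "]";
}
```

Because the wrapper stores a copy, later library calls that overwrite errno do not affect what is eventually printed.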

This can also be useful for user defined data types. For instance, in the HostDB the type of the entry is printed in multiple places and each time this code is repeated

"%s%s %s", r->round_robin ? "Round-Robin" : "",
   r->reverse_dns ? "Reverse DNS" : "", r->is_srv ? "SRV" : "DNS"

This could be wrapped in a class, HostDBType such as

struct HostDBType {
   HostDBInfo* _r { nullptr };
   HostDBType(HostDBInfo* r) : _r(r) {}
};

Then define a formatter for the wrapper

BufferWriter& bwformat(BufferWriter& w, BWFSpec const& spec, HostDBType const& wrap) {
  return w.print("{}{} {}", wrap._r->round_robin ? "Round-Robin" : "",
     wrap._r->reverse_dns ? "Reverse DNS" : "",
     wrap._r->is_srv ? "SRV" : "DNS");
}

Now this can be output elsewhere with just

w.print("{}", HostDBType(r));

If this is used multiple places, this is cleaner and more robust as it can be updated everywhere with a change in a single code location.

These are the existing format classes in header file bwf_std_format.h. All are in the ts::bwf namespace.

class Errno

Formatting for errno. Generically the formatted output is the short name, the description, and the numeric value. A format type of d will generate just the numeric value, while a format type of s will generate just the short name and description.

Errno(int errno)

Initialize the instance with the error value errno.

template<typename ...Args>
FirstOf(Args&&... args)

Print the first non-empty string in an argument list. All arguments must be convertible to std::string_view.

By far the most common case is the two argument case used to print a special string if the base string is null or empty. For instance, something like this:

w.print("{}", name != nullptr ? name : "<void>");

This could also be done like:

w.print("{}", ts::bwf::FirstOf(name, "<void>"));

In addition, if the first argument is a local variable that exists only to do the empty check, that variable can be eliminated entirely. E.g.:

const char * name = thing.get_name();
w.print("{}", name != nullptr ? name : "<void>");

can be simplified to

w.print("{}", ts::bwf::FirstOf(thing.get_name(), "<void>"));

In general avoiding ternary operators in the print argument list makes the code cleaner and easier to understand.
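The selection logic behind a wrapper like this is small enough to sketch. first_of and to_view are hypothetical helpers, not the ts::bwf::FirstOf implementation. Treating a null const char* as empty is an assumption of this sketch, made so it can replace the nullptr checks shown above.

```cpp
#include <cassert>
#include <initializer_list>
#include <string_view>

// Convert a candidate to a view, treating a null C string as empty (assumption of this sketch).
inline std::string_view to_view(const char *s) {
  return s ? std::string_view(s) : std::string_view();
}
inline std::string_view to_view(std::string_view s) {
  return s;
}

// Return the first non-empty candidate, or an empty view if all are empty.
template <typename... Args>
std::string_view first_of(Args const &... args) {
  static_assert(sizeof...(Args) > 0, "at least one candidate is required");
  for (std::string_view sv : {to_view(args)...}) {
    if (!sv.empty()) return sv;
  }
  return {};
}
```

The returned view refers to the caller's data, so it should be consumed before any temporary passed as a candidate goes away.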

class Date

Date formatting in the strftime style.

Date(time_t epoch, std::string_view fmt = "%Y %b %d %H:%M:%S")

epoch is the time to print. fmt is the format for printing which is identical to that of strftime. The default format looks like “2018 Jun 08 13:55:37”.

Date(std::string_view fmt = "%Y %b %d %H:%M:%S")

As previous except the epoch is the current epoch at the time the constructor is invoked. Therefore if the current time is to be printed the default constructor can be used.

When used the format specification can take an extension of “local” which formats the time as local time. Otherwise it is GMT. w.print("{}", Date("%H:%M")); will print the hour and minute as GMT values. w.print("{::local}", Date("%H:%M")); will print the hour and minute in the local time zone. w.print("{::gmt}", ...); will output in GMT if additional explicitness is desired.
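The relationship between the epoch, the strftime format, and the GMT or local choice can be sketched with the standard library alone. format_date is a hypothetical helper, not the Date formatter itself. It uses std::gmtime and std::localtime, which share static storage and so are not thread safe.

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>
#include <string>

// Render epoch using a strftime style format, in GMT unless local is true.
// Hypothetical helper for illustration only.
inline std::string format_date(std::time_t epoch, const char *fmt = "%Y %b %d %H:%M:%S",
                               bool local = false) {
  std::tm const *tm = local ? std::localtime(&epoch) : std::gmtime(&epoch);
  assert(tm != nullptr);
  char buf[128];
  std::size_t n = std::strftime(buf, sizeof(buf), fmt, tm); // 0 if the result does not fit
  return std::string(buf, n);
}
```

With the default "C" locale, an epoch of 0 renders in GMT as "1970 Jan 01 00:00:00" under the default format.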

class OptionalAffix

Affix support for printing optional strings. This enables printing a string such that the affixes are printed only if the string is not empty. An empty string (or nullptr) yields no output. A common situation in which this is useful is code like

printf("%s%s", data ? data : "", data ? " " : "");

or something like

if (data) {
   printf("%s ", data);
}

Instead OptionalAffix can be used in line, which is easier if there are multiple items. E.g.

w.print("{}", ts::bwf::OptionalAffix(data)); // because default is single trailing space suffix.

OptionalAffix(const char *text, std::string_view suffix = " ", std::string_view prefix = "")

Create a format wrapper with suffix and prefix. If text is nullptr or is empty generate no output. Otherwise print the prefix, text, suffix.

OptionalAffix(std::string_view text, std::string_view suffix = " ", std::string_view prefix = "")

Create a format wrapper with suffix and prefix. If text is empty, generate no output. Otherwise print the prefix, text, suffix. Note that passing std::string as the first argument will work for this overload.
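
The rule "no text, no affixes" can be sketched outside of BufferWriter like this. It is an illustration under the assumption that the output is simply prefix, text, suffix in that order; with_affix is a hypothetical function, not part of ts::bwf, and it builds a std::string rather than writing to a buffer.

```cpp
#include <string>
#include <string_view>

// Hypothetical stand-in for the behavior described above: emit
// prefix + text + suffix, or nothing at all if text is empty.
std::string with_affix(std::string_view text, std::string_view suffix = " ", std::string_view prefix = "")
{
  std::string out;
  if (!text.empty()) {
    out.reserve(prefix.size() + text.size() + suffix.size());
    out.append(prefix).append(text).append(suffix);
  }
  return out;
}

// C string overload: a null pointer is treated the same as an empty string.
std::string with_affix(const char *text, std::string_view suffix = " ", std::string_view prefix = "")
{
  return text ? with_affix(std::string_view{text}, suffix, prefix) : std::string{};
}
```

Because an absent item contributes nothing, several optional items can be concatenated without leaving stray separators behind.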

Global Names

As a convenience, there are a few predefined global names that can be used to generate output. These do not take any arguments to BufferWriter::print(); the data needed for output is either process or thread global and is retrieved directly. They also are not counted for automatic indexing.

now

The epoch time in seconds.

tick

The high resolution clock tick.

timestamp

UTC time in the format “Year Month Date Hour:Minute:Second”, e.g. “2018 Apr 17 14:23:47”.

thread-id

The id of the current thread.

thread-name

The name of the current thread.

ts-thread

A pointer to the Traffic Server Thread object for the current thread. This is useful for comparisons.

ts-ethread

A pointer to the Traffic Server EThread object for the current thread. This is useful for comparisons or to indicate if the thread is an EThread (if not, the value will be nullptr).

For example, to have the same output as the normal diagnostic messages with a timestamp and the current thread:

bw.print("{timestamp} {ts-thread} Counter is {}", counter);

Note that because the global names take no arguments, they do not count as part of the argument indexing. Therefore the preceding example could be written as:

bw.print("{timestamp} {ts-thread} Counter is {0}", counter);

Working with standard I/O

BufferWriter can be used with some of the basic I/O functionality of a C++ environment. At the lowest level the output stream operator can be used with a file descriptor or a std::ostream. For these examples assume bw is an instance of BufferWriter with data in it.

int fd = open("some_file", O_RDWR);
bw >> fd; // Write to file.
bw >> std::cout; // write to standard out.

For convenience a stream operator for std::ostream is provided to make the use more natural.

std::cout << bw;
std::cout << bw.view(); // identical effect as the previous line.

Using a BufferWriter with printf is straightforward by use of the sized string format code.

ts::LocalBufferWriter<256> bw;
bw.print("Failed to connect to {}", addr1);
printf("%.*s\n", static_cast<int>(bw.size()), bw.data());
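
The sized string format code works for any character data that is not null terminated, not only for BufferWriter. The following sketch shows it on a plain std::string_view; bracketed is a hypothetical helper used only for illustration.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <string_view>

// Hypothetical helper: print a view that is not null terminated by passing
// its length as the precision of %s, which bounds how many bytes are read.
std::string bracketed(std::string_view text)
{
  char buff[64];
  int n = std::snprintf(buff, sizeof(buff), "[%.*s]", static_cast<int>(text.size()), text.data());
  if (n < 0) {
    return {}; // encoding error
  }
  // snprintf reports the length it wanted to write; clamp to what fit.
  std::size_t len = static_cast<std::size_t>(n);
  if (len >= sizeof(buff)) {
    len = sizeof(buff) - 1;
  }
  return std::string(buff, len);
}
```

The cast to int is needed because the precision argument of the printf family is an int, while sizes are size_t.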

Alternatively the output can be null terminated in the formatting to avoid having to pass the size.

ts::LocalBufferWriter<256> bw;
printf("%s\n", bw.print("Failed to connect to {}\0", addr1).data());

When using C++ stream I/O, writing to a stream can be done without any local variables at all.

std::cout << ts::LocalBufferWriter<256>().print("Failed to connect to {}\n", addr1);

This is handy for temporary debugging messages as it avoids having to clean up local variable declarations later, particularly when the types involved themselves require additional local declarations (such as in this example, an IP address which would normally require a local text buffer for conversion before printing). As noted previously this is particularly useful inside a case of a switch statement, where local variables are more annoying to set up.

Reference

class BufferWriter

BufferWriter is the abstract base class which defines the basic client interface. This is intended to be the reference type used when passing concrete instances rather than having to support the distinct types.

BufferWriter &write(void *data, size_t length)

Write to the buffer starting at data for at most length bytes. If there is not enough room to fit all the data, none is written.

BufferWriter &write(std::string_view str)

Write the string str to the buffer. If there is not enough room to write the string no data is written.

BufferWriter &write(char c)

Write the character c to the buffer. If there is no space in the buffer the character is not written.

BufferWriter &fill(size_t n)

Increase the output size by n without changing the buffer contents. This is used in conjunction with BufferWriter::auxBuffer() after writing output to the buffer returned by that method. If this method is not called then such output will not be counted by BufferWriter::size() and will be overwritten by subsequent output.

char *data() const

Return a pointer to start of the buffer.

size_t size() const

Return the number of valid (written) bytes in the buffer.

std::string_view view() const

Return a std::string_view that covers the valid data in the buffer.

size_t remaining() const

Return the number of available remaining bytes that could be written to the buffer.

size_t capacity() const

Return the number of bytes in the buffer.

char *auxBuffer() const

Return a pointer to the first byte in the buffer not yet consumed.

BufferWriter &clip(size_t n)

Reduce the available space by n bytes.

BufferWriter &extend(size_t n)

Increase the available space by n bytes. Extreme care must be used with this method as BufferWriter will trust the argument, having no way to verify it. In general this should only be used after calling BufferWriter::clip() and passing the same value. Together these allow the buffer to be temporarily reduced to reserve space for the trailing element of a required pair of output strings, e.g. making sure a closing quote can be written even if part of the string is not.
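
The clip and extend pairing can be illustrated with a small stand-in class. MiniWriter below is hypothetical and deliberately minimal; it is not ts::FixedBufferWriter and only mimics the behavior described above (writes are clipped to the available space, and extend trusts its argument).

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical stand-in for illustration only; not the real BufferWriter.
class MiniWriter
{
public:
  MiniWriter(char *buff, std::size_t n) : _buff(buff), _capacity(n) {}

  // Copy as much of `s` as fits; anything past the capacity is dropped.
  MiniWriter &write(std::string_view s)
  {
    std::size_t n = std::min(s.size(), remaining());
    std::memcpy(_buff + _size, s.data(), n);
    _size += n;
    return *this;
  }
  std::size_t remaining() const { return _capacity - _size; }
  // Caller must ensure n <= remaining(); the value is trusted.
  MiniWriter &clip(std::size_t n) { _capacity -= n; return *this; }
  // Caller must pass the value previously given to clip(); the value is trusted.
  MiniWriter &extend(std::size_t n) { _capacity += n; return *this; }
  std::string_view view() const { return {_buff, _size}; }

private:
  char *_buff;
  std::size_t _size = 0;
  std::size_t _capacity;
};

// Write `text` in double quotes, guaranteeing the closing quote even if
// `text` itself has to be truncated.
std::string_view write_quoted(MiniWriter &w, std::string_view text)
{
  if (w.remaining() >= 2) { // room for both quotes
    w.write("\"");
    w.clip(1);   // reserve one byte for the closing quote
    w.write(text);
    w.extend(1); // give the reserved byte back
    w.write("\"");
  }
  return w.view();
}
```

With an 8 byte buffer and a 10 character string, the sketch writes the opening quote, six characters of text, and the closing quote, so the output is truncated but still well formed.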

bool error() const

Return true if the buffer has overflowed from writing, false if not.

size_t extent() const

Return the total number of bytes in all attempted writes to this buffer. This value allows a successful retry in case of overflow, presuming the output data doesn’t change. This works well with the standard “try before you buy” approach of attempting to write output, counting the characters needed, then allocating a sufficiently sized buffer and actually writing.
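
The “try before you buy” approach is the same one used with snprintf, whose return value is the length the output needed rather than the length actually written. The sketch below uses only the standard library to show the pattern; describe is a hypothetical function, and the snprintf return value plays the role that extent() plays for a BufferWriter.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: measure the output first, then allocate exactly
// enough space and write it for real.
std::string describe(const char *host, int port)
{
  // Pass 1: a null buffer of size 0 writes nothing but reports the needed length.
  int needed = std::snprintf(nullptr, 0, "%s:%d", host, port);
  if (needed < 0) {
    return {}; // encoding error
  }
  // Pass 2: allocate room for the text plus the terminating null and write.
  std::string out(static_cast<std::size_t>(needed) + 1, '\0');
  std::snprintf(&out[0], out.size(), "%s:%d", host, port);
  out.resize(static_cast<std::size_t>(needed)); // drop the terminator
  return out;
}
```

As with extent(), the retry is only guaranteed to fit if the data being formatted does not change between the two passes.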

BufferWriter &print(TextView fmt, ...)

Print the arguments according to the format. See bw-formatting.

template<typename ...Args>
BufferWriter &printv(TextView fmt, std::tuple<Args...> &&args)

Print the arguments in the tuple args according to the format. See bw-formatting.

std::ostream &operator>>(std::ostream &stream) const

Write the contents of the buffer to stream and return stream.

ssize_t operator>>(int fd)

Write the contents of the buffer to file descriptor fd and return the number of bytes written (the result of the call to write()).

class FixedBufferWriter : public BufferWriter

This is a class that implements BufferWriter on a fixed buffer, passed in to the constructor.

FixedBufferWriter(void *buffer, size_t length)

Construct an instance that will write to buffer at most length bytes. If more data is written, all data past the maximum size is discarded.

FixedBufferWriter &reduce(size_t n)

Roll back the output to have n valid (used) bytes.

FixedBufferWriter &reset()

Equivalent to reduce(0), provided for convenience.

FixedBufferWriter auxWriter(size_t reserve = 0)

Create a new instance of FixedBufferWriter for the remaining output buffer. If reserve is non-zero then if possible the capacity of the returned instance is reduced by reserve bytes, in effect reserving that amount of space at the end. Note the space will not be reserved if reserve is larger than the remaining output space.

template<size_t N>
class LocalBufferWriter : public FixedBufferWriter

This is a convenience class which is a subclass of FixedBufferWriter. It creates a buffer as a member rather than having an external buffer that is passed to the instance. The buffer is N bytes long. This differs from its super class only in the constructor, which is only a default constructor.

LocalBufferWriter::LocalBufferWriter()

Construct an instance with a capacity of N.

class BWFSpec

This holds a format specifier. It has the parsing logic for a specifier, and if the constructor is passed a std::string_view of a specifier, the specifier is parsed and the results are loaded into the class members. This is useful for specialized implementations of bwformat().

template<typename V>
BufferWriter &bwformat(BufferWriter &w, BWFSpec const &spec, V const &v)

A family of overloads that perform formatted output on a BufferWriter. The set of types supported can be extended by defining an overload of this function for the types.

template<typename ...Args>
std::string &bwprint(std::string &s, std::string_view format, Args&&... args)

Generate formatted output in s based on the format and arguments args. The string s is adjusted in size to be the exact length as required by the output. If the string already had enough capacity it is not re-allocated, otherwise the resizing will cause a re-allocation.

template<typename ...Args>
std::string &bwprintv(std::string &s, std::string_view format, std::tuple<Args...> args)

Generate formatted output in s based on the format and args, which must be a tuple of the arguments to use for the format. The string s is adjusted in size to be the exact length as required by the output. If the string already had enough capacity it is not re-allocated, otherwise the resizing will cause a re-allocation.

This overload is used primarily as a back end to another function which takes the arguments for the formatting independently.
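
The split between a front end that takes arguments independently and a back end that takes them as a tuple can be sketched with the standard library. join_args and join_tuple are hypothetical names; they only join values with spaces and do not use BufferWriter formatting, so this shows the forwarding pattern rather than bwprintv itself.

```cpp
#include <sstream>
#include <string>
#include <tuple>
#include <utility>

// Back end: receives the arguments already packed in a tuple.
template <typename... Args>
std::string &join_tuple(std::string &s, const std::tuple<Args...> &args)
{
  std::ostringstream out;
  // std::apply unpacks the tuple into the parameter pack of the lambda.
  std::apply([&out](const auto &...item) { ((out << item << ' '), ...); }, args);
  s = out.str();
  if (!s.empty()) {
    s.pop_back(); // drop the trailing separator
  }
  return s;
}

// Front end: takes the arguments independently, packs references to them
// in a tuple, and hands that to the back end.
template <typename... Args>
std::string &join_args(std::string &s, Args &&...args)
{
  return join_tuple(s, std::forward_as_tuple(args...));
}
```

Keeping the real work in the tuple version means any number of front ends with different calling conventions can share one implementation.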