Request Tracing

The request tracing framework adds the ability to monitor transactions as they move through FoundationDB. Tracing provides a detailed view into where transactions spend time, with data exported in near real time, enabling fast performance debugging. The FoundationDB tracing framework is based on the OpenTracing specification.

Disambiguation: Trace files are local log files containing debug and error output from a local fdbserver binary. Request tracing produces similarly named traces, which record the amount of time a transaction spent in a part of the system. This document uses the term tracing (or trace) to refer to these request traces, not local debug information, unless otherwise specified.

Note: Full request tracing capability requires at least TLogVersion::V6.

Recording data

The request tracing framework produces no data by default. To enable collection of traces, specify the collection type using the --tracer command line option for fdbserver and the DISTRIBUTED_CLIENT_TRACER network option for clients. The client and the server must be set to the same tracer value for tracing to work correctly.

Option                   Description
none                     No tracing data is collected.
file, logfile, log_file  Write tracing data to FDB trace files in the directory specified with --logdir.
network_lossy            Send tracing data as UDP packets. Data is sent to localhost:8889 by default; the port can be changed by setting the TRACING_UDP_LISTENER_PORT knob. This option is useful if you have a log aggregation program to collect trace data.
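
To make the network_lossy option concrete, the following is a minimal sketch of a collector that listens for the UDP packets. It assumes that localhost resolves to 127.0.0.1 and that each datagram carries serialized span data; the function names open_trace_listener and receive_datagrams are hypothetical and not part of FoundationDB. The sketch only receives raw bytes and does not decode them.

```python
import socket


def open_trace_listener(host="127.0.0.1", port=8889):
    """Bind a UDP socket on the address the tracer is assumed to send to."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_datagrams(sock, count, bufsize=65535, timeout=5.0):
    """Return up to `count` raw datagram payloads, stopping early on timeout."""
    sock.settimeout(timeout)
    payloads = []
    try:
        while len(payloads) < count:
            data, _addr = sock.recvfrom(bufsize)
            payloads.append(data)
    except socket.timeout:
        # UDP delivery is not guaranteed, so return whatever arrived.
        pass
    return payloads
```

A collector would typically call open_trace_listener() once and then call receive_datagrams() in a loop, handing each payload to a decoder. If the TRACING_UDP_LISTENER_PORT knob has been changed, the same port must be passed here.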

Data format

Spans are the building blocks of traces. A span represents a single operation in the life of a transaction and records the name of the operation along with its start timestamp and duration. A collection of spans makes up a trace, representing a single transaction. The tracing framework outputs individual spans, which can be reconstructed into traces through their parent relationships.

Trace data sent as UDP packets when using the network_lossy option is serialized using MessagePack. To reduce the amount of data sent, each span is serialized as an array of length 8 if the span has one or more parents, or of length 7 if it has none.

The fields of a span are listed below. For those using the UDP collection format, the index at which each field appears in the serialized MessagePack array is also given.

Field            Index  Type    Description
Source IP:port   0      string  The IP and port of the machine where the span originated.
Trace ID         1      uint64  The 64-bit identifier of the trace. All spans in a trace share the same trace ID.
Span ID          2      uint64  The 64-bit identifier of the span. All spans have a unique identifier.
Start timestamp  3      double  The timestamp when the operation represented by the span began.
Duration         4      double  The duration in seconds of the operation represented by the span.
Operation name   5      string  The name of the operation the span represents.
Tags             6      map     User-defined tags, added manually to specify additional information.
Parent span IDs  7      vector  (Optional) A list of span IDs representing parents of this span.
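
As an illustration of the field layout above, the sketch below maps an already-decoded array onto a named record. It assumes the MessagePack payload has been unpacked into a Python list by some other means. The Span class and span_from_array function are hypothetical names chosen for this example, not part of FoundationDB.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Sequence


@dataclass
class Span:
    source: str            # index 0: "IP:port" of the originating machine
    trace_id: int          # index 1
    span_id: int           # index 2
    start: float           # index 3: start timestamp
    duration: float        # index 4: duration in seconds
    operation: str         # index 5
    tags: Dict[Any, Any] = field(default_factory=dict)   # index 6
    parent_ids: List[int] = field(default_factory=list)  # index 7, optional


def span_from_array(fields: Sequence[Any]) -> Span:
    """Build a Span from a decoded array of length 7 (no parents) or 8."""
    if len(fields) not in (7, 8):
        raise ValueError(f"expected 7 or 8 fields, got {len(fields)}")
    parents = list(fields[7]) if len(fields) == 8 else []
    return Span(
        source=fields[0],
        trace_id=fields[1],
        span_id=fields[2],
        start=fields[3],
        duration=fields[4],
        operation=fields[5],
        tags=dict(fields[6]),
        parent_ids=parents,
    )
```

For example, span_from_array(["10.0.0.1:4500", 7, 2, 100.1, 0.05, "exampleOp", {}, [1])) yields a span whose parent_ids is [1]. The address, IDs, and operation name in that call are made-up values.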

Multiple parent spans

Unlike traditional distributed tracing frameworks, FoundationDB spans can have multiple parents. FDB batches many transactions into a single transaction, so the batched transaction must treat all of its component transactions as parents in order to continue tracing each request.
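
Because a span may list several parents, the reconstructed structure is a directed graph rather than a tree. The sketch below shows one way a collector might index spans by parent and walk the graph; it is an illustration under assumptions, not the method FoundationDB or any particular aggregator uses. Spans are represented as hypothetical (span_id, parent_span_ids) pairs, and the helper names are invented for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical minimal record: (span_id, parent_span_ids)
SpanRecord = Tuple[int, List[int]]


def build_child_index(spans: Iterable[SpanRecord]) -> Dict[int, List[int]]:
    """Map each parent span ID to the spans that name it as a parent."""
    children = defaultdict(list)
    for span_id, parent_ids in spans:
        for parent_id in parent_ids:
            children[parent_id].append(span_id)
    return dict(children)


def descendants(children: Dict[int, List[int]], root_id: int) -> List[int]:
    """Return every span ID reachable from root_id, each listed once, sorted."""
    seen = set()
    stack = [root_id]
    while stack:
        current = stack.pop()
        for child in children.get(current, []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return sorted(seen)


# Spans 1 and 2 stand for two component transactions, span 3 for the
# batched transaction that names both as parents, and span 4 for later work.
example = [(1, []), (2, []), (3, [1, 2]), (4, [3])]
index = build_child_index(example)
```

With this index, following either component transaction reaches the batched span and everything after it: descendants(index, 1) and descendants(index, 2) both return [3, 4].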

Control options

In addition to the command line parameter described above, tracing can be controlled at the database and transaction levels.

Tracing can be controlled at a global level by setting the TRACING_SAMPLE_RATE knob. Set the knob to 0.0 to record no traces, to 1.0 to record all traces, or to a value in between to record a fraction of traces. Traces are sampled as a unit: when a trace is sampled, all of its individual spans are included.
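
The idea of sampling a trace as a unit can be sketched as follows. This is not FoundationDB's implementation; it is an illustrative model in which the keep-or-drop decision is made once per trace ID and every span of that trace reuses it. The TraceSampler class is a hypothetical name.

```python
import random


class TraceSampler:
    """Decide once per trace whether to record it; spans inherit the decision."""

    def __init__(self, sample_rate, rng=None):
        if not 0.0 <= sample_rate <= 1.0:
            raise ValueError("sample_rate must be between 0.0 and 1.0")
        self.sample_rate = sample_rate
        self._rng = rng if rng is not None else random.Random()
        self._decisions = {}

    def is_sampled(self, trace_id):
        # random() returns a value in [0.0, 1.0), so a rate of 0.0 never
        # samples and a rate of 1.0 always samples.
        if trace_id not in self._decisions:
            self._decisions[trace_id] = self._rng.random() < self.sample_rate
        return self._decisions[trace_id]
```

Every span carrying the same trace ID gets the same answer from is_sampled, so a recorded trace is never missing spans because of sampling.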

Tracing can be enabled or disabled for individual transactions. The special key space exposes an API to set a custom trace ID for a transaction, or to disable tracing for the transaction. See the special key space tracing module documentation to learn more.