Python API

Installation

The FoundationDB Python API is compatible with Python 2.7 - 3.7. You will need to have a Python version within this range on your system before the FoundationDB Python API can be installed. Also please note that Python 3.7 no longer bundles a full copy of libffi, which is used for building the _ctypes module on non-macOS UNIX platforms. Hence, if you are using Python 3.7, you should make sure libffi is already installed on your system.

On macOS, the FoundationDB Python API is installed as part of the FoundationDB installation (see Installing FoundationDB client binaries). On Ubuntu or RHEL/CentOS, you will need to install the FoundationDB Python API manually via Python’s package manager pip:

user@host$ pip install foundationdb

You can also download the FoundationDB Python API source directly from Downloads.

Note

The Python language binding is compatible with FoundationDB client binaries of version 2.0 or higher. When used with version 2.0.x client binaries, the API version must be set to 200 or lower.

After installation, the module fdb should be usable from your Python installation or path. (The system default python is always used by the client installer on macOS.)

API versioning

When you import the fdb module, it exposes only one useful symbol:

fdb.api_version(version)

Specifies the version of the API that the application uses. This allows future versions of FoundationDB to make API changes without breaking existing programs. The current version of the API is 710.

Note

You must call fdb.api_version(...) before using any other part of the API. Once you have done so, the rest of the API will become available in the fdb module. This requirement includes use of the @fdb.transactional decorator, which is called when your module is imported.

Note

FoundationDB encapsulates multiple versions of its interface by requiring the client to explicitly specify the version of the API it uses. The purpose of this design is to allow you to upgrade the server, client libraries, or bindings without having to modify client code. The client libraries support all previous versions of the API. The API version specified by the client is used to control the behavior of the binding. You can therefore upgrade to more recent packages (and thus receive various improvements) without having to change your code.

Warning

When using the multi-version client API, setting an API version that is not supported by a particular client library will prevent that client from being used to connect to the cluster. In particular, you should not advance the API version of your application after upgrading your client until the cluster has also been upgraded.

For API changes between version 13 and 710 (for the purpose of porting older programs), see Release Notes and API Version Upgrade Guide.

Opening a database

After importing the fdb module and selecting an API version, you probably want to open a Database using open():

import fdb
fdb.api_version(710)
db = fdb.open()

fdb.open(cluster_file=None, event_model=None)

Connects to the cluster specified by the cluster file. This function is often called without any parameters, using only the defaults. If no cluster file is passed, FoundationDB automatically determines a cluster file with which to connect to a cluster.

A single client can use this function multiple times to connect to different clusters simultaneously, with each invocation requiring its own cluster file. To connect to multiple clusters running at different, incompatible versions, the multi-version client API must be used.

fdb.options

A singleton providing options which affect the entire FoundationDB client. Note that network options can also be set using environment variables.

Note

It is an error to set these options after the first call to fdb.open() anywhere in your application.

fdb.options.set_knob(knob)

Sets internal tuning or debugging knobs. The argument to this function should be a string representing the knob name and the value, e.g. “transaction_size_limit=1000”.

fdb.options.set_trace_enable(output_directory=None)

Enables trace file generation on this FoundationDB client. Trace files will be generated in the specified output directory. If the directory is specified as None, then the output directory will be the current working directory.

Warning

The specified output directory must be unique to this client. In the present release, trace logging does not allow two clients to share a directory.

fdb.options.set_trace_max_logs_size(bytes)

Sets the maximum size in bytes for the sum of this FoundationDB client’s trace output files in a single log directory.

fdb.options.set_trace_roll_size(bytes)

Sets the maximum size in bytes of a single trace output file for this FoundationDB client.

fdb.options.set_trace_format(format)

Selects the format of the trace files for this FoundationDB client. The supported formats are xml (the default) and json.

fdb.options.set_trace_clock_source(source)

Selects the clock source for trace files. The supported sources are now (the default) and realtime.

fdb.options.set_disable_multi_version_client_api()

Disables the multi-version client API and instead uses the local client directly. Must be set before setting up the network.

fdb.options.set_callbacks_on_external_threads()

If set, callbacks from external client libraries can be called from threads created by the FoundationDB client library. Otherwise, callbacks will be called from either the thread used to add the callback or the network thread. Setting this option can improve performance when connected using an external client, but may not be safe to use in all environments. Must be set before setting up the network. WARNING: This feature is considered experimental at this time.

fdb.options.set_external_client_library(path_to_lib)

Adds an external client library for use by the multi-version client API. Must be set before setting up the network.

fdb.options.set_external_client_directory(path_to_lib_directory)

Searches the specified path for dynamic libraries and adds them to the list of client libraries for use by the multi-version client API. Must be set before setting up the network.

Note

The following options are only used when connecting to a TLS-enabled cluster.

fdb.options.set_tls_plugin(plugin_path_or_name)

Sets the TLS plugin to load. This option, if used, must be set before any other TLS options.

fdb.options.set_tls_cert_path(path_to_file)

Sets the path for the file from which the certificate chain will be loaded.

fdb.options.set_tls_key_path(path_to_file)

Sets the path for the file from which to load the private key corresponding to your own certificate.

fdb.options.set_tls_verify_peers(criteria)

Sets the peer certificate field verification criteria.

fdb.options.set_tls_cert_bytes(bytes)

Sets the certificate chain.

fdb.options.set_tls_key_bytes(bytes)

Sets the private key corresponding to your own certificate.

fdb.options.set_tls_ca_bytes(ca_bundle)

Sets the certificate authority bundle.

fdb.options.set_tls_ca_path(path)

Sets the file from which to load the certificate authority bundle.

fdb.options.set_tls_password(password)

Sets the passphrase for the encrypted private key. The password must be set before the key is set; otherwise it will not be used.

fdb.options.set_disable_local_client()

Prevents connections through the local client, allowing only connections through externally loaded client libraries.

fdb.options.set_client_threads_per_version(number)

Spawns multiple worker threads for each version of the client that is loaded. Setting this to a number greater than one implies disable_local_client.

fdb.options.set_disable_client_statistics_logging()

Disables logging of client statistics, such as sampled transaction activity.

fdb.options.set_enable_run_loop_profiling()

Enables a debugging feature that performs run loop profiling. Requires trace logging to be enabled. WARNING: this feature is not recommended for use in production.

fdb.options.set_distributed_client_tracer(tracer_type)

Sets a tracer to run on the client. Should be set to the same value as the tracer set on the server.

Please refer to fdboptions.py (generated) for a comprehensive list of options.

Keys and values

Keys and values in FoundationDB are simple byte strings. In Python 2, a byte string is a string of type str. In Python 3, a byte string has type bytes.

To encode other data types, see Encoding data types and the tuple layer.
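Because keys are plain byte strings, the chosen encoding determines how they sort. The following sketch uses only the Python standard library (it makes no FoundationDB calls) to show text encoded as UTF-8 and non-negative integers packed at a fixed width. The helper names are hypothetical and exist only for illustration; the tuple layer mentioned above covers this need more generally.

```python
import struct


def encode_text(text):
    # Text must be converted to bytes before it can serve as a key or value.
    return text.encode('utf-8')


def encode_uint64(n, byte_order='>'):
    # 'Q' is an unsigned 64-bit integer; '>' is big-endian, '<' is little-endian.
    return struct.pack(byte_order + 'Q', n)


def order_after_byte_sort(numbers, byte_order):
    # Sort the encoded byte strings lexicographically, then decode them again
    # to see which numeric order the byte ordering produced.
    encoded = sorted(encode_uint64(n, byte_order) for n in numbers)
    return [struct.unpack(byte_order + 'Q', raw)[0] for raw in encoded]


numbers = [256, 1, 70000]
big_endian_order = order_after_byte_sort(numbers, '>')
little_endian_order = order_after_byte_sort(numbers, '<')
```

In this sketch the big-endian encoding sorts in numeric order, while the little-endian encoding does not, which is worth keeping in mind when designing keys that will be read as ranges.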

as_foundationdb_key and as_foundationdb_value

In some cases, you may have objects that are used to represent specific keys or values (for example, see Subspace). As a convenience, the language binding API can work seamlessly with such objects if they implement the as_foundationdb_key() or as_foundationdb_value() methods, respectively. API methods that accept a key will alternatively accept an object that implements the as_foundationdb_key() method. Likewise, API methods that accept a value will also accept an object that implements the as_foundationdb_value() method.

Warning

as_foundationdb_key() and as_foundationdb_value() are not intended to implement serialization protocols for object storage. Use these functions only when your object represents a specific key or value.
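As a minimal sketch of this protocol, the hypothetical UserKey class below represents one specific key and exposes it through as_foundationdb_key(). The class name and the b'user/' key layout are assumptions made for illustration; only the method name comes from the API.

```python
class UserKey(object):
    """Hypothetical wrapper for the key under which one user's record is stored."""

    def __init__(self, user_id):
        self.user_id = user_id

    def as_foundationdb_key(self):
        # Return the exact byte string that identifies this key in the database.
        return b'user/' + str(self.user_id).encode('ascii')


key_object = UserKey(42)
raw_key = key_object.as_foundationdb_key()
```

Given a transaction tr, such an object could then be passed wherever a key is expected, for example tr[UserKey(42)].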

KeyValue objects

class fdb.KeyValue

Represents a single key-value pair in the database. This is a simple value type; mutating it won’t affect your Transaction or Database.

KeyValue supports the Python iterator protocol so that you can unpack a key and value directly into two variables:

for key, value in tr[begin:end]:
    pass

Attributes

KeyValue.key
KeyValue.value

Key selectors

FoundationDB’s lexicographically ordered data model permits finding keys based on their order (for example, finding the first key in the database greater than a given key). Key selectors represent a description of a key in the database that could be resolved to an actual key by Transaction.get_key() or used directly as the beginning or end of a range in Transaction.get_range().

For more about how key selectors work, see Key selectors.

class fdb.KeySelector(key, or_equal, offset)

Creates a key selector with the given reference key, equality flag, and offset. It is usually more convenient to obtain a key selector with one of the following methods:

classmethod KeySelector.last_less_than(key)

Returns a key selector referencing the last (greatest) key in the database less than the specified key.

classmethod KeySelector.last_less_or_equal(key)

Returns a key selector referencing the last (greatest) key less than, or equal to, the specified key.

classmethod KeySelector.first_greater_than(key)

Returns a key selector referencing the first (least) key greater than the specified key.

classmethod KeySelector.first_greater_or_equal(key)

Returns a key selector referencing the first key greater than, or equal to, the specified key.

KeySelector + offset

Adding an integer offset to a KeySelector returns a key selector referencing a key offset keys after the original KeySelector. FoundationDB does not resolve key selectors with large offsets efficiently, so such selectors are slow.

KeySelector - offset

Subtracting an integer offset from a KeySelector returns a key selector referencing a key offset keys before the original KeySelector. FoundationDB does not resolve key selectors with large offsets efficiently, so such selectors are slow.
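To make the resolution rule concrete, the following sketch models key selectors over a sorted Python list rather than a real database. It assumes that the four convenience constructors correspond to the (or_equal, offset) pairs shown in the comments; the resolve function and the sample keys are hypothetical and are not part of the fdb module.

```python
import bisect


def resolve(sorted_keys, key, or_equal, offset):
    """Model of key selector resolution over a sorted list of byte strings."""
    if or_equal:
        # Index of the last key less than or equal to the reference key.
        idx = bisect.bisect_right(sorted_keys, key) - 1
    else:
        # Index of the last key strictly less than the reference key.
        idx = bisect.bisect_left(sorted_keys, key) - 1
    idx += offset
    if 0 <= idx < len(sorted_keys):
        return sorted_keys[idx]
    return None  # the selector points outside this toy key set


keys = [b'apple', b'banana', b'cherry']

# Assumed mapping of the convenience constructors:
last_less_than = resolve(keys, b'banana', False, 0)
last_less_or_equal = resolve(keys, b'banana', True, 0)
first_greater_than = resolve(keys, b'banana', True, 1)
first_greater_or_equal = resolve(keys, b'banana', False, 1)

# Adding 1 to first_greater_or_equal(b'apple') moves one key further on.
one_after_apple = resolve(keys, b'apple', False, 1 + 1)
```

The reference key does not have to exist in the database: in this model, first_greater_or_equal with a reference key of b'blueberry' resolves to b'cherry'.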

Database objects

class fdb.Database

A Database represents a FoundationDB database: a mutable, lexicographically ordered mapping from binary keys to binary values. Although Database provides convenience methods for reading and writing, modifications to a database are usually made via transactions, which are usually created and committed automatically by the @fdb.transactional decorator.

Note

The convenience methods provided by Database have the same signature as the corresponding methods of Transaction. However, most of the Database methods are fully synchronous. (The methods for watches are an exception.) As a result, the Database methods do not support the use of implicit parallelism with futures.

Database.create_transaction()

Returns a new Transaction object. Consider using the @fdb.transactional decorator to create transactions instead, since it will automatically provide you with appropriate retry behavior.

Database.open_tenant(tenant_name)

Opens an existing tenant to be used for running transactions and returns it as a Tenant object.

The tenant name can be either a byte string or a tuple. If a tuple is provided, the tuple will be packed using the tuple layer to generate the byte string tenant name.

Note

Opening a tenant does not check its existence in the cluster. If the tenant does not exist, attempts to read or write data with it will fail.

Database.get(key)

Returns the value associated with the specified key in the database (or None if the key does not exist). This read is fully synchronous.

X = db[key]

Shorthand for X = db.get(key).

Database.get_key(key_selector)

Returns the key referenced by the specified KeySelector. This read is fully synchronous.

The key is cached, providing a potential performance benefit. However, the value of the key is also retrieved, using network bandwidth.

Database.get_range(begin, end[, limit, reverse, streaming_mode])

Returns all keys k such that begin <= k < end and their associated values as a list of KeyValue objects. Note the exclusion of end from the range. This read is fully synchronous.

Each of begin and end may be a key or a KeySelector. Note that in the case of a KeySelector, the exclusion of end from the range still applies.

If limit is specified, then only the first limit keys (and their values) in the range will be returned.

If reverse is True, then the last limit keys in the range will be returned in reverse order. Reading ranges in reverse is supported natively by the database and should have minimal extra cost.

If streaming_mode is specified, it must be a value from the StreamingMode enumeration. It provides a hint to FoundationDB about how to retrieve the specified range. This option should generally not be specified, allowing FoundationDB to retrieve the full range very efficiently.

X = db[begin:end]

Shorthand for X = db.get_range(begin, end). The default slice begin is ''; the default slice end is '\xFF'.

X = db[begin:end:-1]

Shorthand for X = db.get_range(begin, end, reverse=True). The default slice begin is ''; the default slice end is '\xFF'.

Database.get_range_startswith(prefix[, limit, reverse, streaming_mode])

Returns all keys k such that k.startswith(prefix), and their associated values, as a list of KeyValue objects. The limit, reverse and streaming_mode parameters have the same meanings as in Database.get_range().
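The range semantics described above (half-open interval, limit, reverse, and prefix matching) can be modeled with an ordinary sorted list. The sketch below is a plain-Python stand-in, not a call into FoundationDB; model_get_range, model_get_range_startswith, and the sample data are hypothetical, and only byte-string bounds (not KeySelectors) are modeled.

```python
def model_get_range(pairs, begin, end, limit=None, reverse=False):
    """Return (key, value) pairs with begin <= key < end, in key order."""
    selected = [(k, v) for k, v in sorted(pairs) if begin <= k < end]
    if reverse:
        # Start from the last key in the range and walk backwards.
        selected.reverse()
    if limit is not None:
        # Keep only the first `limit` pairs in the chosen direction.
        selected = selected[:limit]
    return selected


def model_get_range_startswith(pairs, prefix):
    """Return (key, value) pairs whose key starts with prefix, in key order."""
    return [(k, v) for k, v in sorted(pairs) if k.startswith(prefix)]


data = [(b'a', b'1'), (b'b', b'2'), (b'c', b'3'), (b'ca', b'4'), (b'd', b'5')]

full = model_get_range(data, b'a', b'd')                   # b'd' is excluded
first_two = model_get_range(data, b'a', b'd', limit=2)
last_two = model_get_range(data, b'a', b'd', limit=2, reverse=True)
c_keys = model_get_range_startswith(data, b'c')
```

Note how reverse combined with limit yields the last keys of the range, highest key first.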

Database.set(key, value)

Associates the given key and value. Overwrites any prior value associated with key. This change will be committed immediately, and is fully synchronous.

db[key] = value

Shorthand for db.set(key, value).

Database.clear(key)

Removes the specified key (and any associated value), if it exists. This change will be committed immediately, and is fully synchronous.

del db[key]

Shorthand for db.clear(key).

Database.clear_range(begin, end)

Removes all keys k such that begin <= k < end, and their associated values. This change will be committed immediately, and is fully synchronous.

del db[begin:end]

Shorthand for db.clear_range(begin, end). The default slice begin is ''; the default slice end is '\xFF'.

Database.clear_range_startswith(prefix)

Removes all keys k such that k.startswith(prefix), and their associated values. This change will be committed immediately, and is fully synchronous.

Database.get_and_watch(key)

Returns a tuple value, watch, where value is the value associated with key or None if the key does not exist, and watch is a FutureVoid that will become ready after value changes.

See Transaction.watch() for a general description of watches and their limitations.

Database.set_and_watch(key, value)

Sets key to value and returns a FutureVoid that will become ready after a subsequent change to value.

See Transaction.watch() for a general description of watches and their limitations.

Database.clear_and_watch(key)

Removes key (and any associated value) if it exists and returns a FutureVoid that will become ready after the value is subsequently set.

See Transaction.watch() for a general description of watches and their limitations.

Database.add(key, param)
Database.bit_and(key, param)
Database.bit_or(key, param)
Database.bit_xor(key, param)

These atomic operations behave exactly like the associated operations on Transaction objects except that the change will immediately be committed, and is fully synchronous.

Note

Note that since some atomic operations are not idempotent, the implicit use of the @fdb.transactional decorator could interact with a commit_unknown_result exception in unpredictable ways. For more information, see Transactions with unknown results.

Database options

Database options alter the behavior of FoundationDB databases.

Database.options.set_location_cache_size(size)

Set the size of the client location cache. Raising this value can boost performance in very large databases where clients access data in a near-random pattern. This value must be an integer in the range [0, 2^31 - 1]. Defaults to 100000.

Database.options.set_max_watches(max_watches)

Set the maximum number of watches allowed to be outstanding on a database connection. Increasing this number could result in increased resource usage. Reducing this number will not cancel any outstanding watches. Defaults to 10000 and cannot be larger than 1000000.

Database.options.set_machine_id(id)

Specify the machine ID of a server to be preferentially used for database operations. ID must be a string of up to 16 hexadecimal digits that was used to configure fdbserver processes. Load balancing uses this option for location-awareness, attempting to send database operations first to servers on a specified machine, then a specified datacenter, then returning to its default algorithm.

Database.options.set_datacenter_id(id)

Specify the datacenter ID to be preferentially used for database operations. ID must be a string of up to 16 hexadecimal digits that was used to configure fdbserver processes. Load balancing uses this option for location-awareness, attempting to send database operations first to servers on a specified machine, then a specified datacenter, then returning to its default algorithm.

Database.options.set_transaction_timeout(timeout)

Set the default timeout duration in milliseconds after which all transactions created by this database will automatically be cancelled. This is equivalent to calling Transaction.options.set_timeout() on each transaction created by this database. This option can only be called if the API version is at least 610.

Database.options.set_transaction_retry_limit(retry_limit)

Set the default maximum number of retries for each transaction after which additional calls to Transaction.on_error() will throw the most recently seen error code. This is equivalent to calling Transaction.options.set_retry_limit() on each transaction created by this database.

Database.options.set_transaction_max_retry_delay(delay_limit)

Set the default maximum backoff delay incurred by each transaction in the call to Transaction.on_error() if the error is retryable. This is equivalent to calling Transaction.options.set_max_retry_delay() on each transaction created by this database.

Database.options.set_transaction_size_limit(size_limit)

Set the default maximum transaction size in bytes. This is equivalent to calling Transaction.options.set_size_limit() on each transaction created by this database.

Database.options.set_transaction_causal_read_risky()

Transactions do not require the strict causal consistency guarantee that FoundationDB provides by default. The read version will be committed, and usually will be the latest committed, but might not be the latest committed in the event of a simultaneous fault and misbehaving clock. Enabling this option is equivalent to calling Transaction.options.set_causal_read_risky() on each transaction created by this database.

Database.options.set_transaction_logging_max_field_length(size_limit)

Sets the maximum escaped length of key and value fields to be logged to the trace file via the LOG_TRANSACTION option. This is equivalent to calling Transaction.options.set_transaction_logging_max_field_length() on each transaction created by this database.

Database.options.set_snapshot_ryw_enable()

If this option has been set at least as many times with this database as the disable option, snapshot reads will see the effects of prior writes in the same transaction. Enabling this option is equivalent to calling Transaction.options.set_snapshot_ryw_enable() on each transaction created by this database.

Database.options.set_snapshot_ryw_disable()

If this option has been set more times with this database than the enable option, snapshot reads will not see the effects of prior writes in the same transaction. Setting this option is equivalent to calling Transaction.options.set_snapshot_ryw_disable() on each transaction created by this database.

Tenant objects

class fdb.Tenant

Tenant represents a FoundationDB tenant. Tenants are optional named transaction domains that can be used to provide multiple disjoint key-spaces to client applications. A transaction created in a tenant will be limited to the keys contained within that tenant, and transactions operating on different tenants can use the same key names without interfering with each other.

Tenant.create_transaction()

Returns a new Transaction object. Consider using the @fdb.transactional decorator to create transactions instead, since it will automatically provide you with appropriate retry behavior.

Transactional decoration

@fdb.transactional

The @fdb.transactional decorator is a convenience designed to concisely wrap a function with logic to automatically create a transaction and retry until success.

For example:

@fdb.transactional
def simple_function(tr, x, y):
    tr[b'foo'] = x
    tr[b'bar'] = y

The @fdb.transactional decorator makes simple_function a transactional function. All functions using this decorator must have an argument named tr. This specially named argument is passed a transaction that the function can use to do reads and writes.

A caller of a transactionally decorated function can pass a Database or Tenant instead of a transaction for the tr parameter. Then a transaction will be created automatically, and automatically committed before returning to the caller. The decorator will retry calling the decorated function until the transaction successfully commits.

If db is a Database or Tenant, a call like

simple_function(db, 'a', 'b')

is equivalent to something like

tr = db.create_transaction()
while True:
    try:
        simple_function(tr, 'a', 'b')
        tr.commit().wait()
        break
    except fdb.FDBError as e:
        tr.on_error(e).wait()

A caller may alternatively pass an actual transaction to the tr parameter. In this case, the transactional function will not attempt to commit the transaction or to retry errors, since that is the responsibility of the caller who owns the transaction. This design allows transactionally decorated functions to be composed freely into larger transactions.

Note

In some failure scenarios, it is possible that your transaction will be executed twice. See Transactions with unknown results for more information.

Transaction objects

class fdb.Transaction

A Transaction object represents a FoundationDB database transaction. All operations on FoundationDB take place, explicitly or implicitly, through a Transaction.

In FoundationDB, a transaction is a mutable snapshot of a database. All read and write operations on a transaction see and modify an otherwise-unchanging version of the database and only change the underlying database if and when the transaction is committed. Read operations do see the effects of previous write operations on the same transaction. Committing a transaction usually succeeds in the absence of conflicts.

Transactions group operations into a unit with the properties of atomicity, isolation, and durability. Transactions also provide the ability to maintain an application’s invariants or integrity constraints, supporting the property of consistency. Together these properties are known as ACID.

Transactions are also causally consistent: once a transaction has been successfully committed, all subsequently created transactions will see the modifications made by it.

The most convenient way to use Transactions is using the @fdb.transactional decorator.

Keys and values in FoundationDB are byte strings (str in Python 2.x, bytes in 3.x). To encode other data types, see the fdb.tuple module and Encoding data types.

Attributes

Transaction.db

The Database that this transaction is interacting with.

Reading data

Transaction.get(key)

Returns a (future) Value associated with the specified key in the database.

To check whether the specified key was present in the database, call Value.present() on the return value.

X = tr[key]

Shorthand for X = tr.get(key).

Transaction.get_key(key_selector)

Returns the (future) Key referenced by the specified KeySelector.

By default, the key is cached for the duration of the transaction, providing a potential performance benefit. However, the value of the key is also retrieved, using network bandwidth. Invoking Transaction.options.set_read_your_writes_disable() will avoid both the caching and the increased network bandwidth.

Transaction.get_range(begin, end[, limit, reverse, streaming_mode])

Returns all keys k such that begin <= k < end and their associated values as an iterator yielding KeyValue objects. Note the exclusion of end from the range.

Like a Future object, the returned iterator issues asynchronous read operations. It fetches the data in one or more efficient batches (depending on the value of the streaming_mode parameter). However, the iterator will block if iteration reaches a value whose read has not yet completed.

Each of begin and end may be a key or a KeySelector. Note that in the case of a KeySelector, the exclusion of end from the range still applies.

If limit is specified, then only the first limit keys (and their values) in the range will be returned.

If reverse is True, then the last limit keys in the range will be returned in reverse order. Reading ranges in reverse is supported natively by the database and should have minimal extra cost.

If streaming_mode is specified, it must be a value from the StreamingMode enumeration. It provides a hint to FoundationDB about how the returned container is likely to be used. The default is StreamingMode.iterator.

X = tr[begin:end]

Shorthand for X = tr.get_range(begin, end). The default slice begin is ''; the default slice end is '\xFF'.

X = tr[begin:end:-1]

Shorthand for X = tr.get_range(begin, end, reverse=True). The default slice begin is ''; the default slice end is '\xFF'.

Transaction.get_range_startswith(prefix[, limit, reverse, streaming_mode])

Returns all keys k such that k.startswith(prefix), and their associated values, as a container of KeyValue objects (see Transaction.get_range() for a description of the returned container).

The limit, reverse and streaming_mode parameters have the same meanings as in Transaction.get_range().

Snapshot reads

Transaction.snapshot

Snapshot reads selectively relax FoundationDB’s isolation property, reducing conflicts but making it harder to reason about concurrency.

By default, FoundationDB transactions guarantee strictly serializable isolation, resulting in a state that is as if transactions were executed one at a time, even if they were executed concurrently. Serializability has little performance cost when there are few conflicts but can be expensive when there are many. FoundationDB therefore also permits individual reads within a transaction to be done as snapshot reads.

Snapshot reads differ from ordinary (strictly serializable) reads by permitting the values they read to be modified by concurrent transactions, whereas strictly serializable reads cause conflicts in that case. Like strictly serializable reads, snapshot reads see the effects of prior writes in the same transaction. For more information on the use of snapshot reads, see Snapshot reads.

Snapshot reads also interact with transaction commit a little differently than normal reads. If a snapshot read is outstanding when transaction commit is called, that read will immediately return an error. (Normally, transaction commit will wait until outstanding reads return before committing.)

Transaction.snapshot.db

The Database that this transaction is interacting with.

Transaction.snapshot.get(key)

Like Transaction.get(), but as a snapshot read.

X = tr.snapshot[key]

Shorthand for X = tr.snapshot.get(key).

Transaction.snapshot.get_key(key_selector)

Like Transaction.get_key(), but as a snapshot read.

Transaction.snapshot.get_range(begin, end[, limit, reverse, streaming_mode])

Like Transaction.get_range(), but as a snapshot read.

X = tr.snapshot[begin:end]

Shorthand for X = tr.snapshot.get_range(begin, end). The default slice begin is ''; the default slice end is '\xFF'.

X = tr.snapshot[begin:end:-1]

Shorthand for X = tr.snapshot.get_range(begin, end, reverse=True). The default slice begin is ''; the default slice end is '\xFF'.

Transaction.snapshot.get_range_startswith(prefix[, limit, reverse, streaming_mode])

Like Transaction.get_range_startswith(), but as a snapshot read.

Transaction.snapshot.get_read_version()

Identical to Transaction.get_read_version() (since snapshot and strictly serializable reads use the same read version).

Writing data

Transaction.set(key, value)

Associates the given key and value. Overwrites any prior value associated with key. Returns immediately, having modified the snapshot represented by this Transaction.

tr[key] = value

Shorthand for tr.set(key, value).

Transaction.clear(key)

Removes the specified key (and any associated value), if it exists. Returns immediately, having modified the snapshot represented by this Transaction.

del tr[key]

Shorthand for tr.clear(key).

Transaction.clear_range(begin, end)

Removes all keys k such that begin <= k < end, and their associated values. Returns immediately, having modified the snapshot represented by this Transaction.

Range clears are efficient with FoundationDB – clearing large amounts of data will be fast. However, this will not immediately free up disk space: data for the deleted range is cleaned up in the background. For purposes of computing the transaction size, only the begin and end keys of a clear range are counted. The size of the data stored in the range does not count against the transaction size limit.

Note

Unlike in the case of get_range(), begin and end must be keys (byte strings), not KeySelectors. (Resolving arbitrary key selectors would prevent this method from returning immediately, introducing concurrency issues.)

del tr[begin:end]

Shorthand for tr.clear_range(begin, end). The default slice begin is ''; the default slice end is '\xFF'.

Transaction.clear_range_startswith(prefix)

Removes all the keys k such that k.startswith(prefix), and their associated values. Returns immediately, having modified the snapshot represented by this Transaction.

Range clears are efficient with FoundationDB – clearing large amounts of data will be fast. However, this will not immediately free up disk space: data for the deleted range is cleaned up in the background. For purposes of computing the transaction size, only the begin and end keys of a clear range are counted. The size of the data stored in the range does not count against the transaction size limit.

Atomic operations

An atomic operation is a single database command that carries out several logical steps: reading the value of a key, performing a transformation on that value, and writing the result. Different atomic operations perform different transformations. Like other database operations, an atomic operation is used within a transaction; however, its use within a transaction will not cause the transaction to conflict.

Atomic operations do not expose the current value of the key to the client but simply send the database the transformation to apply. In regard to conflict checking, an atomic operation is equivalent to a write without a read. It can only cause other transactions performing reads of the key to conflict.

By combining these logical steps into a single, read-free operation, FoundationDB can guarantee that the transaction will not conflict due to the operation. This makes atomic operations ideal for operating on keys that are frequently modified. A common example is the use of a key-value pair as a counter.

Warning

If a transaction uses both an atomic operation and a strictly serializable read on the same key, the benefits of using the atomic operation (for both conflict checking and performance) are lost.

In each of the methods below, param should be a byte string appropriately packed to represent the desired value. For example:

# wrong: param is an int, not a packed byte string
tr.add(b'key', 1)

# right: pack the operand as a 64-bit little-endian integer
import struct
tr.add(b'key', struct.pack('<q', 1))

Transaction.add(key, param)

Performs an addition of little-endian integers. If the existing value in the database is not present or shorter than param, it is first extended to the length of param with zero bytes. If param is shorter than the existing value in the database, the existing value is truncated to match the length of param. In case of overflow, the result is truncated to the width of param.

The integers to be added must be stored in a little-endian representation. They can be signed in two’s complement representation or unsigned. You can add to an integer at a known offset in the value by prepending the appropriate number of zero bytes to param and padding with zero bytes to match the length of the value. However, this offset technique requires that you know the addition will not cause the integer field within the value to overflow.
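The length and overflow rules described above can be modelled in plain Python. The sketch below is only an emulation for illustration: emulate_add is a hypothetical helper, not part of the fdb module, and it assumes Python 3. The database applies the real operation itself when the transaction commits.

```python
import struct


def emulate_add(existing, param):
    """Model the documented result of an atomic add on byte strings."""
    width = len(param)
    # A missing or shorter value is zero-extended; a longer one is truncated.
    current = (existing or b'')[:width].ljust(width, b'\x00')
    total = int.from_bytes(current, 'little') + int.from_bytes(param, 'little')
    # Overflow is truncated to the width of param.
    return (total % (1 << (8 * width))).to_bytes(width, 'little')


one = struct.pack('<q', 1)          # 64-bit little-endian 1
minus_two = struct.pack('<q', -2)   # two's complement needs no special handling

value = emulate_add(None, one)          # key absent: counter becomes 1
value = emulate_add(value, one)         # counter becomes 2
value = emulate_add(value, minus_two)   # counter becomes 0
counter = struct.unpack('<q', value)[0]
```

Because the operand is packed with a fixed width, the same packing format ('<q' here) should be used for every add on a given key and when decoding the stored value.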

Transaction.bit_and(key, param)

Performs a bitwise “and” operation. If the existing value in the database is not present, then param is stored in the database. If the existing value in the database is shorter than param, it is first extended to the length of param with zero bytes. If param is shorter than the existing value in the database, the existing value is truncated to match the length of param.

Transaction.bit_or(key, param)

Performs a bitwise “or” operation. If the existing value in the database is not present or shorter than param, it is first extended to the length of param with zero bytes. If param is shorter than the existing value in the database, the existing value is truncated to match the length of param.

Transaction.bit_xor(key, param)

Performs a bitwise “xor” operation. If the existing value in the database is not present or shorter than param, it is first extended to the length of param with zero bytes. If param is shorter than the existing value in the database, the existing value is truncated to match the length of param.
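As an illustration of the length rules shared by the three bitwise operations, the following sketch models them in plain Python. The helper emulate_bit_op and the flag constants are hypothetical names used only for this example, and the code assumes Python 3; it is not how the database implements the operations.

```python
def emulate_bit_op(existing, param, op):
    """Model bit_and, bit_or, or bit_xor following the documented length rules."""
    if existing is None:
        if op == 'and':
            # bit_and stores param unchanged when the key has no value.
            return param
        existing = b''
    width = len(param)
    # Zero-extend a shorter value; truncate a longer one.
    current = existing[:width].ljust(width, b'\x00')
    ops = {
        'and': lambda a, b: a & b,
        'or': lambda a, b: a | b,
        'xor': lambda a, b: a ^ b,
    }
    return bytes(ops[op](a, b) for a, b in zip(current, param))


FLAG_ACTIVE = b'\x01'
FLAG_ADMIN = b'\x02'

flags = emulate_bit_op(None, FLAG_ACTIVE, 'or')     # set the first flag
flags = emulate_bit_op(flags, FLAG_ADMIN, 'or')     # set the second flag
flags = emulate_bit_op(flags, FLAG_ACTIVE, 'xor')   # toggle the first flag off
```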

Transaction.compare_and_clear(key, param)

Performs an atomic compare-and-clear operation. If the existing value in the database is equal to param, the given key is cleared.

Transaction.max(key, param)

Sets the value in the database to the larger of the existing value and param. If the existing value in the database is not present or shorter than param, it is first extended to the length of param with zero bytes. If param is shorter than the existing value in the database, the existing value is truncated to match the length of param.

Both the existing value and param are treated as unsigned integers. (This differs from the behavior of atomic addition.)

Transaction.byte_max(key, param)

Performs lexicographic comparison of byte strings. If the existing value in the database is not present, then param is stored. Otherwise the larger of the two values is then stored in the database.

Transaction.min(key, param)

Sets the value in the database to the smaller of the existing value and param. If the existing value in the database is not present, then param is stored in the database. If the existing value in the database is shorter than param, it is first extended to the length of param with zero bytes. If param is shorter than the existing value in the database, the existing value is truncated to match the length of param.

Both the existing value and param are treated as unsigned integers. (This differs from the behavior of atomic addition.)

Transaction.byte_min(key, param)

Performs lexicographic comparison of byte strings. If the existing value in the database is not present, then param is stored. Otherwise the smaller of the two values is then stored in the database.
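The two comparison rules can disagree on the same pair of values. The short sketch below uses Python's built-in max() to model both orderings on two packed integers; it assumes Python 3 and does not call the database.

```python
import struct

a = struct.pack('<I', 256)   # bytes 00 01 00 00
b = struct.pack('<I', 1)     # bytes 01 00 00 00


def as_unsigned(raw):
    """Interpret a byte string as an unsigned little-endian integer."""
    return int.from_bytes(raw, 'little')


# max()/min() in the API compare values as unsigned little-endian integers.
numeric_max = max(a, b, key=as_unsigned)

# byte_max()/byte_min() compare the raw bytes lexicographically.
lexicographic_max = max(a, b)
```

Here the integer comparison picks a (256 is larger than 1), while the lexicographic comparison picks b, because its first byte is larger.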

Transaction.set_versionstamped_key(key, param)

Transforms key using a versionstamp for the transaction. This key must be at least 14 bytes long. The final 4 bytes will be interpreted as a 32-bit little-endian integer denoting an index into the key at which to perform the transformation, and then trimmed off the key. The 10 bytes in the key beginning at the index will be overwritten with the versionstamp. If the index plus 10 bytes points past the end of the key, the result will be an error. Sets the transformed key in the database to param.

A versionstamp is a 10 byte, unique, monotonically (but not sequentially) increasing value for each committed transaction. The first 8 bytes are the committed version of the database (serialized in big-endian order). The last 2 bytes are monotonic in the serialization order for transactions (serialized in big-endian order).

A transaction is not permitted to read any transformed key or value previously set within that transaction, and an attempt to do so will result in an accessed_unreadable error. The range of keys marked unreadable when setting a versionstamped key begins at the transaction’s read version if it is known, otherwise a versionstamp of all 0x00 bytes is conservatively assumed. The upper bound of the unreadable range is a versionstamp of all 0xFF bytes.

Warning

At this time, versionstamped keys are not compatible with the Tuple layer except in Java, Python, and Go. Note that this implies versionstamped keys may not be used with the Subspace and Directory layers except in those languages.
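The key layout described above (a 10-byte region to overwrite, followed at the very end by a 4-byte little-endian index) can be assembled by hand with struct. This is a minimal sketch assuming Python 3; versionstamped_key and emulate_transform are hypothetical helpers, and the second one only models the documented transformation so the layout can be checked without a database.

```python
import struct

PLACEHOLDER = b'\x00' * 10  # region that will be overwritten with the versionstamp


def versionstamped_key(prefix, suffix=b''):
    """Build the key argument for set_versionstamped_key()."""
    # The trailing 4 bytes give the little-endian index of the placeholder.
    return prefix + PLACEHOLDER + suffix + struct.pack('<L', len(prefix))


def emulate_transform(raw_key, versionstamp):
    """Model the transformation described above (illustration only)."""
    body, index = raw_key[:-4], struct.unpack('<L', raw_key[-4:])[0]
    if index + 10 > len(body):
        raise ValueError('versionstamp would extend past the end of the key')
    return body[:index] + versionstamp + body[index + 10:]


raw = versionstamped_key(b'log/', b'/entry')
# With a real transaction: tr.set_versionstamped_key(raw, b'payload')
```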

Transaction.set_versionstamped_value(key, param)

Transforms param using a versionstamp for the transaction. This parameter must be at least 14 bytes long. The final 4 bytes will be interpreted as a 32-bit little-endian integer denoting an index into the parameter at which to perform the transformation, and then trimmed off the parameter. The 10 bytes in the parameter beginning at the index will be overwritten with the versionstamp. If the index plus 10 bytes points past the end of the parameter, the result will be an error. Sets key in the database to the transformed parameter.

A versionstamp is a 10 byte, unique, monotonically (but not sequentially) increasing value for each committed transaction. The first 8 bytes are the committed version of the database (serialized in big-endian order). The last 2 bytes are monotonic in the serialization order for transactions (serialized in big-endian order).

A transaction is not permitted to read any transformed key or value previously set within that transaction, and an attempt to do so will result in an accessed_unreadable error. The range of keys marked unreadable when setting a versionstamped key begins at the transaction’s read version if it is known, otherwise a versionstamp of all 0x00 bytes is conservatively assumed. The upper bound of the unreadable range is a versionstamp of all 0xFF bytes.

Warning

At this time, versionstamped values are not compatible with the Tuple layer except in Java, Python, and Go. Note that this implies versionstamped values may not be used with the Subspace and Directory layers except in those languages.
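Since both parts of a versionstamp are serialized in big-endian order, a 10-byte versionstamp can be unpacked with a single struct format, and raw versionstamps sort in the same order as the transactions that produced them. The sketch below assumes Python 3; split_versionstamp is a hypothetical helper name, and the sample values are made up for illustration.

```python
import struct


def split_versionstamp(stamp):
    """Split a 10-byte versionstamp into (commit_version, order_within_version)."""
    if len(stamp) != 10:
        raise ValueError('a versionstamp is exactly 10 bytes')
    # 8-byte big-endian commit version, then a 2-byte big-endian ordering field.
    return struct.unpack('>QH', stamp)


# Made-up sample values, packed the same way for demonstration.
earlier = struct.pack('>QH', 1000, 7)
later = struct.pack('>QH', 1001, 0)
```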

Committing

Transaction.commit()

Attempt to commit the changes made in the transaction to the database. Returns a FutureVoid representing the asynchronous result of the commit. You must call the Future.wait() method on the returned FutureVoid, which will raise an exception if the commit failed.

As with other client/server databases, in some failure scenarios a client may be unable to determine whether a transaction succeeded. In these cases, Transaction.commit() will raise a commit_unknown_result exception. The Transaction.on_error() function treats this exception as retryable, so retry loops that don’t check for commit_unknown_result could execute the transaction twice. In these cases, you must consider the idempotence of the transaction. For more information, see Transactions with unknown results.

Normally, commit will wait for outstanding reads to return. However, if those reads were snapshot reads or the transaction option for disabling “read-your-writes” has been invoked, any outstanding reads will immediately return errors.

Note

Consider using the @fdb.transactional decorator, which not only calls Database.create_transaction() or Tenant.create_transaction() and Transaction.commit() for you but also implements the required error handling and retry logic for transactions.

Warning

If any operation is performed on a transaction after a commit has been issued but before it has returned, both the commit and the operation will raise a used_during_commit exception. In this case, all subsequent operations on this transaction will raise this error until reset is called.

Transaction.on_error(exception)

Determine whether an exception raised by a Transaction method is retryable. Returns a FutureVoid. You must call the Future.wait() method on the FutureVoid, which will return after a delay if the exception was retryable, or re-raise the exception if it was not.

Note

Consider using the @fdb.transactional decorator, which calls this method for you.
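For readers who want to see how commit() and on_error() fit together, the following is a rough sketch of a manual retry loop. It is not the decorator's actual implementation. The function name run_with_retry and its error_type parameter are hypothetical; with the real binding you would typically catch fdb.FDBError rather than the default used here.

```python
def run_with_retry(db, body, error_type=Exception):
    """Run body(tr) in a manual retry loop and return its result."""
    tr = db.create_transaction()
    while True:
        try:
            result = body(tr)
            # wait() raises an exception if the commit failed.
            tr.commit().wait()
            return result
        except error_type as e:
            # wait() returns after a delay if e is retryable; otherwise it
            # re-raises e, which propagates out of the loop.
            tr.on_error(e).wait()
```

Because commit_unknown_result is treated as retryable, body may be executed more than once even though an earlier attempt actually committed, so it should be idempotent.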

Transaction.reset()

Rolls back a transaction, completely resetting it to its initial state. This is logically equivalent to destroying the transaction and creating a new one.

Transaction.cancel()

Cancels the transaction. All pending or future uses of the transaction will raise a transaction_cancelled exception. The transaction can be used again after it is reset.

Warning

Be careful if you are using Transaction.reset() and Transaction.cancel() concurrently with the same transaction. Since they negate each other’s effects, a race condition between these calls will leave the transaction in an unknown state.

Warning

If your program attempts to cancel a transaction after Transaction.commit() has been called but before it returns, unpredictable behavior will result. While it is guaranteed that the transaction will eventually end up in a cancelled state, the commit may or may not occur. Moreover, even if the call to Transaction.commit() appears to raise a transaction_cancelled exception, the commit may have occurred or may occur in the future. This can make it more difficult to reason about the order in which transactions occur.

Watches

Transaction.watch(key)

Creates a watch and returns a FutureVoid that will become ready when the watch reports a change to the value of the specified key.

A watch’s behavior is relative to the transaction that created it. A watch will report a change in relation to the key’s value as readable by that transaction. The initial value used for comparison is either that of the transaction’s read version or the value as modified by the transaction itself prior to the creation of the watch. If the value changes and then changes back to its initial value, the watch might not report the change.

Until the transaction that created it has been committed, a watch will not report changes made by other transactions. In contrast, a watch will immediately report changes made by the transaction itself. Watches cannot be created if the transaction has set Transaction.options.set_read_your_writes_disable(), and an attempt to do so will raise a watches_disabled exception.

If the transaction used to create a watch encounters an exception during commit, then the watch will be set with that exception. A transaction whose commit result is unknown will set all of its watches with the commit_unknown_result exception. If an uncommitted transaction is reset or destroyed, then any watches it created will be set with the transaction_cancelled exception.

By default, each database connection can have no more than 10,000 watches that have not yet reported a change. When this number is exceeded, an attempt to create a watch will raise a too_many_watches exception. This limit can be changed using Database.options.set_max_watches(). Because a watch outlives the transaction that creates it, any watch that is no longer needed should be cancelled by calling Future.cancel() on its returned future.
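A minimal sketch of the watch lifecycle described above might look like the following. The function name watch_key is hypothetical, and error handling and retries are deliberately left out to keep the sequence of calls visible.

```python
def watch_key(db, key):
    """Create a watch on key and return its future once the watch is active."""
    tr = db.create_transaction()
    watch = tr.watch(key)
    # Changes made by other transactions are reported only after this commit.
    tr.commit().wait()
    return watch
```

The caller can then block with watch.wait() until the value changes, or call watch.cancel() if the watch is no longer needed, so that it does not count against the per-connection limit.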

Conflict ranges

Note

Most applications will use the strictly serializable isolation that transactions provide by default and will not need to manipulate conflict ranges.

The following make it possible to add conflict ranges to a transaction.

Transaction.add_read_conflict_range(begin, end)

Adds a range of keys to the transaction’s read conflict ranges as if you had read the range. As a result, other transactions that write a key in this range could cause the transaction to fail with a conflict.

Transaction.add_read_conflict_key(key)

Adds a key to the transaction’s read conflict ranges as if you had read the key. As a result, other transactions that concurrently write this key could cause the transaction to fail with a conflict.

Transaction.add_write_conflict_range(begin, end)

Adds a range of keys to the transaction’s write conflict ranges as if you had cleared the range. As a result, other transactions that concurrently read a key in this range could fail with a conflict.

Transaction.add_write_conflict_key(key)

Adds a key to the transaction’s write conflict ranges as if you had written the key. As a result, other transactions that concurrently read this key could fail with a conflict.

Versions

Most applications should use the read version that FoundationDB determines automatically during the transaction’s first read, and ignore all of these methods.

Transaction.set_read_version(version)

Infrequently used. Sets the database version that the transaction will read from the database. The database cannot guarantee causal consistency if this method is used (the transaction’s reads will be causally consistent only if the provided read version has that property).

Transaction.get_read_version()

Infrequently used. Returns a FutureVersion representing the transaction’s (future) read version. You must call the Future.wait() method on the returned object to retrieve the version as an integer.

Transaction.get_committed_version()

Infrequently used. Gets the version number at which a successful commit modified the database. This must be called only after the successful (non-error) completion of a call to Transaction.commit() on this Transaction, or the behavior is undefined. Read-only transactions do not modify the database when committed and will have a committed version of -1. Keep in mind that a transaction which reads keys and then sets them to their current values may be optimized to a read-only transaction.

Transaction.get_versionstamp()

Infrequently used. Returns a future which will contain the versionstamp which was used by any versionstamp operations in this transaction. This function must be called before a call to Transaction.commit() on this Transaction. The future will be ready only after the successful completion of a call to Transaction.commit() on this Transaction. Read-only transactions do not modify the database when committed and will result in the future completing with an error. Keep in mind that a transaction which reads keys and then sets them to their current values may be optimized to a read-only transaction.
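The ordering requirement is easy to get wrong: the future has to be requested before the commit but can only be read after it. A small sketch of the call order, with a hypothetical helper name, is shown below.

```python
def commit_and_get_versionstamp(tr):
    """Commit tr and return the versionstamp used by its versionstamp operations."""
    # The future must be requested before commit() is called...
    stamp_future = tr.get_versionstamp()
    tr.commit().wait()
    # ...but it becomes ready only after the commit has completed successfully.
    return stamp_future.wait()
```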

Transaction misc functions

Transaction.get_estimated_range_size_bytes(begin_key, end_key)

Gets the estimated byte size of the given key range. Returns a FutureInt64.

Note

The estimated size is calculated from sampling done by the FDB server. Roughly, the larger a key-value pair is, the more likely it is to be sampled and the more accurate its sampled size will be. For this reason, this API is best used to query large ranges. As a rough reference, if the returned size is larger than 3MB, the size can be considered accurate.

Transaction.get_range_split_points(begin_key, end_key, chunk_size)

Gets a list of keys that can split the given range into (roughly) equally sized chunks based on chunk_size. Returns a FutureKeyArray.

Note

The returned split points contain the start key and end key of the given range.
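Because the returned list includes both the start key and the end key of the range, consecutive pairs of split points form the chunk boundaries. A small sketch of that pairing in plain Python follows; chunk_ranges is a hypothetical helper, and the list of points is a stand-in for the result of waiting on the returned future.

```python
def chunk_ranges(split_points):
    """Turn [p0, p1, ..., pn] into [(p0, p1), (p1, p2), ..., (pn-1, pn)]."""
    return list(zip(split_points, split_points[1:]))


# Stand-in for: tr.get_range_split_points(b'a', b'z', chunk_size).wait()
points = [b'a', b'f', b'm', b'z']
ranges = chunk_ranges(points)
```

Each (begin, end) pair can then be handed to a separate range read, for example to process the chunks in parallel.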

Transaction.get_approximate_size()

Gets the approximate transaction size so far, which is the summation of the estimated size of mutations, read conflict ranges, and write conflict ranges. Returns a FutureInt64.

Transaction options

Transaction options alter the behavior of FoundationDB transactions. FoundationDB defaults to extremely safe transaction behavior, and we have worked hard to make the performance excellent with the default setting, so you should not often need to use transaction options.

Transaction.options.set_snapshot_ryw_disable()

If this option is set more times in this transaction than the enable option, snapshot reads will not see the effects of prior writes in the same transaction. Note that prior to API version 300, this was the default behavior. This option can be disabled one or more times at the database level by calling Database.options.set_snapshot_ryw_disable().

Transaction.options.set_snapshot_ryw_enable()

If this option is set at least as many times in this transaction as the disable option, snapshot reads will see the effects of prior writes in the same transaction. This option can be enabled one or more times at the database level by calling Database.options.set_snapshot_ryw_enable().

Transaction.options.set_priority_batch()

This transaction should be treated as low priority (other transactions will be processed first). Batch priority transactions will also be throttled at load levels smaller than for other types of transactions and may be fully cut off in the event of machine failures. Useful for doing potentially saturating batch work without interfering with the latency of other operations.

Transaction.options.set_priority_system_immediate()

This transaction should be treated as extremely high priority, taking priority over other transactions and bypassing controls on transaction queuing.

Warning

This is intended for the use of internal database functions and low-level tools; use by applications may result in severe database performance or availability problems.

Transaction.options.set_causal_read_risky()

This transaction does not require the strict causal consistency guarantee that FoundationDB provides by default. The read version will be committed, and usually will be the latest committed, but might not be the latest committed in the event of a simultaneous fault and misbehaving clock. One can set this for all transactions by calling Database.options.set_transaction_causal_read_risky().

Transaction.options.set_causal_write_risky()

The application either knows that this transaction will be self-conflicting (at least one read overlaps at least one set or clear), or is willing to accept a small risk that the transaction could be committed a second time after its commit apparently succeeds. This option provides a small performance benefit.

Transaction.options.set_next_write_no_write_conflict_range()

The next write performed on this transaction will not generate a write conflict range. As a result, other transactions which read the key(s) being modified by the next write will not necessarily conflict with this transaction.

Note

Care needs to be taken when using this option on a transaction that is shared between multiple threads. When setting this option, write conflict ranges will be disabled on the next write operation, regardless of what thread it is on.

Transaction.options.set_read_your_writes_disable()

When this option is invoked, a read performed by a transaction will not see any prior mutations that occurred in that transaction, instead seeing the value which was in the database at the transaction’s read version. This option may provide a small performance benefit for the client, but also disables a number of client-side optimizations which are beneficial for transactions which tend to read and write the same keys within a single transaction.

Note

It is an error to set this option after performing any reads or writes on the transaction.

Transaction.options.set_read_ahead_disable()

Disables read-ahead caching for range reads. Under normal operation, a transaction will read extra rows from the database into cache if range reads are used to page through a series of data one row at a time (i.e. if a range read with a one row limit is followed by another one row range read starting immediately after the result of the first).

Transaction.options.set_access_system_keys()

Allows this transaction to read and modify system keys (those that start with the byte 0xFF).

Warning

Writing into system keys will likely break your database. Further, even for readers, the format of data in the system keys may change from version to version in FoundationDB.

Transaction.options.set_read_system_keys()

Allows this transaction to read system keys (those that start with the byte 0xFF).

Warning

The format of data in the system keys may change from version to version in FoundationDB.

Transaction.options.set_retry_limit()

Set a maximum number of retries after which additional calls to Transaction.on_error() will throw the most recently seen error code. (By default, a transaction permits an unlimited number of retries.) Valid parameter values are [-1, INT_MAX]. If set to -1, the transaction returns to the default of unlimited retries.

Prior to API version 610, like all other transaction options, the retry limit must be reset after a call to Transaction.on_error(). If the API version is 610 or newer, then the retry limit is not reset. Note that at all API versions, it is safe and legal to call this option after each call to Transaction.on_error(), so most code written assuming the older behavior can be upgraded without requiring any modification. This also means there is no need to introduce logic to conditionally set this option within retry loops. One can also set the default retry limit for all transactions by calling Database.options.set_transaction_retry_limit().

Transaction.options.set_max_retry_delay()

Set the maximum backoff delay incurred in the call to Transaction.on_error() if the error is retryable. Prior to API version 610, like all other transaction options, the maximum retry delay must be reset after a call to Transaction.on_error(). If the API version is 610 or newer, then the maximum retry delay is not reset. Note that at all API versions, it is safe and legal to call this option after each call to Transaction.on_error(), so most code written assuming the older behavior can be upgraded without requiring any modification. This also means there is no need to introduce logic to conditionally set this option within retry loops. One can set the default maximum retry delay for all transactions by calling Database.options.set_transaction_max_retry_delay().

Transaction.options.set_size_limit()

Set the transaction size limit in bytes. The size is calculated by combining the sizes of all keys and values written or mutated, all key ranges cleared, and all read and write conflict ranges. (In other words, it includes the total size of all data included in the request to the cluster to commit the transaction.) Large transactions can cause performance problems on FoundationDB clusters, so setting this limit to a smaller value than the default can help prevent the client from accidentally degrading the cluster’s performance. This value must be at least 32 and cannot be set to higher than 10,000,000, the default transaction size limit.

Transaction.options.set_timeout()

Set a timeout duration in milliseconds after which the transaction is automatically cancelled. The time is measured from transaction creation (or the most recent call to reset, if any). Valid parameter values are [0, INT_MAX]. If set to 0, all timeouts will be disabled. Once a transaction has timed out, all pending or future uses of the transaction will raise a transaction_timed_out exception. The transaction can be used again after it is reset.

Timeouts employ transaction cancellation, so you should note the issues raised by Transaction.cancel() when using timeouts.

Prior to API version 610, like all other transaction options, a timeout must be reset after a call to Transaction.on_error(). Note that resetting this option resets only the timeout duration, not the starting point from which the time is measured. If the API version is 610 or newer, then the timeout is not reset. This allows the user to specify a timeout for specific transactions that is longer than the timeout specified by Database.options.set_transaction_timeout(). Note that at all API versions, it is safe and legal to call this option after each call to Transaction.on_error(), so most code written assuming the older behavior can be upgraded without requiring any modification. This also means that there is no need to introduce logic to conditionally set this option within retry loops. One can set the default timeout for all transactions by calling Database.options.set_transaction_timeout().
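Since it is safe at every API version to set these options after each call to Transaction.on_error(), a hand-written retry loop can simply apply them unconditionally at the top of every attempt. The sketch below shows one way to group the calls; apply_limits is a hypothetical helper and the default values are arbitrary illustrations, not recommendations.

```python
def apply_limits(tr, retry_limit=5, timeout_ms=3000):
    """Apply a retry limit and a timeout to tr; safe to call on every attempt."""
    # -1 would restore the default of unlimited retries.
    tr.options.set_retry_limit(retry_limit)
    # The timeout is given in milliseconds; 0 disables all timeouts.
    tr.options.set_timeout(timeout_ms)
```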

Transaction.options.set_transaction_logging_max_field_length(size_limit)

Sets the maximum escaped length of key and value fields to be logged to the trace file via the LOG_TRANSACTION option, after which the field will be truncated. A negative value disables truncation. One can set the default max field length for all transactions by calling Database.options.set_transaction_logging_max_field_length().

Transaction.options.set_debug_transaction_identifier(id_string)

Sets a client provided string identifier for the transaction that will be used in scenarios like tracing or profiling. Client trace logging or transaction profiling must be separately enabled.

Transaction.options.set_log_transaction()

Enables tracing for this transaction and logs results to the client trace logs. The DEBUG_TRANSACTION_IDENTIFIER option must be set before using this option, and client trace logging must be enabled to get log output.

Future objects

Many FoundationDB API functions return “future” objects. A brief overview of futures is included in the class scheduling tutorial. Most future objects behave just like a normal object, but block when you use them for the first time if the asynchronous function which returned the future has not yet completed its action. A future object is considered ready when either a value is available, or when an error has occurred.

When a future object “blocks”, what actually happens is determined by the event model. A threaded program will block a thread, but a program using the gevent model will block a greenlet.

All future objects are a subclass of the Future type.

class fdb.Future

Future.wait()

Blocks until the object is ready, and returns the object value (or raises an exception if the asynchronous function failed).

Future.is_ready()

Immediately returns true if the future object is ready, false otherwise.

Future.block_until_ready()

Blocks until the future object is ready.

Future.on_ready(callback)

Calls the specified callback function, passing itself as a single argument, when the future object is ready. If the future object is ready at the time on_ready() is called, the call may occur immediately in the current thread (although this behavior is not guaranteed). Otherwise, the call may be delayed and take place on the thread with which the client was initialized. Therefore, the callback is responsible for any needed thread synchronization (and/or for posting work to your application’s event loop, thread pool, etc., as may be required by your application’s architecture).

Note

This function guarantees the callback will be executed at most once.

Warning

There are a number of requirements and constraints to be aware of when using callbacks with FoundationDB. Please read Programming with futures.
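One common way to meet the synchronization requirement is to have the callback do nothing except hand the ready future to a thread-safe queue, leaving the real work to a consumer thread. The sketch below assumes Python 3 (where the standard-library module is named queue); forward_when_ready is a hypothetical helper name.

```python
import queue


def forward_when_ready(future, results):
    """Hand a ready future to another thread through a thread-safe queue."""
    # The callback receives the future itself as its single argument. It may
    # run on a thread other than the caller's, so it only enqueues the future
    # and returns; the consumer calls wait() on it to get the value or error.
    future.on_ready(results.put)


results = queue.Queue()
# With a real future: forward_when_ready(tr.get(b'key'), results)
# and in the consumer thread: value = results.get().wait()
```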

Future.cancel()

Cancels a future and its associated asynchronous operation. If called before the future is ready, attempts to access its value will raise an operation_cancelled exception. Cancelling a future which is already ready has no effect. Note that even if a future is not ready, its associated asynchronous operation may have successfully completed and be unable to be cancelled.

static Future.wait_for_any(*futures)

Does not return until at least one of the given future objects is ready. Returns the index in the parameter list of a ready future object.

Asynchronous methods return one of the following subclasses of Future:

class fdb.Value

Represents a future string object and responds to the same methods as a Python string. Value objects may be passed to FoundationDB methods that expect a string.

Value.present()

Returns False if the key used to request this value was not present in the database. For example:

@fdb.transactional
def foo(tr):
    val = tr[b'foo']
    if val.present():
        # print() with a single argument works on Python 2.7 and 3.x
        print('Got value: %s' % val)
    else:
        print('foo was not present')

class fdb.Key

Represents a future string object and responds to the same methods as a Python string. Key objects may be passed to FoundationDB methods that expect a string.

class fdb.FutureInt64

Represents a future integer. You must call the Future.wait() method on this object to retrieve the integer.

class fdb.FutureStringArray

Represents a future list of strings. You must call the Future.wait() method on this object to retrieve the list of strings.

class fdb.FutureVoid

Represents a future returned from asynchronous methods that logically have no return value.

For a FutureVoid object returned by Transaction.commit() or Transaction.on_error(), you must call the Future.wait() method, which will either raise an exception if an error occurred during the asynchronous call, or do nothing and return None.

Streaming modes

fdb.StreamingMode

When using Transaction.get_range() and similar interfaces, API clients can request large ranges of the database to iterate over. Making such a request doesn’t necessarily mean that the client will consume all of the data in the range - sometimes the client doesn’t know how far it intends to iterate in advance. FoundationDB tries to balance latency and bandwidth by requesting data for iteration in batches.

Streaming modes permit the API client to customize this performance tradeoff by providing extra information about how the iterator will be used.

The following streaming modes are available:

StreamingMode.iterator

The default. The client doesn’t know how much of the range it is likely to use and wants different performance concerns to be balanced.

Only a small portion of data is transferred to the client initially (in order to minimize costs if the client doesn’t read the entire range), and as the caller iterates over more items in the range larger batches will be transferred in order to maximize throughput.

StreamingMode.want_all

The client intends to consume the entire range and would like it all transferred as early as possible.

StreamingMode.small

Infrequently used. Transfer data in batches small enough to not be much more expensive than reading individual rows, to minimize cost if iteration stops early.

StreamingMode.medium

Infrequently used. Transfer data in batches sized in between small and large.

StreamingMode.large

Infrequently used. Transfer data in batches large enough to be, in a high-concurrency environment, nearly as efficient as possible. If the client stops iteration early, some disk and network bandwidth may be wasted. The batch size may still be too small to allow a single client to get high throughput from the database, so if that is what you need, consider StreamingMode.serial.

StreamingMode.serial

Transfer data in batches large enough that an individual client can get reasonable read bandwidth from the database. If the client stops iteration early, considerable disk and network bandwidth may be wasted.

StreamingMode.exact

Infrequently used. The client has passed a specific row limit and wants that many rows delivered in a single batch. This is not particularly useful in Python because iterator functionality makes batches of data transparent, so use StreamingMode.want_all instead.
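
The trade-off behind StreamingMode.iterator can be illustrated without a database. The following is a minimal sketch, not the client library's implementation: the function name, the starting batch size, the growth factor, and the cap are all assumptions chosen for illustration. It simulates fetching a range in batches that grow as the caller keeps iterating, and counts how many rows were transferred.

```python
def read_in_growing_batches(source, stats, first_batch=2, growth=2, max_batch=64):
    """Yield rows from `source`, fetching them in batches that grow over time."""
    position = 0
    batch_size = first_batch
    while position < len(source):
        # One simulated round trip to the database.
        batch = source[position:position + batch_size]
        stats["round_trips"] += 1
        stats["rows_fetched"] += len(batch)
        position += len(batch)
        for row in batch:
            yield row
        # Later batches are larger, trading latency for throughput.
        batch_size = min(batch_size * growth, max_batch)


rows = list(range(100))

# A caller that stops after the third row pays for only two small batches.
early = {"round_trips": 0, "rows_fetched": 0}
for row in read_in_growing_batches(rows, early):
    if row == 2:
        break

# A caller that reads everything benefits from the larger later batches.
full = {"round_trips": 0, "rows_fetched": 0}
consumed = list(read_in_growing_batches(rows, full))
```

With these illustrative parameters, the early stop transfers 6 rows in 2 round trips, while the full scan of 100 rows needs 6 round trips.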

Event models

By default, the FoundationDB Python API assumes that the calling program uses threads (as provided by the threading module) for concurrency. This means that blocking operations will block the current Python thread. This behavior can be changed by specifying the optional event_model parameter to the open() function.

The following event models are available:

event_model=None

The default. Blocking operations will block the current Python thread. This is also fine for programs without any form of concurrency.

event_model="gevent"

The calling program uses the gevent module for single-threaded concurrency. Blocking operations will block the current greenlet.

The FoundationDB Python API has been tested with gevent versions 0.13.8 and 1.0rc2 and should work with all gevent 0.13 and 1.0 releases.

Note

The gevent event model on Windows requires gevent 1.0 or newer.

event_model="debug"

The calling program is threaded, but needs to be interruptible (by Ctrl-C). Blocking operations will poll, effectively blocking the current thread but responding to keyboard interrupts. This model is inefficient, but can be very useful for debugging.

Errors

Errors in the FoundationDB API are raised as exceptions of type FDBError. These errors may be displayed for diagnostic purposes, but generally should be passed to Transaction.on_error(). When using @fdb.transactional, appropriate errors will be retried automatically.

class fdb.FDBError

FDBError.code

An integer associated with the error type.

FDBError.description

A somewhat human-readable description of the error.

Warning

You should use only FDBError.code for programmatic comparisons, as the description of the error may change at any time. Whenever possible, use the Transaction.on_error() method to handle FDBError exceptions.
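
The advice above can be sketched as a manual retry loop. This is a hedged illustration rather than the binding's own code: run_with_retry and its error_type parameter are hypothetical, and the Fake* classes and the error code 1020 are stand-ins that exist only so the sketch runs without a cluster. The loop relies only on create_transaction(), commit(), on_error(), and wait().

```python
def run_with_retry(db, body, error_type):
    """Run body(tr) and commit, passing errors of error_type to on_error()."""
    tr = db.create_transaction()
    while True:
        try:
            result = body(tr)
            tr.commit().wait()
            return result
        except error_type as e:
            # on_error() is expected to re-raise errors that are not retryable;
            # otherwise the loop simply tries again.
            tr.on_error(e).wait()


# --- Hypothetical stand-ins so the sketch runs without a cluster ---
class FakeError(Exception):
    def __init__(self, code):
        super().__init__(code)
        self.code = code


class FakeFuture:
    def __init__(self, error=None):
        self._error = error

    def wait(self):
        if self._error is not None:
            raise self._error


class FakeTransaction:
    RETRYABLE = {1020}  # illustrative code treated as retryable in this sketch

    def __init__(self, failures, code=1020):
        self.failures = failures  # number of commits that fail before success
        self.code = code
        self.attempts = 0

    def commit(self):
        self.attempts += 1
        if self.failures > 0:
            self.failures -= 1
            return FakeFuture(FakeError(self.code))
        return FakeFuture()

    def on_error(self, error):
        if error.code in self.RETRYABLE:
            return FakeFuture()
        return FakeFuture(error)


class FakeDatabase:
    def __init__(self, tr):
        self.tr = tr

    def create_transaction(self):
        return self.tr


tr = FakeTransaction(failures=2)
result = run_with_retry(FakeDatabase(tr), lambda t: "done", FakeError)
```

With the real binding, db would be a Database object and error_type would be fdb.FDBError. The @fdb.transactional decorator performs this kind of retry automatically.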

Tuple layer

The FoundationDB API comes with a built-in layer for encoding tuples into keys usable by FoundationDB. The encoded key maintains the same sort order as the original tuple: sorted first by the first element, then by the second element, etc. This makes the tuple layer ideal for building a variety of higher-level data models.
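
The order-preserving property can be demonstrated with a deliberately simplified encoder. The sketch below is not fdb.tuple and its byte layout should not be relied upon: toy_pack is a hypothetical function that handles only byte strings and small non-negative integers. It shows that sorting tuples element by element and sorting their encoded bytes give the same order.

```python
def toy_pack(items):
    """Encode byte strings and small non-negative ints in an order-preserving way."""
    out = b""
    for item in items:
        if isinstance(item, bytes):
            # Type code, escaped payload, terminator. Escaping NUL bytes keeps
            # the terminator unambiguous, and the low terminator makes a
            # string sort before any longer string that it prefixes.
            out += b"\x01" + item.replace(b"\x00", b"\x00\xff") + b"\x00"
        elif isinstance(item, int) and not isinstance(item, bool) and 0 <= item < 2**64:
            # The type code grows with the byte length, so integers that need
            # more bytes (and are therefore larger) sort later.
            size = (item.bit_length() + 7) // 8
            out += bytes([0x14 + size]) + item.to_bytes(size, "big")
        else:
            raise TypeError("toy_pack supports only bytes and small non-negative ints")
    return out


tuples = [(b"b", 1), (b"a", 300), (b"a", 2), (b"a",), (b"ab",)]
by_tuple_order = sorted(tuples)
by_packed_bytes = sorted(tuples, key=toy_pack)
```

Both sorts place (b"a",) first and (b"b", 1) last, which is the behavior a higher-level data model depends on when it scans keys in order.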

Note

For general guidance on tuple usage, see the discussion in the document on Data Modeling.

The tuple layer in the FoundationDB Python API supports tuples that contain elements of the following data types:

Type

Legal Values

None

Any value such that value == None

Byte string

Any value such that isinstance(value, bytes)

Unicode string

Any value such that isinstance(value, unicode)

Integer

Python 2.7: Any value such that isinstance(value, (int,long)) and -2**2040+1 <= value <= 2**2040-1. Python 3.x: Any value such that isinstance(value, int) and -2**2040+1 <= value <= 2**2040-1.

Floating point number (single-precision)

Any value such that isinstance(value, fdb.tuple.SingleFloat) or isinstance(value, ctypes.c_float)

Floating point number (double-precision)

Any value such that isinstance(value, (ctypes.c_double, float))

Boolean

Any value such that isinstance(value, bool)

UUID

Any value such that isinstance(value, uuid.UUID)

Versionstamp

Any value such that isinstance(value, fdb.tuple.Versionstamp)

Tuple or List

Any value such that isinstance(value, (tuple, list)) and each element within value is one of the supported types with a legal value.

If T is a Python tuple meeting these criteria, then:

fdb.tuple.compare(T, fdb.tuple.unpack(fdb.tuple.pack(T))) == 0

That is, any tuple meeting these criteria will have the same semantic value if serialized and deserialized. For the most part, this also implies that T == fdb.tuple.unpack(fdb.tuple.pack(T)) with the following caveats:

  • Any value of type ctypes.c_double is converted to the Python float type, but value.value == fdb.tuple.unpack(fdb.tuple.pack((value,)))[0] will be true (as long as value is not NaN).

  • Any value of type ctypes.c_float is converted into a fdb.tuple.SingleFloat instance, but value.value == fdb.tuple.unpack(fdb.tuple.pack((value,)))[0].value will be true (as long as value.value is not NaN).

  • Any value of type list or tuple is converted to a tuple type where the elements of the serialized and deserialized value will be equal (subject to these caveats) to the elements of the original value.

import fdb.tuple

Imports the FoundationDB tuple layer.

fdb.tuple.pack(tuple, prefix=b'')

Returns a key (byte string) encoding the specified tuple. If prefix is set, it will prefix the serialized bytes with the prefix string. This throws an error if any of the tuple’s items are incomplete Versionstamp instances.

fdb.tuple.pack_with_versionstamp(tuple, prefix=b'')

Returns a key (byte string) encoding the specified tuple. This method will throw an error unless exactly one of the items of the tuple is an incomplete Versionstamp instance. (It will recurse down nested tuples if there are any to find one.) If so, it will produce a byte string that can be fed into fdb.Transaction.set_versionstamped_key() and correctly fill in the versionstamp information at commit time so that when the key is re-read and deserialized, the only difference is that the Versionstamp instance is complete and has the transaction version filled in. This throws an error if there are no incomplete Versionstamp instances in the tuple or if there is more than one.

fdb.tuple.unpack(key)

Returns the tuple encoded by the given key.

fdb.tuple.has_incomplete_versionstamp(tuple)

Returns True if there is at least one element contained within the tuple that is a Versionstamp instance that is incomplete. If there are multiple incomplete Versionstamp instances, this method will return True, but trying to pack it into a byte string will result in an error.

fdb.tuple.range(tuple)

Returns a Python slice object representing all keys that encode tuples strictly starting with tuple (that is, all tuples of greater length than tuple of which tuple is a prefix).

Can be used to directly index a Transaction object to retrieve a range. For example:

tr[ fdb.tuple.range(('A',2)) ]

returns all key-value pairs in the database whose keys would unpack to tuples like ('A', 2, x), ('A', 2, x, y), etc.
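
The meaning of "strictly starting with" can be shown on plain byte strings. The sketch below assumes that the range is formed by appending \x00 and \xff to the packed prefix; prefix_range and keys_in are hypothetical helpers, and the keys are hand-written bytes standing in for packed tuples (in real code they would come from fdb.tuple.pack()).

```python
def prefix_range(packed_prefix):
    # Assumed bounds: every key that extends the prefix with an encoded
    # element falls between these two keys.
    return slice(packed_prefix + b"\x00", packed_prefix + b"\xff")


def keys_in(keys, key_range):
    # Same half-open convention as a database range read: start <= k < stop.
    return [k for k in sorted(keys) if key_range.start <= k < key_range.stop]


prefix = b"\x01A\x00\x15\x02"  # stand-in for a packed two-element tuple
keys = [
    prefix,                     # the prefix tuple itself
    prefix + b"\x01x\x00",      # prefix plus one more element
    prefix + b"\x15\x07",       # prefix plus a different element
    b"\x01A\x00\x15\x03",       # a sibling tuple, not an extension
    b"\x01B\x00",               # an unrelated tuple
]
selected = keys_in(keys, prefix_range(prefix))
```

Only the two keys that extend the prefix are selected. The key equal to the prefix itself is excluded, which matches the "greater length" wording above.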

fdb.tuple.compare(tuple1, tuple2)

Compares two tuples in a way that respects the natural ordering of the elements within the tuples. It will return -1 if tuple1 would sort before tuple2 when performing an element-wise comparison of the two tuples, it will return 1 if tuple1 would sort after tuple2, and it will return 0 if the two tuples are equivalent. If the function must compare two elements of different types while doing the comparison, it will sort the elements based on their internal type codes, so comparisons are consistent if not necessarily semantically meaningful. Strings are sorted on their byte representation when encoded into UTF-8 (which may differ from the default sort when non-ASCII characters are included within the string), and UUIDs are sorted based on their big-endian byte representation. Single-precision floating point numbers are sorted before all double-precision floating point numbers, and for floating point numbers, -NaN is sorted before -Infinity which is sorted before finite numbers which are sorted before Infinity which is sorted before NaN. Different representations of NaN are not treated as equal.

Additionally, the tuple serialization contract is such that after they are serialized, the byte-string representations of tuple1 and tuple2 will sort in a manner that is consistent with this function. In particular, this function obeys the following contract:

fdb.tuple.compare(tuple1, tuple2) == -1 if fdb.tuple.pack(tuple1) < fdb.tuple.pack(tuple2) else \
                                      0 if fdb.tuple.pack(tuple1) == fdb.tuple.pack(tuple2) else 1

As byte order is the comparator used within the database, this comparator can be used to determine the order of keys within the database.

class fdb.tuple.SingleFloat(value)

Wrapper around a single-precision floating point value. When constructed, the value parameter should either be an integral value, a float, or a ctypes.c_float. It will then properly store the value in its SingleFloat.value field (which should not be mutated). If the value cannot be represented exactly as an IEEE 754 single-precision floating point number, there may be a loss of precision.

SingleFloat.value

The underlying value of the SingleFloat object. This will have type float.

SingleFloat.__eq__(other)
SingleFloat.__ne__(other)
SingleFloat.__lt__(other)
SingleFloat.__le__(other)
SingleFloat.__gt__(other)
SingleFloat.__ge__(other)

Comparison functions for SingleFloat objects. This will sort according to the byte representation of the object rather than using standard float comparison. In particular, this means that -0.0 != 0.0 and that the NaN values will sort in a way that is consistent with the compare() method between tuples rather than using standard floating-point comparison.
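
Byte-based ordering of floats can be sketched with the struct module. The transformation below is a common technique for making IEEE 754 bytes sort numerically (flip every bit of a negative value, flip only the sign bit otherwise). Whether SingleFloat uses exactly this transformation is an assumption, and sortable_float_bytes is a hypothetical name.

```python
import struct


def sortable_float_bytes(x):
    """Return 4 bytes whose lexicographic order follows the float's order."""
    raw = struct.pack(">f", x)  # big-endian IEEE 754 single precision
    if raw[0] & 0x80:
        # Negative (sign bit set): invert all bits so that more negative
        # values produce smaller byte strings.
        return bytes(b ^ 0xFF for b in raw)
    # Non-negative: set the sign bit so these sort after all negatives.
    return bytes([raw[0] ^ 0x80]) + raw[1:]


values = [1.5, -2.0, 0.0, -0.0, float("inf"), float("-inf")]
ordered = sorted(values, key=sortable_float_bytes)
```

Under this ordering -0.0 and 0.0 have different byte representations and -0.0 sorts first, even though standard float comparison treats them as equal.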

class fdb.tuple.Versionstamp(tr_version=None, user_version=0)

Used to represent values written by versionstamp operations within the tuple layer. This wraps a single byte array of length 12 that can be used to represent some global order of items within the database. These versions are composed of two separate components: (1) the 10-byte tr_version and (2) the two-byte user_version. The tr_version is set by the database, and it is used to impose an order between different transactions. This order is guaranteed to be monotonically increasing over time for a given database. (In particular, it imposes an order that is consistent with a serialization order of the database’s transactions.) If the client elects to leave the tr_version as its default value of None, then the Versionstamp is considered “incomplete”. This will cause the first 10 bytes of the serialized Versionstamp to be filled in with dummy bytes when serialized. When used with fdb.Transaction.set_versionstamped_key(), an incomplete version can be used to ensure that a key gets written with the current transaction’s version which can be useful for maintaining append-only data structures within the database. If the tr_version is set to something that is not None, it should be set to a byte array of length 10. In this case, the Versionstamp is considered “complete”. This is the usual case when one reads a serialized Versionstamp from the database.

The user_version should be specified as an integer, but it must fit within a two-byte unsigned integer. It is set by the client, and it is used to impose an order between items serialized within a single transaction. If left unset, then the final two bytes of the serialized Versionstamp are filled in with a default (constant) value.
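
The 12-byte layout described above can be sketched with plain byte strings. This is an illustration, not the Versionstamp class: versionstamp_bytes is a hypothetical helper, and the choice of 0xff for the dummy bytes and of big-endian order for the user_version are assumptions.

```python
import struct

# Assumption: dummy bytes chosen so that an incomplete value sorts last.
DUMMY_TR_VERSION = b"\xff" * 10


def versionstamp_bytes(tr_version, user_version=0):
    """Concatenate a 10-byte tr_version and a 2-byte user_version."""
    if tr_version is None:
        tr_version = DUMMY_TR_VERSION  # "incomplete" case
    if len(tr_version) != 10:
        raise ValueError("tr_version must be exactly 10 bytes")
    if not 0 <= user_version <= 0xFFFF:
        raise ValueError("user_version must fit in two unsigned bytes")
    # Big-endian keeps byte order consistent with numeric order.
    return tr_version + struct.pack(">H", user_version)


tr_version = b"\x00" * 9 + b"\x01"
first = versionstamp_bytes(tr_version, 1)
second = versionstamp_bytes(tr_version, 2)
incomplete = versionstamp_bytes(None)
```

Two values from the same transaction differ only in their last two bytes, so user_version decides their relative order, and the incomplete value sorts after both.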

Sample usage of this class might be something like this:

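import fdb
import fdb.tuple

# Select the API version before using any other part of the API.
fdb.api_version(710)

@fdb.transactional
def write_versionstamp(tr, prefix):
    # Pack (prefix, <incomplete Versionstamp>); the version is filled in at commit time.
    tr.set_versionstamped_key(fdb.tuple.pack_with_versionstamp((prefix, fdb.tuple.Versionstamp())), b'')
    # Return the future for the versionstamp assigned to this transaction.
    return tr.get_versionstamp()

@fdb.transactional
def read_versionstamp(tr, prefix):
    subspace = fdb.Subspace((prefix,))
    # Read at most one key from the subspace and return its first tuple element.
    for k, _ in tr.get_range(subspace.range().start, subspace.range().stop, 1):
        return subspace.unpack(k)[0]
    return None

db = fdb.open()
del db[fdb.tuple.range(('prefix',))]
tr_version = write_versionstamp(db, 'prefix').wait()
v = read_versionstamp(db, 'prefix')
assert v == fdb.tuple.Versionstamp(tr_version=tr_version)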

Here, we serialize an incomplete Versionstamp and then write it using the set_versionstamped_key mutation so that it picks up the transaction’s version information. Then when we read it back, we get a complete Versionstamp with the committed transaction’s version.

Versionstamp.tr_version

The inter-transaction component of the Versionstamp class. It should be either None (to indicate an incomplete Versionstamp that will set the version later) or some 10-byte value indicating the commit version and batch version of some transaction.

Versionstamp.user_version

The intra-transaction component of the Versionstamp class. It should be some number that can fit within two bytes (i.e., between 0 and 65,535 inclusive). It can be used to impose an order between items that are committed together in the same transaction. If left unset, then the versionstamp is assigned a (constant) default user version value.

Versionstamp.from_bytes(bytes)

Static initializer for Versionstamp instances that takes a serialized Versionstamp and creates an instance of the class. The bytes parameter should be a byte string of length 12. This method will deserialize the version as a “complete” Versionstamp unless the first 10 bytes are equal to the default transaction version assigned to incomplete Versionstamps.

Versionstamp.is_complete()

Returns whether this version has been given a (non-None) tr_version or not.

Versionstamp.completed(tr_version)

If this Versionstamp is incomplete, this returns a copy of this instance except that the tr_version is filled in with the passed parameter. If the Versionstamp is already complete, it will raise an error.

Versionstamp.to_bytes()

Produces a serialized byte string corresponding to this versionstamp. It will have length 12 and will combine the tr_version and user_version to produce a byte string that lexicographically sorts appropriately with other Versionstamp instances. If this instance is incomplete, then the tr_version component gets filled in with dummy bytes that will cause it to sort after every complete Versionstamp’s serialized bytes.

Versionstamp.__eq__(other)
Versionstamp.__ne__(other)
Versionstamp.__lt__(other)
Versionstamp.__le__(other)
Versionstamp.__gt__(other)
Versionstamp.__ge__(other)

Comparison functions for Versionstamp objects. For two complete Versionstamps, the ordering is first lexicographically by tr_version and then by user_version. Incomplete Versionstamps are defined to sort after all complete Versionstamps (the idea being that for a given transaction, if a Versionstamp has been created as the result of some prior transaction’s work, then the incomplete Versionstamp, when assigned a version, will be assigned a greater version than the existing one), and for two incomplete Versionstamps, the order is by user_version only.

Subspaces

Subspaces provide a convenient way to use the tuple layer to define namespaces for different categories of data. The namespace is specified by a prefix tuple which is prepended to all tuples packed by the subspace. When unpacking a key with the subspace, the prefix tuple will be removed from the result.

As a best practice, API clients should use at least one subspace for application data.

Note

For general guidance on subspace usage, see the discussion in the Developer Guide.

class fdb.Subspace(prefixTuple=tuple(), rawPrefix='')

Creates a subspace with the specified prefix tuple. If the raw prefix byte string is specified, then it will be prepended to all packed keys. Likewise, the raw prefix will be removed from all unpacked keys.

Subspace.key()

Returns the key encoding the prefix used for the subspace. This is equivalent to packing the empty tuple.

Subspace.pack(tuple=tuple())

Returns the key encoding the specified tuple in the subspace. For example, if you have a subspace with prefix tuple ('users',) and you use it to pack the tuple ('Smith',), the result is the same as if you packed the tuple ('users', 'Smith') with the tuple layer.

Subspace.pack_with_versionstamp(tuple)

Returns the key encoding the specified tuple in the subspace so that it may be used as the key in the fdb.Transaction.set_versionstamped_key() method. The passed tuple must contain exactly one incomplete fdb.tuple.Versionstamp instance or the method will raise an error. The behavior here is the same as if one used the fdb.tuple.pack_with_versionstamp() method to appropriately pack together this subspace and the passed tuple.

Subspace.unpack(key)

Returns the tuple encoded by the given key, with the subspace’s prefix tuple and raw prefix removed.

Subspace.range(tuple=tuple())

Returns a range representing all keys in the subspace that encode tuples strictly starting with the specified tuple.

The range will be returned as a Python slice object, and may be used with any FoundationDB methods that require a range:

r = subspace.range(('A', 2))
rng_itr1 = tr[r]
rng_itr2 = tr.get_range(r.start, r.stop, limit=1)

Subspace.contains(key)

Returns true if key starts with Subspace.key(), indicating that the subspace logically contains key.
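
The prefix handling behind key(), pack(), unpack(), and contains() can be sketched with a raw byte prefix alone. ToyRawSubspace is a hypothetical stand-in, not fdb.Subspace: it skips the tuple layer entirely, so its pack() and unpack() work on plain byte suffixes rather than tuples.

```python
class ToyRawSubspace:
    """Hypothetical stand-in for a subspace that has only a raw byte prefix."""

    def __init__(self, raw_prefix):
        self.raw_prefix = raw_prefix

    def key(self):
        # The key that encodes just the prefix.
        return self.raw_prefix

    def pack(self, suffix=b""):
        # Prepend the prefix to every key written through the subspace.
        return self.raw_prefix + suffix

    def contains(self, key):
        # A key belongs to the subspace if it starts with the prefix.
        return key.startswith(self.raw_prefix)

    def unpack(self, key):
        # Strip the prefix again when reading a key back.
        if not self.contains(key):
            raise ValueError("key is not in this subspace")
        return key[len(self.raw_prefix):]


users = ToyRawSubspace(b"users/")
smith_key = users.pack(b"Smith")
```

Here users.unpack(smith_key) recovers b"Smith", and a key written under a different prefix is reported as outside the subspace.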

Subspace.as_foundationdb_key()

Returns the key encoding the prefix used for the subspace, like Subspace.key(). This method serves to support the as_foundationdb_key() convenience interface.

Subspace.subspace(tuple)

Returns a new subspace which is equivalent to this subspace with its prefix tuple extended by the specified tuple.

x = subspace[item]

Shorthand for x = subspace.subspace((item,)). This function can be combined with the Subspace.as_foundationdb_key() convenience to turn this:

s = fdb.Subspace(('x',))
tr[s.pack(('foo', 'bar', 1))] = b''

into this:

s = fdb.Subspace(('x',))
tr[s['foo']['bar'][1]] = b''

Directories

The FoundationDB API provides directories as a tool for managing related subspaces. Directories are a recommended approach for administering applications. Each application should create or open at least one directory to manage its subspaces.

Note

For general guidance on directory usage, see the discussion in the Developer Guide.

Directories are identified by hierarchical paths analogous to the paths in a Unix-like file system. A path is represented as a tuple of strings. Each directory has an associated subspace used to store its content. The directory layer maps each path to a short prefix used for the corresponding subspace. In effect, directories provide a level of indirection for access to subspaces.
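
The level of indirection can be illustrated with an in-memory model. ToyDirectoryLayer is a hypothetical class, not fdb.DirectoryLayer: it keeps the path-to-prefix mapping in a dictionary and allocates prefixes from a simple counter, which is an assumption made for brevity. It shows that moving a directory changes its path but not its prefix.

```python
class ToyDirectoryLayer:
    """In-memory model of a mapping from path tuples to short prefixes."""

    def __init__(self):
        self._prefix_by_path = {}
        self._next_id = 0

    def _allocate(self):
        # Assumption: a counter stands in for real prefix allocation.
        prefix = self._next_id.to_bytes(2, "big")
        self._next_id += 1
        return prefix

    def create_or_open(self, path):
        path = tuple(path)
        # Create any missing parent directories along the way.
        for depth in range(1, len(path) + 1):
            partial = path[:depth]
            if partial not in self._prefix_by_path:
                self._prefix_by_path[partial] = self._allocate()
        return self._prefix_by_path[path]

    def move(self, old_path, new_path):
        old_path, new_path = tuple(old_path), tuple(new_path)
        if old_path not in self._prefix_by_path:
            raise KeyError("no directory at old_path")
        if new_path in self._prefix_by_path:
            raise KeyError("a directory already exists at new_path")
        if len(new_path) > 1 and new_path[:-1] not in self._prefix_by_path:
            raise KeyError("parent of new_path does not exist")
        # Re-key the directory and its subdirectories; prefixes are untouched.
        for path in list(self._prefix_by_path):
            if path[:len(old_path)] == old_path:
                moved = new_path + path[len(old_path):]
                self._prefix_by_path[moved] = self._prefix_by_path.pop(path)
        return self._prefix_by_path[new_path]


layer = ToyDirectoryLayer()
users_prefix = layer.create_or_open(("app", "users"))
layer.create_or_open(("archive",))
moved_prefix = layer.move(("app", "users"), ("archive", "users"))
```

Because data is stored under the short prefix rather than under the path, a move in this model only rewrites the mapping and leaves the stored keys where they are.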

Except where noted, directory methods interpret the provided path(s) relative to the path of the directory object. When opening a directory, a byte string layer option may be specified as a metadata identifier.

fdb.directory

The default instance of DirectoryLayer.

class fdb.DirectoryLayer(node_subspace=Subspace(rawPrefix='\xfe'), content_subspace=Subspace(), allow_manual_prefixes=False)

Each instance defines a new root directory. The subspaces node_subspace and content_subspace control where the directory metadata and contents, respectively, are stored. The default root directory has a node_subspace with raw prefix \xFE and a content_subspace with no prefix. Specifying more restrictive values for node_subspace and content_subspace will allow using the directory layer alongside other content in a database. If allow_manual_prefixes is false, attempts to create a directory with a manual prefix under the directory layer will raise an exception. The default root directory does not allow manual prefixes.

DirectoryLayer.create_or_open(tr, path, layer=None)

Opens the directory with path specified as a tuple of strings. path can also be a string, in which case it will be automatically wrapped in a tuple. All string values in a path will be converted to unicode. If the directory does not exist, it is created (creating parent directories if necessary).

If the byte string layer is specified and the directory is new, it is recorded as the layer; if layer is specified and the directory already exists, it is compared against the layer specified when the directory was created, and the method will raise an exception if they differ.

Returns the directory and its contents as a DirectorySubspace.

DirectoryLayer.open(tr, path, layer=None)

Opens the directory with path specified as a tuple of strings. path can also be a string, in which case it will be automatically wrapped in a tuple. All string values in a path will be converted to unicode. The method will raise an exception if the directory does not exist.

If the byte string layer is specified, it is compared against the layer specified when the directory was created, and the method will raise an exception if they differ.

Returns the directory and its contents as a DirectorySubspace.

DirectoryLayer.create(tr, path, layer=None, prefix=None)

Creates a directory with path specified as a tuple of strings. path can also be a string, in which case it will be automatically wrapped in a tuple. All string values in a path will be converted to unicode. Parent directories are created if necessary. The method will raise an exception if the given directory already exists.

If the byte string prefix is specified, the directory is created with the given physical prefix; otherwise a prefix is allocated automatically.

If the byte string layer is specified, it is recorded with the directory and will be checked by future calls to open.

Returns the directory and its contents as a DirectorySubspace.

DirectoryLayer.move(tr, old_path, new_path)

Moves the directory at old_path to new_path. There is no effect on the physical prefix of the given directory or on clients that already have the directory open. The method will raise an exception if a directory does not exist at old_path, a directory already exists at new_path, or the parent directory of new_path does not exist.

Returns the directory at its new location as a DirectorySubspace.

DirectoryLayer.remove(tr, path)

Removes the directory at path, its contents, and all subdirectories. The method will raise an exception if the directory does not exist.

Warning

Clients that have already opened the directory might still insert data into its contents after removal.

DirectoryLayer.remove_if_exists(tr, path)

Checks if the directory at path exists and, if so, removes the directory, its contents, and all subdirectories. Returns true if the directory existed and false otherwise.

Warning

Clients that have already opened the directory might still insert data into its contents after removal.

DirectoryLayer.list(tr, path=())

Returns a list of names of the immediate subdirectories of the directory at path. Each name is a unicode string representing the last component of a subdirectory’s path.

DirectoryLayer.exists(tr, path)

Returns true if the directory at path exists and false otherwise.

DirectoryLayer.get_layer()

Returns the layer specified when the directory was created.

DirectoryLayer.get_path()

Returns the path with which the directory was opened.

DirectorySubspace

A directory subspace represents a specific directory and its contents. It stores the path with which it was opened and supports all DirectoryLayer methods for operating on itself and its subdirectories. It also implements all Subspace methods for working with the contents of that directory.

DirectorySubspace.move_to(tr, new_path)

Moves this directory to new_path, interpreting new_path absolutely. There is no effect on the physical prefix of the given directory or on clients that already have the directory open. The method will raise an exception if a directory already exists at new_path or the parent directory of new_path does not exist.

Returns the directory at its new location as a DirectorySubspace.

Locality information

The FoundationDB API comes with a set of functions for discovering the storage locations of keys within your cluster. This information can be useful for advanced users who wish to take into account the location of keys in the design of applications or processes.

fdb.locality.get_boundary_keys(db_or_tr, begin, end)

Returns a generator of keys k such that begin <= k < end and k is located at the start of a contiguous range stored on a single server.

The first parameter to this function may be either a Database or a Transaction. If it is passed a Transaction, the transaction will not be committed, reset, or modified in any way, nor will its transaction options (such as retry limit) be applied within the function. However, if the database is unavailable prior to the function call, any timeout set on the transaction will still trigger.

Like a Future object, the returned container issues asynchronous read operations to fetch the data in the range and may block while iterating over its values if the read has not completed.

This method is not transactional. It will return an answer no older than the Transaction or Database object it is passed, but the returned boundaries are an estimate and may not represent the exact boundary locations at any database version.
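
One use of boundary keys is to divide a range into pieces that can be processed separately. The helper below is a hypothetical sketch, not part of fdb.locality. It assumes the boundary keys arrive in ascending order and works on any iterable of byte strings, so a plain list stands in for the generator here.

```python
def split_by_boundaries(begin, end, boundary_keys):
    """Split [begin, end) into consecutive sub-ranges at the boundary keys."""
    points = [begin]
    for k in boundary_keys:
        # Ignore keys outside the range and a boundary equal to begin,
        # which would otherwise produce an empty first sub-range.
        if begin < k < end and k != points[-1]:
            points.append(k)
    points.append(end)
    return list(zip(points[:-1], points[1:]))


# Stand-in for the keys yielded by a boundary-key lookup.
boundaries = [b"a", b"g", b"p"]
chunks = split_by_boundaries(b"a", b"z", boundaries)
```

Since the returned boundaries are an estimate, sub-ranges built this way are best treated as a hint for dividing work rather than as an exact description of data placement.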

fdb.locality.get_addresses_for_key(tr, key)

Returns a fdb.FutureStringArray. You must call the fdb.Future.wait() method on this object to retrieve a list of public network addresses as strings, one for each of the storage servers responsible for storing key and its associated value.

Tenant management

The FoundationDB API includes functions to manage the set of tenants in a cluster.

fdb.tenant_management.create_tenant(db_or_tr, tenant_name)

Creates a new tenant in the cluster.

The tenant name can be either a byte string or a tuple and cannot start with the \xff byte. If a tuple is provided, the tuple will be packed using the tuple layer to generate the byte string tenant name.

If a database is provided to this function for the db_or_tr parameter, then this function will first check if the tenant already exists. If it does, it will fail with a tenant_already_exists error. Otherwise, it will create a transaction and attempt to create the tenant in a retry loop. If the tenant is created concurrently by another transaction, this function may still return successfully.

If a transaction is provided to this function for the db_or_tr parameter, then this function will not check if the tenant already exists. It is up to the user to perform that check if required. The user must also successfully commit the transaction in order for the creation to take effect.

fdb.tenant_management.delete_tenant(db_or_tr, tenant_name)

Deletes a tenant from the cluster.

The tenant name can be either a byte string or a tuple. If a tuple is provided, the tuple will be packed using the tuple layer to generate the byte string tenant name.

It is an error to delete a tenant that still has data. To delete a non-empty tenant, first clear all of the keys in the tenant.

If a database is provided to this function for the db_or_tr parameter, then this function will first check if the tenant already exists. If it does not, it will fail with a tenant_not_found error. Otherwise, it will create a transaction and attempt to delete the tenant in a retry loop. If the tenant is deleted concurrently by another transaction, this function may still return successfully.

If a transaction is provided to this function for the db_or_tr parameter, then this function will not check if the tenant already exists. It is up to the user to perform that check if required. The user must also successfully commit the transaction in order for the deletion to take effect.