Python API
Installation
The FoundationDB Python API is compatible with Python 2.7 through 3.7. You will need a Python version in this range on your system before the FoundationDB Python API can be installed. Note also that Python 3.7 no longer bundles a full copy of libffi, which is used to build the _ctypes module on non-macOS UNIX platforms. If you are using Python 3.7, make sure libffi is already installed on your system.
On macOS, the FoundationDB Python API is installed as part of the FoundationDB installation (see Installing FoundationDB client binaries). On Ubuntu or RHEL/CentOS, you will need to install the FoundationDB Python API manually via Python's package manager pip:
user@host$ pip install foundationdb
You can also download the FoundationDB Python API source directly from Downloads.
Note
The Python language binding is compatible with FoundationDB client binaries of version 2.0 or higher. When used with version 2.0.x client binaries, the API version must be set to 200 or lower.
After installation, the module fdb should be usable from your Python installation or path. (The system default python is always used by the client installer on macOS.)
API versioning
When you import the fdb module, it exposes only one useful symbol:
- fdb.api_version(version)
Specifies the version of the API that the application uses. This allows future versions of FoundationDB to make API changes without breaking existing programs. The current version of the API is 730.
Note
You must call fdb.api_version(...) before using any other part of the API. Once you have done so, the rest of the API will become available in the fdb module. This requirement includes use of the @fdb.transactional decorator, which is called when your module is imported.
Note
FoundationDB encapsulates multiple versions of its interface by requiring the client to explicitly specify the version of the API it uses. The purpose of this design is to allow you to upgrade the server, client libraries, or bindings without having to modify client code. The client libraries support all previous versions of the API. The API version specified by the client is used to control the behavior of the binding. You can therefore upgrade to more recent packages (and thus receive various improvements) without having to change your code.
Warning
When using the multi-version client API, setting an API version that is not supported by a particular client library will prevent that client from being used to connect to the cluster. In particular, you should not advance the API version of your application after upgrading your client until the cluster has also been upgraded.
For API changes between version 13 and 730 (for the purpose of porting older programs), see Release Notes and API Version Upgrade Guide.
Opening a database
After importing the fdb module and selecting an API version, you probably want to open a Database using open():
import fdb
fdb.api_version(730)
db = fdb.open()
- fdb.open(cluster_file=None, event_model=None)
Connects to the cluster specified by the cluster file. This function is often called without any parameters, using only the defaults. If no cluster file is passed, FoundationDB automatically determines a cluster file with which to connect to a cluster.
A single client can use this function multiple times to connect to different clusters simultaneously, with each invocation requiring its own cluster file. To connect to multiple clusters running at different, incompatible versions, the multi-version client API must be used.
- fdb.options
A singleton providing options which affect the entire FoundationDB client. Note that network options can also be set using environment variables.
Note
It is an error to set these options after the first call to fdb.open() anywhere in your application.
- fdb.options.set_knob(knob)
Sets internal tuning or debugging knobs. The argument to this function should be a string representing the knob name and the value, e.g. “transaction_size_limit=1000”.
- fdb.options.set_trace_enable(output_directory=None)
Enables trace file generation on this FoundationDB client. Trace files will be generated in the specified output directory. If the directory is specified as None, the output directory will be the current working directory.
Warning
The specified output directory must be unique to this client. In the present release, trace logging does not allow two clients to share a directory.
- fdb.options.set_trace_max_logs_size(bytes)
Sets the maximum size in bytes for the sum of this FoundationDB client’s trace output files in a single log directory.
- fdb.options.set_trace_roll_size(bytes)
Sets the maximum size in bytes of a single trace output file for this FoundationDB client.
- fdb.options.set_trace_format(format)
Select the format of the trace files for this FoundationDB client. xml (the default) and json are supported.
- fdb.options.set_trace_clock_source(source)
Select clock source for trace files. now (the default) or realtime are supported.
- fdb.options.set_disable_multi_version_client_api()
Disables the multi-version client API and instead uses the local client directly. Must be set before setting up the network.
- fdb.options.set_callbacks_on_external_threads()
If set, callbacks from external client libraries can be called from threads created by the FoundationDB client library. Otherwise, callbacks will be called from either the thread used to add the callback or the network thread. Setting this option can improve performance when connected using an external client, but may not be safe to use in all environments. Must be set before setting up the network. WARNING: This feature is considered experimental at this time.
- fdb.options.set_external_client_library(path_to_lib)
Adds an external client library for use by the multi-version client API. Must be set before setting up the network.
- fdb.options.set_external_client_directory(path_to_lib_directory)
Searches the specified path for dynamic libraries and adds them to the list of client libraries for use by the multi-version client API. Must be set before setting up the network.
Note
The following options are only used when connecting to a TLS-enabled cluster.
- fdb.options.set_tls_plugin(plugin_path_or_name)
Sets the TLS plugin to load. This option, if used, must be set before any other TLS options.
- fdb.options.set_tls_cert_path(path_to_file)
Sets the path for the file from which the certificate chain will be loaded.
- fdb.options.set_tls_key_path(path_to_file)
Sets the path for the file from which to load the private key corresponding to your own certificate.
- fdb.options.set_tls_verify_peers(criteria)
- fdb.options.set_tls_cert_bytes(bytes)
Sets the certificate chain.
- fdb.options.set_tls_key_bytes(bytes)
Set the private key corresponding to your own certificate.
- fdb.options.set_tls_ca_bytes(ca_bundle)
Sets the certificate authority bundle.
- fdb.options.set_tls_ca_path(path)
Sets the file from which to load the certificate authority bundle.
- fdb.options.set_tls_password(password)
Sets the passphrase for an encrypted private key. The password must be set before the key is set in order to be used.
- fdb.options.set_disable_local_client()
Prevents connections through the local client, allowing only connections through externally loaded client libraries.
- fdb.options.set_client_threads_per_version(number)
Spawns multiple worker threads for each version of the client that is loaded. Setting this to a number greater than one implies disable_local_client.
- fdb.options.set_disable_client_statistics_logging()
Disables logging of client statistics, such as sampled transaction activity.
- fdb.options.set_enable_run_loop_profiling()
Enables debugging feature to perform run loop profiling. Requires trace logging to be enabled. WARNING: this feature is not recommended for use in production.
- fdb.options.set_distributed_client_tracer(tracer_type)
Sets a tracer to run on the client. Should be set to the same value as the tracer set on the server.
Please refer to fdboptions.py (generated) for a comprehensive list of options.
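For example, a minimal sketch of configuring network options before the first open; the trace directory path here is a placeholder and must already exist and be unique to this client:
import fdb
fdb.api_version(730)
# Network options must be set before the first call to fdb.open().
fdb.options.set_trace_enable('/var/log/my_fdb_client')  # placeholder path
fdb.options.set_trace_format('json')
fdb.options.set_trace_roll_size(10 * 1024 * 1024)       # 10 MB per trace file
db = fdb.open()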
Keys and values
Keys and values in FoundationDB are simple byte strings. In Python 2, a byte string is a string of type str. In Python 3, a byte string has type bytes.
To encode other data types, see Encoding data types and the tuple layer.
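For example, a small sketch using the tuple layer to encode a mixed-type key (the specific values are illustrative):
import fdb
fdb.api_version(730)
# Pack a tuple into an order-preserving byte string key.
key = fdb.tuple.pack((u'temperature', 2024, 7))
# Unpack recovers the original tuple.
assert fdb.tuple.unpack(key) == (u'temperature', 2024, 7)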
as_foundationdb_key and as_foundationdb_value
In some cases, you may have objects that are used to represent specific keys or values (for example, see Subspace). As a convenience, the language binding API can work seamlessly with such objects if they implement the as_foundationdb_key() or as_foundationdb_value() methods, respectively. API methods that accept a key will alternately accept an object that implements the as_foundationdb_key() method. Likewise, API methods accepting a value will also accept an object that implements the as_foundationdb_value() method.
Warning
as_foundationdb_key() and as_foundationdb_value() are not intended to implement serialization protocols for object storage. Use these functions only when your object represents a specific key or value.
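As an illustrative sketch, a hypothetical wrapper class whose instances can be passed anywhere the API expects a key (UserKey and get_user are inventions for this example):
import fdb
fdb.api_version(730)
db = fdb.open()
class UserKey(object):
    """Hypothetical object representing the key of one user record."""
    def __init__(self, user_id):
        self.user_id = user_id
    def as_foundationdb_key(self):
        # The API calls this to obtain the actual byte string key.
        return fdb.tuple.pack((u'user', self.user_id))
@fdb.transactional
def get_user(tr, user_id):
    # tr[...] accepts any object implementing as_foundationdb_key().
    return tr[UserKey(user_id)]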
KeyValue objects
- class fdb.KeyValue
Represents a single key-value pair in the database. This is a simple value type; mutating it won't affect your Transaction or Database.
KeyValue supports the Python iterator protocol so that you can unpack a key and value directly into two variables:
for key, value in tr[begin:end]:
    pass
Attributes
- KeyValue.key
- KeyValue.value
Key selectors
FoundationDB’s lexicographically ordered data model permits finding keys based on their order (for example, finding the first key in the database greater than a given key). Key selectors represent a description of a key in the database that could be resolved to an actual key by Transaction.get_key() or used directly as the beginning or end of a range in Transaction.get_range().
For more about how key selectors work, see Key selectors.
- class fdb.KeySelector(key, or_equal, offset)
Creates a key selector with the given reference key, equality flag, and offset. It is usually more convenient to obtain a key selector with one of the following methods:
- classmethod KeySelector.last_less_than(key)
Returns a key selector referencing the last (greatest) key in the database less than the specified key.
- classmethod KeySelector.last_less_or_equal(key)
Returns a key selector referencing the last (greatest) key less than, or equal to, the specified key.
- classmethod KeySelector.first_greater_than(key)
Returns a key selector referencing the first (least) key greater than the specified key.
- classmethod KeySelector.first_greater_or_equal(key)
Returns a key selector referencing the first key greater than, or equal to, the specified key.
KeySelector + offset
Adding an integer offset to a KeySelector returns a key selector referencing a key offset keys after the original KeySelector. FoundationDB does not efficiently resolve key selectors with large offsets, so key selectors with large offsets are slow.
KeySelector - offset
Subtracting an integer offset from a KeySelector returns a key selector referencing a key offset keys before the original KeySelector. FoundationDB does not efficiently resolve key selectors with large offsets, so key selectors with large offsets are slow.
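For example, a brief sketch resolving selectors to concrete keys (it uses the @fdb.transactional decorator described later in this document):
@fdb.transactional
def neighbors(tr, key):
    # Resolve selectors to actual keys in the database.
    before = tr.get_key(fdb.KeySelector.last_less_than(key))
    after = tr.get_key(fdb.KeySelector.first_greater_than(key))
    return before, after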
Database objects
- class fdb.Database
A Database represents a FoundationDB database: a mutable, lexicographically ordered mapping from binary keys to binary values. Although Database provides convenience methods for reading and writing, modifications to a database are usually made via transactions, which are usually created and committed automatically by the @fdb.transactional decorator.
Note
The convenience methods provided by Database have the same signature as the corresponding methods of Transaction. However, most of the Database methods are fully synchronous. (An exception is the methods for watches.) As a result, the Database methods do not support the use of implicit parallelism with futures.
- Database.create_transaction()
Returns a new Transaction object. Consider using the @fdb.transactional decorator to create transactions instead, since it will automatically provide you with appropriate retry behavior.
- Database.open_tenant(tenant_name)
Opens an existing tenant to be used for running transactions and returns it as a Tenant object.
The tenant name can be either a byte string or a tuple. If a tuple is provided, the tuple will be packed using the tuple layer to generate the byte string tenant name.
Note
Opening a tenant does not check its existence in the cluster. If the tenant does not exist, attempts to read or write data with it will fail.
- Database.get(key)
Returns the value associated with the specified key in the database (or None if the key does not exist). This read is fully synchronous.
X = db[key]
Shorthand for X = db.get(key).
- Database.get_key(key_selector)
Returns the key referenced by the specified KeySelector. This read is fully synchronous.
The key is cached, providing a potential performance benefit. However, the value of the key is also retrieved, using network bandwidth.
- Database.get_range(begin, end[, limit, reverse, streaming_mode])
Returns all keys k such that begin <= k < end and their associated values as a list of KeyValue objects. Note the exclusion of end from the range. This read is fully synchronous. (A usage sketch appears at the end of this method list.)
Each of begin and end may be a key or a KeySelector. Note that in the case of a KeySelector, the exclusion of end from the range still applies.
If limit is specified, then only the first limit keys (and their values) in the range will be returned.
If reverse is True, then the last limit keys in the range will be returned in reverse order. Reading ranges in reverse is supported natively by the database and should have minimal extra cost.
If streaming_mode is specified, it must be a value from the StreamingMode enumeration. It provides a hint to FoundationDB about how to retrieve the specified range. This option should generally not be specified, allowing FoundationDB to retrieve the full range very efficiently.
X = db[begin:end]
Shorthand for X = db.get_range(begin, end). The default slice begin is ''; the default slice end is '\xFF'.
X = db[begin:end:-1]
Shorthand for X = db.get_range(begin, end, reverse=True). The default slice begin is ''; the default slice end is '\xFF'.
- Database.get_range_startswith(prefix[, limit, reverse, streaming_mode])
Returns all keys k such that k.startswith(prefix), and their associated values, as a list of KeyValue objects. The limit, reverse and streaming_mode parameters have the same meanings as in Database.get_range().
- Database.set(key, value)
Associates the given key and value. Overwrites any prior value associated with key. This change will be committed immediately, and is fully synchronous.
db[key] = value
Shorthand for db.set(key, value).
- Database.clear(key)
Removes the specified key (and any associated value), if it exists. This change will be committed immediately, and is fully synchronous.
del db[key]
Shorthand for db.clear(key).
- Database.clear_range(begin, end)
Removes all keys k such that begin <= k < end, and their associated values. This change will be committed immediately, and is fully synchronous.
del db[begin:end]
Shorthand for db.clear_range(begin, end). The default slice begin is ''; the default slice end is '\xFF'.
- Database.clear_range_startswith(prefix)
Removes all keys k such that k.startswith(prefix), and their associated values. This change will be committed immediately, and is fully synchronous.
- Database.get_and_watch(key)
Returns a tuple value, watch, where value is the value associated with key or None if the key does not exist, and watch is a FutureVoid that will become ready after value changes.
See Transaction.watch() for a general description of watches and their limitations.
- Database.set_and_watch(key, value)
Sets key to value and returns a FutureVoid that will become ready after a subsequent change to value.
See Transaction.watch() for a general description of watches and their limitations.
- Database.clear_and_watch(key)
Removes key (and any associated value) if it exists and returns a FutureVoid that will become ready after the value is subsequently set.
See Transaction.watch() for a general description of watches and their limitations.
- Database.add(key, param)
- Database.bit_and(key, param)
- Database.bit_or(key, param)
- Database.bit_xor(key, param)
These atomic operations behave exactly like the associated operations on Transaction objects, except that the change will immediately be committed, and is fully synchronous.
Note
Since some atomic operations are not idempotent, the implicit use of the @fdb.transactional decorator could interact with a commit_unknown_result exception in unpredictable ways. For more information, see Transactions with unknown results.
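Tying these Database methods together, a brief sketch (key names are illustrative); each statement is its own fully synchronous, committed change:
import fdb
fdb.api_version(730)
db = fdb.open()
db[b'colors/1'] = b'red'
db[b'colors/2'] = b'green'
for kv in db.get_range_startswith(b'colors/'):
    print(kv.key, kv.value)
db.clear_range_startswith(b'colors/')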
Database options
Database options alter the behavior of FoundationDB databases.
- Database.options.set_location_cache_size(size)
Set the size of the client location cache. Raising this value can boost performance in very large databases where clients access data in a near-random pattern. This value must be an integer in the range [0, 2^31 - 1]. Defaults to 100000.
- Database.options.set_max_watches(max_watches)
Set the maximum number of watches allowed to be outstanding on a database connection. Increasing this number could result in increased resource usage. Reducing this number will not cancel any outstanding watches. Defaults to 10000 and cannot be larger than 1000000.
- Database.options.set_machine_id(id)
Specify the machine ID of a server to be preferentially used for database operations. ID must be a string of up to 16 hexadecimal digits that was used to configure fdbserver processes. Load balancing uses this option for location-awareness, attempting to send database operations first to servers on a specified machine, then a specified datacenter, then returning to its default algorithm.
- Database.options.set_datacenter_id(id)
Specify the datacenter ID to be preferentially used for database operations. ID must be a string of up to 16 hexadecimal digits that was used to configure fdbserver processes. Load balancing uses this option for location-awareness, attempting to send database operations first to servers on a specified machine, then a specified datacenter, then returning to its default algorithm.
- Database.options.set_transaction_timeout(timeout)
Set the default timeout duration in milliseconds after which all transactions created by this database will automatically be cancelled. This is equivalent to calling Transaction.options.set_timeout() on each transaction created by this database. This option can only be called if the API version is at least 610. (A short example of setting database-level defaults appears at the end of this option list.)
- Database.options.set_transaction_retry_limit(retry_limit)
Set the default maximum number of retries for each transaction, after which additional calls to Transaction.on_error() will throw the most recently seen error code. This is equivalent to calling Transaction.options.set_retry_limit() on each transaction created by this database.
- Database.options.set_transaction_max_retry_delay(delay_limit)
Set the default maximum backoff delay incurred by each transaction in the call to Transaction.on_error() if the error is retryable. This is equivalent to calling Transaction.options.set_max_retry_delay() on each transaction created by this database.
- Database.options.set_transaction_size_limit(size_limit)
Set the default maximum transaction size in bytes. This is equivalent to calling Transaction.options.set_size_limit() on each transaction created by this database.
- Database.options.set_transaction_causal_read_risky()
Transactions do not require the strict causal consistency guarantee that FoundationDB provides by default. The read version will be committed, and will usually be the latest committed, but might not be the latest committed in the event of a simultaneous fault and misbehaving clock. Enabling this option is equivalent to calling Transaction.options.set_causal_read_risky() on each transaction created by this database.
- Database.options.set_transaction_logging_max_field_length(size_limit)
Sets the maximum escaped length of key and value fields to be logged to the trace file via the LOG_TRANSACTION option. This is equivalent to calling Transaction.options.set_transaction_logging_max_field_length() on each transaction created by this database.
- Database.options.set_snapshot_ryw_enable()
If this option has been set at least as many times on this database as the disable option, snapshot reads will see the effects of prior writes in the same transaction. Enabling this option is equivalent to calling Transaction.options.set_snapshot_ryw_enable() on each transaction created by this database.
- Database.options.set_snapshot_ryw_disable()
If this option has been set more times on this database than the enable option, snapshot reads will not see the effects of prior writes in the same transaction. Disabling this option is equivalent to calling Transaction.options.set_snapshot_ryw_disable() on each transaction created by this database.
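For example, a hedged sketch of applying transaction defaults at the database level (the values are illustrative, not recommendations):
import fdb
fdb.api_version(730)
db = fdb.open()
# Every transaction created from this handle inherits these defaults.
db.options.set_transaction_timeout(5000)       # milliseconds
db.options.set_transaction_retry_limit(10)
db.options.set_transaction_size_limit(100000)  # bytes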
Tenant objects
- class fdb.Tenant
A Tenant represents a FoundationDB tenant. Tenants are optional named transaction domains that can be used to provide multiple disjoint key-spaces to client applications. A transaction created in a tenant will be limited to the keys contained within that tenant, and transactions operating on different tenants can use the same key names without interfering with each other.
- Tenant.create_transaction()
Returns a new Transaction object. Consider using the @fdb.transactional decorator to create transactions instead, since it will automatically provide you with appropriate retry behavior.
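A minimal sketch, assuming a cluster on which the tenant b'accounting' has already been created; a transactionally decorated function (described in the next section) accepts the tenant in place of a database:
import fdb
fdb.api_version(730)
db = fdb.open()
tenant = db.open_tenant(b'accounting')  # assumes this tenant already exists
@fdb.transactional
def set_row(tr, key, value):
    tr[key] = value
# Keys written through the tenant live in its own disjoint key-space.
set_row(tenant, b'invoice/1', b'42')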
Transactional decoration
- @fdb.transactional
The @fdb.transactional decorator is a convenience designed to concisely wrap a function with logic to automatically create a transaction and retry until success.
For example:
@fdb.transactional
def simple_function(tr, x, y):
    tr[b'foo'] = x
    tr[b'bar'] = y
The @fdb.transactional decorator makes simple_function a transactional function. All functions using this decorator must have an argument named tr. This specially named argument is passed a transaction that the function can use to do reads and writes.
A caller of a transactionally decorated function can pass a Database or Tenant instead of a transaction for the tr parameter. Then a transaction will be created automatically, and automatically committed before returning to the caller. The decorator will retry calling the decorated function until the transaction successfully commits.
If db is a Database or Tenant, a call like simple_function(db, 'a', 'b') is equivalent to something like
tr = db.create_transaction()
while True:
    try:
        simple_function(tr, 'a', 'b')
        tr.commit().wait()
        break
    except fdb.FDBError as e:
        tr.on_error(e).wait()
A caller may alternatively pass an actual transaction to the tr parameter. In this case, the transactional function will not attempt to commit the transaction or to retry errors, since that is the responsibility of the caller who owns the transaction. This design allows transactionally decorated functions to be composed freely into larger transactions.
Note
In some failure scenarios, it is possible that your transaction will be executed twice. See Transactions with unknown results for more information.
Transaction objects
- class fdb.Transaction
A Transaction object represents a FoundationDB database transaction. All operations on FoundationDB take place, explicitly or implicitly, through a Transaction.
In FoundationDB, a transaction is a mutable snapshot of a database. All read and write operations on a transaction see and modify an otherwise-unchanging version of the database and only change the underlying database if and when the transaction is committed. Read operations do see the effects of previous write operations on the same transaction. Committing a transaction usually succeeds in the absence of conflicts.
Transactions group operations into a unit with the properties of atomicity, isolation, and durability. Transactions also provide the ability to maintain an application’s invariants or integrity constraints, supporting the property of consistency. Together these properties are known as ACID.
Transactions are also causally consistent: once a transaction has been successfully committed, all subsequently created transactions will see the modifications made by it.
The most convenient way to use transactions is with the @fdb.transactional decorator.
Keys and values in FoundationDB are byte strings (str in Python 2.x, bytes in 3.x). To encode other data types, see the fdb.tuple module and Encoding data types.
Attributes
- Transaction.db
The Database that this transaction is interacting with.
Reading data
- Transaction.get(key)
Returns a (future) Value associated with the specified key in the database.
To check whether the specified key was present in the database, call Value.present() on the return value.
X = tr[key]
Shorthand for X = tr.get(key).
- Transaction.get_key(key_selector)
Returns the (future) Key referenced by the specified KeySelector.
By default, the key is cached for the duration of the transaction, providing a potential performance benefit. However, the value of the key is also retrieved, using network bandwidth. Invoking Transaction.options.set_read_your_writes_disable() will avoid both the caching and the increased network bandwidth.
- Transaction.get_range(begin, end[, limit, reverse, streaming_mode])
Returns all keys k such that begin <= k < end and their associated values as an iterator yielding KeyValue objects. Note the exclusion of end from the range. (A combined reading example appears at the end of this section.)
Like a Future object, the returned iterator issues asynchronous read operations. It fetches the data in one or more efficient batches (depending on the value of the streaming_mode parameter). However, the iterator will block if iteration reaches a value whose read has not yet completed.
Each of begin and end may be a key or a KeySelector. Note that in the case of a KeySelector, the exclusion of end from the range still applies.
If limit is specified, then only the first limit keys (and their values) in the range will be returned.
If reverse is True, then the last limit keys in the range will be returned in reverse order. Reading ranges in reverse is supported natively by the database and should have minimal extra cost.
If streaming_mode is specified, it must be a value from the StreamingMode enumeration. It provides a hint to FoundationDB about how the returned container is likely to be used. The default is StreamingMode.iterator.
X = tr[begin:end]
Shorthand for X = tr.get_range(begin, end). The default slice begin is ''; the default slice end is '\xFF'.
X = tr[begin:end:-1]
Shorthand for X = tr.get_range(begin, end, reverse=True). The default slice begin is ''; the default slice end is '\xFF'.
- Transaction.get_range_startswith(prefix[, limit, reverse, streaming_mode])
Returns all keys k such that k.startswith(prefix), and their associated values, as a container of KeyValue objects (see Transaction.get_range() for a description of the returned container).
The limit, reverse and streaming_mode parameters have the same meanings as in Transaction.get_range().
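Putting these reads together, a small sketch; it assumes (per the Future documentation below) that wait() on a Value returns the underlying bytes:
@fdb.transactional
def describe(tr, key, prefix):
    value = tr[key]  # returns a future Value immediately
    if not value.present():
        return None
    # Iterating the range blocks only when a batch is not yet available.
    count = sum(1 for _ in tr.get_range_startswith(prefix))
    return (value.wait(), count)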
Snapshot reads
- Transaction.snapshot
Snapshot reads selectively relax FoundationDB’s isolation property, reducing conflicts but making it harder to reason about concurrency.
By default, FoundationDB transactions guarantee strictly serializable isolation, resulting in a state that is as if transactions were executed one at a time, even if they were executed concurrently. Serializability has little performance cost when there are few conflicts but can be expensive when there are many. FoundationDB therefore also permits individual reads within a transaction to be done as snapshot reads.
Snapshot reads differ from ordinary (strictly serializable) reads by permitting the values they read to be modified by concurrent transactions, whereas strictly serializable reads cause conflicts in that case. Like strictly serializable reads, snapshot reads see the effects of prior writes in the same transaction. For more information on the use of snapshot reads, see Snapshot reads.
Snapshot reads also interact with transaction commit a little differently than normal reads. If a snapshot read is outstanding when transaction commit is called, that read will immediately return an error. (Normally, transaction commit will wait until outstanding reads return before committing.)
- Transaction.snapshot.db
The Database that this transaction is interacting with.
- Transaction.snapshot.get(key)
Like Transaction.get(), but as a snapshot read.
X = tr.snapshot[key]
Shorthand for X = tr.snapshot.get(key).
- Transaction.snapshot.get_key(key_selector)
Like Transaction.get_key(), but as a snapshot read.
- Transaction.snapshot.get_range(begin, end[, limit, reverse, streaming_mode])
Like Transaction.get_range(), but as a snapshot read.
X = tr.snapshot[begin:end]
Shorthand for X = tr.snapshot.get_range(begin, end). The default slice begin is ''; the default slice end is '\xFF'.
X = tr.snapshot[begin:end:-1]
Shorthand for X = tr.snapshot.get_range(begin, end, reverse=True). The default slice begin is ''; the default slice end is '\xFF'.
- Transaction.snapshot.get_range_startswith(prefix[, limit, reverse, streaming_mode])
Like Transaction.get_range_startswith(), but as a snapshot read.
- Transaction.snapshot.get_read_version()
Identical to Transaction.get_read_version() (since snapshot and strictly serializable reads use the same read version).
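For example, a sketch that counts keys in a range without creating read conflicts on that range:
@fdb.transactional
def count_snapshot(tr, begin, end):
    # Snapshot reads do not add read conflict ranges, so concurrent
    # writers to [begin, end) will not force this transaction to retry.
    return sum(1 for _ in tr.snapshot[begin:end])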
Writing data
- Transaction.set(key, value)
Associates the given key and value. Overwrites any prior value associated with key. Returns immediately, having modified the snapshot represented by this Transaction.
tr[key] = value
Shorthand for tr.set(key, value).
- Transaction.clear(key)
Removes the specified key (and any associated value), if it exists. Returns immediately, having modified the snapshot represented by this Transaction.
del tr[key]
Shorthand for tr.clear(key).
- Transaction.clear_range(begin, end)
Removes all keys k such that begin <= k < end, and their associated values. Returns immediately, having modified the snapshot represented by this Transaction.
Range clears are efficient with FoundationDB: clearing large amounts of data will be fast. However, this will not immediately free up disk space; data for the deleted range is cleaned up in the background. For purposes of computing the transaction size, only the begin and end keys of a clear range are counted. The size of the data stored in the range does not count against the transaction size limit.
Note
Unlike in the case of get_range(), begin and end must be keys (byte strings), not KeySelectors. (Resolving arbitrary key selectors would prevent this method from returning immediately, introducing concurrency issues.)
del tr[begin:end]
Shorthand for tr.clear_range(begin, end). The default slice begin is ''; the default slice end is '\xFF'.
- Transaction.clear_range_startswith(prefix)
Removes all keys k such that k.startswith(prefix), and their associated values. Returns immediately, having modified the snapshot represented by this Transaction.
Range clears are efficient with FoundationDB: clearing large amounts of data will be fast. However, this will not immediately free up disk space; data for the deleted range is cleaned up in the background. For purposes of computing the transaction size, only the begin and end keys of a clear range are counted. The size of the data stored in the range does not count against the transaction size limit.
Atomic operations
An atomic operation is a single database command that carries out several logical steps: reading the value of a key, performing a transformation on that value, and writing the result. Different atomic operations perform different transformations. Like other database operations, an atomic operation is used within a transaction; however, its use within a transaction will not cause the transaction to conflict.
Atomic operations do not expose the current value of the key to the client but simply send the database the transformation to apply. In regard to conflict checking, an atomic operation is equivalent to a write without a read. It can only cause other transactions performing reads of the key to conflict.
By combining these logical steps into a single, read-free operation, FoundationDB can guarantee that the transaction will not conflict due to the operation. This makes atomic operations ideal for operating on keys that are frequently modified. A common example is the use of a key-value pair as a counter.
Warning
If a transaction uses both an atomic operation and a strictly serializable read on the same key, the benefits of using the atomic operation (for both conflict checking and performance) are lost.
In each of the methods below, param should be a byte string appropriately packed to represent the desired value. For example:
# wrong
tr.add(b'key', 1)
# right
import struct
tr.add(b'key', struct.pack('<q', 1))
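Building on this packing convention, a minimal counter sketch; it assumes (per the Future documentation below) that wait() on a Value returns the raw bytes, or None if the key is absent:
import struct
@fdb.transactional
def increment(tr, counter_key):
    # Conflict-free: the client sends the delta without reading the value.
    tr.add(counter_key, struct.pack('<q', 1))
@fdb.transactional
def read_counter(tr, counter_key):
    raw = tr[counter_key].wait()
    return struct.unpack('<q', raw)[0] if raw is not None else 0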
- Transaction.add(key, param)
Performs an addition of little-endian integers. If the existing value in the database is not present or shorter than param, it is first extended to the length of param with zero bytes. If param is shorter than the existing value in the database, the existing value is truncated to match the length of param. In case of overflow, the result is truncated to the width of param.
The integers to be added must be stored in a little-endian representation. They can be signed in two's complement representation or unsigned. You can add to an integer at a known offset in the value by prepending the appropriate number of zero bytes to param and padding with zero bytes to match the length of the value. However, this offset technique requires that you know the addition will not cause the integer field within the value to overflow.
- Transaction.bit_and(key, param)
Performs a bitwise “and” operation. If the existing value in the database is not present, then param is stored in the database. If the existing value in the database is shorter than param, it is first extended to the length of param with zero bytes. If param is shorter than the existing value in the database, the existing value is truncated to match the length of param.
- Transaction.bit_or(key, param)
Performs a bitwise “or” operation. If the existing value in the database is not present or shorter than param, it is first extended to the length of param with zero bytes. If param is shorter than the existing value in the database, the existing value is truncated to match the length of param.
- Transaction.bit_xor(key, param)
Performs a bitwise “xor” operation. If the existing value in the database is not present or shorter than param, it is first extended to the length of param with zero bytes. If param is shorter than the existing value in the database, the existing value is truncated to match the length of param.
- Transaction.compare_and_clear(key, param)
Performs an atomic compare-and-clear operation. If the existing value in the database is equal to the given value, then the given key is cleared.
- Transaction.max(key, param)
Sets the value in the database to the larger of the existing value and param. If the existing value in the database is not present or shorter than param, it is first extended to the length of param with zero bytes. If param is shorter than the existing value in the database, the existing value is truncated to match the length of param.
Both the existing value and param are treated as unsigned integers. (This differs from the behavior of atomic addition.)
- Transaction.byte_max(key, param)
Performs lexicographic comparison of byte strings. If the existing value in the database is not present, then param is stored. Otherwise, the larger of the two values is stored in the database.
- Transaction.min(key, param)
Sets the value in the database to the smaller of the existing value and param. If the existing value in the database is not present, then param is stored in the database. If the existing value in the database is shorter than param, it is first extended to the length of param with zero bytes. If param is shorter than the existing value in the database, the existing value is truncated to match the length of param.
Both the existing value and param are treated as unsigned integers. (This differs from the behavior of atomic addition.)
- Transaction.byte_min(key, param)
Performs lexicographic comparison of byte strings. If the existing value in the database is not present, then param is stored. Otherwise, the smaller of the two values is stored in the database.
- Transaction.set_versionstamped_key(key, param)
Transforms key using a versionstamp for the transaction. The key must be at least 14 bytes long. The final 4 bytes will be interpreted as a 32-bit little-endian integer denoting an index into the key at which to perform the transformation, and then trimmed off the key. The 10 bytes in the key beginning at the index will be overwritten with the versionstamp. If the index plus 10 bytes points past the end of the key, the result will be an error. Sets the transformed key in the database to param. (A sketch of this encoding appears after this list of operations.)
A versionstamp is a 10-byte, unique, monotonically (but not sequentially) increasing value for each committed transaction. The first 8 bytes are the committed version of the database (serialized in big-endian order). The last 2 bytes are monotonic in the serialization order for transactions (serialized in big-endian order).
A transaction is not permitted to read any transformed key or value previously set within that transaction, and an attempt to do so will result in an accessed_unreadable error. The range of keys marked unreadable when setting a versionstamped key begins at the transaction's read version if it is known, otherwise a versionstamp of all 0x00 bytes is conservatively assumed. The upper bound of the unreadable range is a versionstamp of all 0xFF bytes.
Warning
At this time, versionstamped keys are not compatible with the Tuple layer except in Java, Python, and Go. Note that this implies versionstamped keys may not be used with the Subspace and Directory layers except in those languages.
- Transaction.set_versionstamped_value(key, param)
Transforms param using a versionstamp for the transaction. The parameter must be at least 14 bytes long. The final 4 bytes will be interpreted as a 32-bit little-endian integer denoting an index into the parameter at which to perform the transformation, and then trimmed off the parameter. The 10 bytes in the parameter beginning at the index will be overwritten with the versionstamp. If the index plus 10 bytes points past the end of the parameter, the result will be an error. Sets key in the database to the transformed parameter.
A versionstamp is a 10-byte, unique, monotonically (but not sequentially) increasing value for each committed transaction. The first 8 bytes are the committed version of the database (serialized in big-endian order). The last 2 bytes are monotonic in the serialization order for transactions (serialized in big-endian order).
A transaction is not permitted to read any transformed key or value previously set within that transaction, and an attempt to do so will result in an accessed_unreadable error. The range of keys marked unreadable when setting a versionstamped value begins at the transaction's read version if it is known, otherwise a versionstamp of all 0x00 bytes is conservatively assumed. The upper bound of the unreadable range is a versionstamp of all 0xFF bytes.
Warning
At this time, versionstamped values are not compatible with the Tuple layer except in Java, Python, and Go. Note that this implies versionstamped values may not be used with the Subspace and Directory layers except in those languages.
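As a sketch of the raw encoding for set_versionstamped_key (the b'log/' prefix is illustrative): the key carries 10 placeholder bytes for the versionstamp, followed by a 4-byte little-endian index telling the database where those placeholder bytes begin:
import struct
@fdb.transactional
def append_log(tr, payload):
    prefix = b'log/'
    # 10 placeholder bytes at offset len(prefix), then the 32-bit index.
    key = prefix + b'\x00' * 10 + struct.pack('<I', len(prefix))
    tr.set_versionstamped_key(key, payload)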
Committing
- Transaction.commit()
Attempt to commit the changes made in the transaction to the database. Returns a FutureVoid representing the asynchronous result of the commit. You must call the Future.wait() method on the returned FutureVoid, which will raise an exception if the commit failed.
As with other client/server databases, in some failure scenarios a client may be unable to determine whether a transaction succeeded. In these cases, Transaction.commit() will raise a commit_unknown_result exception. The Transaction.on_error() function treats this exception as retryable, so retry loops that don't check for commit_unknown_result could execute the transaction twice. In these cases, you must consider the idempotence of the transaction. For more information, see Transactions with unknown results.
Normally, commit will wait for outstanding reads to return. However, if those reads were snapshot reads or the transaction option for disabling “read-your-writes” has been invoked, any outstanding reads will immediately return errors.
Note
Consider using the @fdb.transactional decorator, which not only calls Database.create_transaction() or Tenant.create_transaction() and Transaction.commit() for you but also implements the required error handling and retry logic for transactions.
Warning
If any operation is performed on a transaction after a commit has been issued but before it has returned, both the commit and the operation will raise a used_during_commit exception. In this case, all subsequent operations on this transaction will raise this error until reset is called.
- Transaction.on_error(exception)
Determine whether an exception raised by a Transaction method is retryable. Returns a FutureVoid. You must call the Future.wait() method on the FutureVoid, which will return after a delay if the exception was retryable, or re-raise the exception if it was not.
Note
Consider using the @fdb.transactional decorator, which calls this method for you.
- Transaction.reset()
Rollback a transaction, completely resetting it to its initial state. This is logically equivalent to destroying the transaction and creating a new one.
- Transaction.cancel()
Cancels the transaction. All pending or future uses of the transaction will raise a transaction_cancelled exception. The transaction can be used again after it is reset.
Warning
Be careful if you are using Transaction.reset() and Transaction.cancel() concurrently with the same transaction. Since they negate each other's effects, a race condition between these calls will leave the transaction in an unknown state.
Warning
If your program attempts to cancel a transaction after Transaction.commit() has been called but before it returns, unpredictable behavior will result. While it is guaranteed that the transaction will eventually end up in a cancelled state, the commit may or may not occur. Moreover, even if the call to Transaction.commit() appears to raise a transaction_cancelled exception, the commit may have occurred or may occur in the future. This can make it more difficult to reason about the order in which transactions occur.
Watches
- Transaction.watch(key)
Creates a watch and returns a FutureVoid that will become ready when the watch reports a change to the value of the specified key.
A watch's behavior is relative to the transaction that created it. A watch will report a change in relation to the key's value as readable by that transaction. The initial value used for comparison is either that of the transaction's read version or the value as modified by the transaction itself prior to the creation of the watch. If the value changes and then changes back to its initial value, the watch might not report the change.
Until the transaction that created it has been committed, a watch will not report changes made by other transactions. In contrast, a watch will immediately report changes made by the transaction itself. Watches cannot be created if the transaction has set Transaction.options.set_read_your_writes_disable(), and an attempt to do so will raise a watches_disabled exception.
If the transaction used to create a watch encounters an exception during commit, then the watch will be set with that exception. A transaction whose commit result is unknown will set all of its watches with the commit_unknown_result exception. If an uncommitted transaction is reset or destroyed, then any watches it created will be set with the transaction_cancelled exception.
By default, each database connection can have no more than 10,000 watches that have not yet reported a change. When this number is exceeded, an attempt to create a watch will raise a too_many_watches exception. This limit can be changed using Database.options.set_max_watches(). Because a watch outlives the transaction that creates it, any watch that is no longer needed should be cancelled by calling Future.cancel() on its returned future.
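For example, a hedged sketch of waiting for another client to modify a key (the key name is illustrative):
import fdb
fdb.api_version(730)
db = fdb.open()
# Atomically write the key and create a watch in one transaction.
watch = db.set_and_watch(b'config/flag', b'on')
watch.wait()  # blocks until a later transaction changes the value
print('config/flag changed')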
Conflict ranges
Note
Most applications will use the strictly serializable isolation that transactions provide by default and will not need to manipulate conflict ranges.
The following make it possible to add conflict ranges to a transaction.
- Transaction.add_read_conflict_range(begin, end)
Adds a range of keys to the transaction’s read conflict ranges as if you had read the range. As a result, other transactions that write a key in this range could cause the transaction to fail with a conflict.
- Transaction.add_read_conflict_key(key)
Adds a key to the transaction’s read conflict ranges as if you had read the key. As a result, other transactions that concurrently write this key could cause the transaction to fail with a conflict.
- Transaction.add_write_conflict_range(begin, end)
Adds a range of keys to the transaction’s write conflict ranges as if you had cleared the range. As a result, other transactions that concurrently read a key in this range could fail with a conflict.
- Transaction.add_write_conflict_key(key)
Adds a key to the transaction’s write conflict ranges as if you had written the key. As a result, other transactions that concurrently read this key could fail with a conflict.
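For example, a sketch that makes a transaction conflict as if it had read a key it never actually read:
@fdb.transactional
def guarded_write(tr, guard_key, key, value):
    # Fail with a conflict if another transaction concurrently writes
    # guard_key, without paying for an actual read of it.
    tr.add_read_conflict_key(guard_key)
    tr[key] = value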
Versions
Most applications should use the read version that FoundationDB determines automatically during the transaction’s first read, and ignore all of these methods.
- Transaction.set_read_version(version)
Infrequently used. Sets the database version that the transaction will read from the database. The database cannot guarantee causal consistency if this method is used (the transaction’s reads will be causally consistent only if the provided read version has that property).
- Transaction.get_read_version()
Infrequently used. Returns a FutureVersion representing the transaction's (future) read version. You must call the Future.wait() method on the returned object to retrieve the version as an integer.
- Transaction.get_committed_version()
Infrequently used. Gets the version number at which a successful commit modified the database. This must be called only after the successful (non-error) completion of a call to Transaction.commit() on this Transaction, or the behavior is undefined. Read-only transactions do not modify the database when committed and will have a committed version of -1. Keep in mind that a transaction which reads keys and then sets them to their current values may be optimized to a read-only transaction.
- Transaction.get_versionstamp()
Infrequently used. Returns a future which will contain the versionstamp which was used by any versionstamp operations in this transaction. This function must be called before a call to Transaction.commit() on this Transaction. The future will be ready only after the successful completion of a call to Transaction.commit() on this Transaction. Read-only transactions do not modify the database when committed and will result in the future completing with an error. Keep in mind that a transaction which reads keys and then sets them to their current values may be optimized to a read-only transaction.
Transaction misc functions
- Transaction.get_estimated_range_size_bytes(begin_key, end_key)
Gets the estimated byte size of the given key range. Returns a FutureInt64.
Note
The estimated size is calculated based on sampling done by the FDB server. The sampling algorithm works roughly as follows: the larger a key-value pair is, the more likely it is to be sampled, and the more accurate its sampled size will be. For this reason, this API is most accurate when queried against large ranges. As a rough reference, if the returned size is larger than 3 MB, the size can be considered accurate.
- Transaction.get_range_split_points(begin_key, end_key, chunk_size)
Gets a list of keys that can split the given range into (roughly) equally sized chunks based on chunk_size. Returns a FutureKeyArray.
Note
The returned split points contain the start key and end key of the given range.
- Transaction.get_approximate_size()
Gets the approximate transaction size so far, which is the sum of the estimated size of mutations, read conflict ranges, and write conflict ranges. Returns a FutureInt64.
Transaction options
Transaction options alter the behavior of FoundationDB transactions. FoundationDB defaults to extremely safe transaction behavior, and we have worked hard to make the performance excellent with the default setting, so you should not often need to use transaction options.
- Transaction.options.set_snapshot_ryw_disable()
If this option is set more times in this transaction than the enable option, snapshot reads will not see the effects of prior writes in the same transaction. Note that prior to API version 300, this was the default behavior. This option can be disabled one or more times at the database level by calling Database.options.set_snapshot_ryw_disable().
- Transaction.options.set_snapshot_ryw_enable()
If this option is set at least as many times in this transaction as the disable option, snapshot reads will see the effects of prior writes in the same transaction. This option can be enabled one or more times at the database level by calling Database.options.set_snapshot_ryw_enable().
- Transaction.options.set_priority_batch()
This transaction should be treated as low priority (other transactions will be processed first). Batch priority transactions will also be throttled at load levels smaller than for other types of transactions and may be fully cut off in the event of machine failures. Useful for doing potentially saturating batch work without interfering with the latency of other operations.
- Transaction.options.set_priority_system_immediate()
This transaction should be treated as extremely high priority, taking priority over other transactions and bypassing controls on transaction queuing.
Warning
This is intended for the use of internal database functions and low-level tools; use by applications may result in severe database performance or availability problems.
- Transaction.options.set_causal_read_risky()
This transaction does not require the strict causal consistency guarantee that FoundationDB provides by default. The read version will be committed, and will usually be the latest committed, but might not be the latest committed in the event of a simultaneous fault and misbehaving clock. One can set this for all transactions by calling Database.options.set_transaction_causal_read_risky().
- Transaction.options.set_causal_write_risky()
The application either knows that this transaction will be self-conflicting (at least one read overlaps at least one set or clear), or is willing to accept a small risk that the transaction could be committed a second time after its commit apparently succeeds. This option provides a small performance benefit.
- Transaction.options.set_next_write_no_write_conflict_range()
The next write performed on this transaction will not generate a write conflict range. As a result, other transactions which read the key(s) being modified by the next write will not necessarily conflict with this transaction.
Note
Care needs to be taken when using this option on a transaction that is shared between multiple threads. When setting this option, write conflict ranges will be disabled on the next write operation, regardless of what thread it is on.
- Transaction.options.set_read_your_writes_disable()
When this option is invoked, a read performed by a transaction will not see any prior mutations that occurred in that transaction, instead seeing the value which was in the database at the transaction's read version. This option may provide a small performance benefit for the client, but also disables a number of client-side optimizations which are beneficial for transactions which tend to read and write the same keys within a single transaction.
Note
It is an error to set this option after performing any reads or writes on the transaction.
- Transaction.options.set_read_ahead_disable()
Disables read-ahead caching for range reads. Under normal operation, a transaction will read extra rows from the database into cache if range reads are used to page through a series of data one row at a time (i.e. if a range read with a one row limit is followed by another one row range read starting immediately after the result of the first).
- Transaction.options.set_access_system_keys()
Allows this transaction to read and modify system keys (those that start with the byte 0xFF).
Warning
Writing into system keys will likely break your database. Further, even for readers, the format of data in the system keys may change from version to version in FoundationDB.
- Transaction.options.set_read_system_keys()
Allows this transaction to read system keys (those that start with the byte 0xFF).
Warning
The format of data in the system keys may change from version to version in FoundationDB.
- Transaction.options.set_retry_limit()
Set a maximum number of retries after which additional calls to Transaction.on_error() will throw the most recently seen error code. (By default, a transaction permits an unlimited number of retries.) Valid parameter values are [-1, INT_MAX]. If set to -1, the transaction returns to the default of unlimited retries.
Prior to API version 610, like all other transaction options, the retry limit had to be reset after a call to Transaction.on_error(). If the API version is 610 or newer, the retry limit is not reset. Note that at all API versions, it is safe and legal to call this option after each call to Transaction.on_error(), so most code written assuming the older behavior can be upgraded without modification. This also means there is no need to introduce logic to conditionally set this option within retry loops. One can also set the default retry limit for all transactions by calling Database.options.set_transaction_retry_limit().
- Transaction.options.set_max_retry_delay()
Set the maximum backoff delay incurred in the call to Transaction.on_error() if the error is retryable. Prior to API version 610, like all other transaction options, the maximum retry delay had to be reset after a call to Transaction.on_error(). If the API version is 610 or newer, the maximum retry delay is not reset. Note that at all API versions, it is safe and legal to call this option after each call to Transaction.on_error(), so most code written assuming the older behavior can be upgraded without modification. This also means there is no need to introduce logic to conditionally set this option within retry loops. One can set the default maximum retry delay for all transactions by calling Database.options.set_transaction_max_retry_delay().
- Transaction.options.set_size_limit()
Set the transaction size limit in bytes. The size is calculated by combining the sizes of all keys and values written or mutated, all key ranges cleared, and all read and write conflict ranges. (In other words, it includes the total size of all data included in the request to the cluster to commit the transaction.) Large transactions can cause performance problems on FoundationDB clusters, so setting this limit to a smaller value than the default can help prevent the client from accidentally degrading the cluster’s performance. This value must be at least 32 and cannot be set to higher than 10,000,000, the default transaction size limit.
- Transaction.options.set_timeout()
Set a timeout duration in milliseconds after which the transaction will be automatically cancelled. The time is measured from transaction creation (or the most recent call to reset, if any). Valid parameter values are [0, INT_MAX]. If set to 0, all timeouts will be disabled. Once a transaction has timed out, all pending or future uses of the transaction will raise a transaction_timed_out exception. The transaction can be used again after it is reset.
Timeouts employ transaction cancellation, so you should note the issues raised by Transaction.cancel() when using timeouts.
Prior to API version 610, like all other transaction options, a timeout must be reset after a call to Transaction.on_error(). Note that resetting this option resets only the timeout duration, not the starting point from which the time is measured. If the API version is 610 or newer, then the timeout is not reset. This allows the user to specify a timeout for specific transactions that is longer than the timeout specified by Database.options.set_transaction_timeout(). Note that at all API versions, it is safe and legal to call this option after each call to Transaction.on_error(), so most code written assuming the older behavior can be upgraded without requiring any modification. This also means that there is no need to introduce logic to conditionally set this option within retry loops. One can set the default timeout for all transactions by calling Database.options.set_transaction_timeout().
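For example, a minimal sketch showing several of these options together (the option values here are arbitrary, and a reachable cluster is assumed):
import fdb
fdb.api_version(730)
db = fdb.open()

tr = db.create_transaction()
tr.options.set_timeout(5000)         # cancel automatically after 5 seconds
tr.options.set_retry_limit(10)       # on_error() raises after 10 retries
tr.options.set_max_retry_delay(500)  # cap the retry backoff at 500 ms
tr.options.set_size_limit(100000)    # limit this transaction to 100 kB

# Defaults for all transactions can instead be set once on the database:
db.options.set_transaction_timeout(60000)
db.options.set_transaction_retry_limit(100)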
- Transaction.options.set_transaction_logging_max_field_length(size_limit)
Sets the maximum escaped length of key and value fields to be logged to the trace file via the LOG_TRANSACTION option, after which the field will be truncated. A negative value disables truncation. One can set the default max field length for all transactions by calling
Database.options.set_transaction_logging_max_field_length()
.
- Transaction.options.set_debug_transaction_identifier(id_string)
Sets a client provided string identifier for the transaction that will be used in scenarios like tracing or profiling. Client trace logging or transaction profiling must be separately enabled.
- Transaction.options.set_log_transaction()
Enables tracing for this transaction and logs results to the client trace logs. The DEBUG_TRANSACTION_IDENTIFIER option must be set before using this option, and client trace logging must be enabled to get log output.
Future objects
Many FoundationDB API functions return “future” objects. A brief overview of futures is included in the class scheduling tutorial. Most future objects behave just like a normal object, but block when you use them for the first time if the asynchronous function which returned the future has not yet completed its action. A future object is considered ready when either a value is available, or when an error has occurred.
When a future object “blocks”, what actually happens is determined by the event model. A threaded program will block a thread, but a program using the gevent model will block a greenlet.
All future objects are a subclass of the Future
type.
- class fdb.Future
- Future.wait()
Blocks until the object is ready, and returns the object value (or raises an exception if the asynchronous function failed).
- Future.is_ready()
Immediately returns true if the future object is ready, false otherwise.
- Future.block_until_ready()
Blocks until the future object is ready.
- Future.on_ready(callback)
Calls the specified callback function, passing itself as a single argument, when the future object is ready. If the future object is ready at the time on_ready() is called, the call may occur immediately in the current thread (although this behavior is not guaranteed). Otherwise, the call may be delayed and take place on the thread with which the client was initialized. Therefore, the callback is responsible for any needed thread synchronization (and/or for posting work to your application’s event loop, thread pool, etc., as may be required by your application’s architecture).
Note
This function guarantees the callback will be executed at most once.
Warning
There are a number of requirements and constraints to be aware of when using callbacks with FoundationDB. Please read Programming with futures.
- Future.cancel()
Cancels a future and its associated asynchronous operation. If called before the future is ready, attempts to access its value will raise an operation_cancelled exception. Cancelling a future which is already ready has no effect. Note that even if a future is not ready, its associated asynchronous operation may have successfully completed and be unable to be cancelled.
- static Future.wait_for_any(*futures)
Does not return until at least one of the given future objects is ready. Returns the index in the parameter list of a ready future object.
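For example, a minimal sketch of issuing reads in parallel through futures (assumes an open database db; the keys are arbitrary):
tr = db.create_transaction()
a = tr[b'a']   # returns a fdb.Value future; does not block yet
b = tr[b'b']   # a second read, now in flight concurrently

# Block until one of the two reads completes; returns its index (0 or 1)
first_ready = fdb.Future.wait_for_any(a, b)

# Using a value (or calling block_until_ready) blocks until it is ready
a.block_until_ready()
print(a.present(), b.present())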
Asynchronous methods return one of the following subclasses of Future
:
- class fdb.Value
Represents a future string object and responds to the same methods as a string in Python. It may be passed to FoundationDB methods that expect a string.
- Value.present()
Returns False if the key used to request this value was not present in the database. For example:
@fdb.transactional
def foo(tr):
    val = tr[b'foo']
    if val.present():
        print('Got value: %s' % val)
    else:
        print('foo was not present')
- class fdb.Key
Represents a future string object and responds to the same methods as a string in Python. It may be passed to FoundationDB methods that expect a string.
- class fdb.FutureInt64
Represents a future integer. You must call the
Future.wait()
method on this object to retrieve the integer.
- class fdb.FutureStringArray
Represents a future list of strings. You must call the
Future.wait()
method on this object to retrieve the list of strings.
- class fdb.FutureVoid
Represents a future returned from asynchronous methods that logically have no return value.
For a FutureVoid object returned by Transaction.commit() or Transaction.on_error(), you must call the Future.wait() method, which will either raise an exception if an error occurred during the asynchronous call, or do nothing and return None.
Streaming modes
- fdb.StreamingMode
When using Transaction.get_range()
and similar interfaces, API clients can request large ranges of the database to iterate over. Making such a request doesn’t necessarily mean that the client will consume all of the data in the range - sometimes the client doesn’t know how far it intends to iterate in advance. FoundationDB tries to balance latency and bandwidth by requesting data for iteration in batches.
Streaming modes permit the API client to customize this performance tradeoff by providing extra information about how the iterator will be used.
The following streaming modes are available:
- StreamingMode.iterator
The default. The client doesn’t know how much of the range it is likely to use and wants different performance concerns to be balanced.
Only a small portion of data is transferred to the client initially (in order to minimize costs if the client doesn’t read the entire range), and as the caller iterates over more items in the range larger batches will be transferred in order to maximize throughput.
- StreamingMode.want_all
The client intends to consume the entire range and would like it all transferred as early as possible.
- StreamingMode.small
Infrequently used. Transfer data in batches small enough to not be much more expensive than reading individual rows, to minimize cost if iteration stops early.
- StreamingMode.medium
Infrequently used. Transfer data in batches sized in between small and large.
- StreamingMode.large
Infrequently used. Transfer data in batches large enough to be, in a high-concurrency environment, nearly as efficient as possible. If the client stops iteration early, some disk and network bandwidth may be wasted. The batch size may still be too small to allow a single client to get high throughput from the database, so if that is what you need consider
StreamingMode.serial
.
- StreamingMode.serial
Transfer data in batches large enough that an individual client can get reasonable read bandwidth from the database. If the client stops iteration early, considerable disk and network bandwidth may be wasted.
- StreamingMode.exact
Infrequently used. The client has passed a specific row limit and wants that many rows delivered in a single batch. This is not particularly useful in Python because iterator functionality makes batches of data transparent, so use
StreamingMode.want_all
instead.
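For example, a sketch of passing a streaming mode to Transaction.get_range() (the key range here is arbitrary):
@fdb.transactional
def read_everything(tr):
    # The client intends to consume the whole range, so request large
    # batches up front instead of the adaptive iterator default.
    return [(k, v) for k, v in
            tr.get_range(b'', b'\xff', streaming_mode=fdb.StreamingMode.want_all)]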
Event models
By default, the FoundationDB Python API assumes that the calling program uses threads (as provided by the threading
module) for concurrency. This means that blocking operations will block the current Python thread. This behavior can be changed by specifying the optional event_model
parameter to the open()
function.
The following event models are available:
event_model=None
The default. Blocking operations will block the current Python thread. This is also fine for programs without any form of concurrency.
event_model="gevent"
The calling program uses the gevent module for single-threaded concurrency. Blocking operations will block the current greenlet.
The FoundationDB Python API has been tested with gevent versions 0.13.8 and 1.0rc2 and should work with all gevent 0.13 and 1.0 releases.
Note
The gevent event model on Windows requires gevent 1.0 or newer.
event_model="debug"
The calling program is threaded, but needs to be interruptible (by Ctrl-C). Blocking operations will poll, effectively blocking the current thread but responding to keyboard interrupts. This model is inefficient, but can be very useful for debugging.
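For example, a sketch of opening a database under the gevent event model (the keys written are arbitrary):
import gevent
import fdb

fdb.api_version(730)
db = fdb.open(event_model="gevent")

def worker(n):
    # Blocking operations yield to other greenlets rather than
    # blocking the whole thread.
    db[b'key-' + str(n).encode()] = b'value'

gevent.joinall([gevent.spawn(worker, i) for i in range(4)])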
Errors
Errors in the FoundationDB API are raised as exceptions of type FDBError
. These errors may be displayed for diagnostic purposes, but generally should be passed to Transaction.on_error()
. When using @fdb.transactional
, appropriate errors will be retried automatically.
- class fdb.FDBError
- FDBError.code
An integer associated with the error type.
- FDBError.description
A somewhat human-readable description of the error.
Warning
You should use only FDBError.code
for programmatic comparisons, as the description of the error may change at any time. Whenever possible, use the Transaction.on_error()
method to handle FDBError
exceptions.
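For example, @fdb.transactional is roughly equivalent to the following explicit retry loop (a sketch; assumes an open database db):
tr = db.create_transaction()
while True:
    try:
        tr[b'counter'] = b'1'
        tr.commit().wait()
        break
    except fdb.FDBError as e:
        # on_error() either backs off and resets the transaction for a
        # retry, or re-raises the error if it is not retryable.
        tr.on_error(e).wait()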
Tuple layer
The FoundationDB API comes with a built-in layer for encoding tuples into keys usable by FoundationDB. The encoded key maintains the same sort order as the original tuple: sorted first by the first element, then by the second element, etc. This makes the tuple layer ideal for building a variety of higher-level data models.
Note
For general guidance on tuple usage, see the discussion in the document on Data Modeling.
The tuple layer in the FoundationDB Python API supports tuples that contain elements of the following data types:
Type | Legal Values
---|---
None | Any None value
Byte string | Any bytes value (str in Python 2, bytes in Python 3)
Unicode string | Any unicode value (unicode in Python 2, str in Python 3)
Integer | Python 2.7: any int or long; Python 3: any int
Floating point number (single-precision) | Any fdb.tuple.SingleFloat or ctypes.c_float value
Floating point number (double-precision) | Any float or ctypes.c_double value
Boolean | Any bool value
UUID | Any uuid.UUID value
Versionstamp | Any fdb.tuple.Versionstamp value
Tuple or List | Any tuple or list whose elements are themselves legal values
If T
is a Python tuple meeting these criteria, then:
fdb.tuple.compare(T, fdb.tuple.unpack(fdb.tuple.pack(T))) == 0
That is, any tuple meeting these criteria will have the same semantic value if serialized and deserialized. For
the most part, this also implies that T == fdb.tuple.unpack(fdb.tuple.pack(T))
with the following caveats:
- Any value of type ctypes.c_double is converted to the Python float type, but value.value == fdb.tuple.unpack(fdb.tuple.pack((value,)))[0] will be true (as long as value is not NaN).
- Any value of type ctypes.c_float is converted into a fdb.tuple.SingleFloat instance, but value.value == fdb.tuple.unpack(fdb.tuple.pack((value,)))[0].value will be true (as long as value.value is not NaN).
- Any value of type list or tuple is converted to a tuple type where the elements of the serialized and deserialized value will be equal (subject to these caveats) to the elements of the original value.
import fdb.tuple
Imports the FoundationDB tuple layer.
- fdb.tuple.pack(tuple, prefix=b'')
Returns a key (byte string) encoding the specified tuple. If prefix is set, it will prefix the serialized bytes with the prefix string. This throws an error if any of the tuple’s items are incomplete Versionstamp instances.
- fdb.tuple.pack_with_versionstamp(tuple, prefix=b'')
Returns a key (byte string) encoding the specified tuple. This method will throw an error unless exactly one of the items of the tuple is an incomplete Versionstamp instance. (It will recurse down nested tuples if there are any to find one.) If so, it will produce a byte string that can be fed into fdb.Transaction.set_versionstamped_key() and correctly fill in the versionstamp information at commit time so that when the key is re-read and deserialized, the only difference is that the Versionstamp instance is complete and has the transaction version filled in. This throws an error if there are no incomplete Versionstamp instances in the tuple or if there is more than one.
- fdb.tuple.unpack(key)
Returns the tuple encoded by the given key.
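For example, a round trip through the tuple layer (the values are chosen to serialize exactly):
import fdb.tuple

t = (u'user', 42, 3.5)
key = fdb.tuple.pack(t)
assert fdb.tuple.unpack(key) == t
assert fdb.tuple.compare(t, fdb.tuple.unpack(key)) == 0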
- fdb.tuple.has_incomplete_versionstamp(tuple)
Returns True if there is at least one element contained within the tuple that is an incomplete Versionstamp instance. If there are multiple incomplete Versionstamp instances, this method will return True, but trying to pack it into a byte string will result in an error.
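For example, a sketch of how the two pack functions treat incomplete versionstamps:
t = (u'prefix', fdb.tuple.Versionstamp())   # contains one incomplete versionstamp
assert fdb.tuple.has_incomplete_versionstamp(t)

key = fdb.tuple.pack_with_versionstamp(t)   # ok: exactly one incomplete instance
# fdb.tuple.pack(t) would raise, since it forbids incomplete versionstamps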
- fdb.tuple.range(tuple)
Returns a Python slice object representing all keys that encode tuples strictly starting with tuple (that is, all tuples of greater length than tuple of which tuple is a prefix).
Can be used to directly index a Transaction object to retrieve a range. For example:
tr[ fdb.tuple.range(('A',2)) ]
returns all key-value pairs in the database whose keys would unpack to tuples like (‘A’, 2, x), (‘A’, 2, x, y), etc.
- fdb.tuple.compare(tuple1, tuple2)
Compares two tuples in a way that respects the natural ordering of the elements within the tuples. It will return -1 if tuple1 would sort before tuple2 when performing an element-wise comparison of the two tuples, it will return 1 if tuple1 would sort after tuple2, and it will return 0 if the two tuples are equivalent. If the function must compare two elements of different types while doing the comparison, it will sort the elements based on their internal type codes, so comparisons are consistent if not necessarily semantically meaningful.
Strings are sorted on their byte representation when encoded into UTF-8 (which may differ from the default sort when non-ASCII characters are included within the string), and UUIDs are sorted based on their big-endian byte representation. Single-precision floating point numbers are sorted before all double-precision floating point numbers, and for floating point numbers, -NaN is sorted before -Infinity which is sorted before finite numbers which are sorted before Infinity which is sorted before NaN. Different representations of NaN are not treated as equal.
Additionally, the tuple serialization contract is such that after they are serialized, the byte-string representations of tuple1 and tuple2 will sort in a manner that is consistent with this function. In particular, this function obeys the following contract:
fdb.tuple.compare(tuple1, tuple2) == -1 if fdb.tuple.pack(tuple1) < fdb.tuple.pack(tuple2) else \
                                      0 if fdb.tuple.pack(tuple1) == fdb.tuple.pack(tuple2) else 1
As byte order is the comparator used within the database, this comparator can be used to determine the order of keys within the database.
- class fdb.tuple.SingleFloat(value)
Wrapper around a single-precision floating point value. When constructed, the value parameter should either be an integral value, a float, or a ctypes.c_float. It will then properly store the value in its SingleFloat.value field (which should not be mutated). If the value does not fit within an IEEE 754 single-precision floating point number, there may be a loss of precision.
- SingleFloat.value
The underlying value of the
SingleFloat
object. This will have typefloat
.
- SingleFloat.__eq__(other)
- SingleFloat.__ne__(other)
- SingleFloat.__lt__(other)
- SingleFloat.__le__(other)
- SingleFloat.__gt__(other)
- SingleFloat.__ge__(other)
Comparison functions for SingleFloat objects. These will sort according to the byte representation of the object rather than using standard float comparison. In particular, this means that -0.0 != 0.0 and that NaN values will sort in a way that is consistent with the compare() method between tuples rather than using standard floating-point comparison.
- class fdb.tuple.Versionstamp(tr_version=None, user_version=0)
Used to represent values written by versionstamp operations within the tuple layer. This wraps a single byte array of length 12 that can be used to represent some global order of items within the database. These versions are composed of two separate components: (1) the 10-byte tr_version and (2) the two-byte user_version.
The tr_version is set by the database, and it is used to impose an order between different transactions. This order is guaranteed to be monotonically increasing over time for a given database. (In particular, it imposes an order that is consistent with a serialization order of the database’s transactions.) If the client elects to leave the tr_version as its default value of None, then the Versionstamp is considered “incomplete”. This will cause the first 10 bytes of the serialized Versionstamp to be filled in with dummy bytes when serialized. When used with fdb.Transaction.set_versionstamped_key(), an incomplete version can be used to ensure that a key gets written with the current transaction’s version, which can be useful for maintaining append-only data structures within the database. If the tr_version is set to something that is not None, it should be set to a byte array of length 10. In this case, the Versionstamp is considered “complete”. This is the usual case when one reads a serialized Versionstamp from the database.
The user_version should be specified as an integer, but it must fit within a two-byte unsigned integer. It is set by the client, and it is used to impose an order between items serialized within a single transaction. If left unset, the final two bytes of the serialized Versionstamp are filled in with a default (constant) value.
Sample usage of this class might be something like this:
@fdb.transactional
def write_versionstamp(tr, prefix):
    tr.set_versionstamped_key(fdb.tuple.pack_with_versionstamp((prefix, fdb.tuple.Versionstamp())), b'')
    return tr.get_versionstamp()

@fdb.transactional
def read_versionstamp(tr, prefix):
    subspace = fdb.Subspace((prefix,))
    for k, _ in tr.get_range(subspace.range().start, subspace.range().stop, 1):
        return subspace.unpack(k)[0]
    return None

db = fdb.open()
del db[fdb.tuple.range(('prefix',))]
tr_version = write_versionstamp(db, 'prefix').wait()
v = read_versionstamp(db, 'prefix')
assert v == fdb.tuple.Versionstamp(tr_version=tr_version)
Here, we serialize an incomplete Versionstamp and then write it using the set_versionstamped_key mutation so that it picks up the transaction’s version information. Then when we read it back, we get a complete Versionstamp with the committed transaction’s version.
- Versionstamp.tr_version
The inter-transaction component of the Versionstamp class. It should be either None (to indicate an incomplete Versionstamp that will set the version later) or some 10-byte value indicating the commit version and batch version of some transaction.
- Versionstamp.user_version
The intra-transaction component of the
Versionstamp
class. It should be some number that can fit within two bytes (i.e., between 0 and 65,535 inclusive). It can be used to impose an order between items that are committed together in the same transaction. If left unset, then the versionstamp is assigned a (constant) default user version value.
- Versionstamp.from_bytes(bytes)
Static initializer for Versionstamp instances that takes a serialized Versionstamp and creates an instance of the class. The bytes parameter should be a byte string of length 12. This method will deserialize the version as a “complete” Versionstamp unless the transaction version bytes are equal to the dummy transaction version assigned to incomplete Versionstamps.
- Versionstamp.is_complete()
Returns whether this version has been given a (non-None) tr_version or not.
- Versionstamp.completed(tr_version)
If this Versionstamp is incomplete, this returns a copy of this instance except that the tr_version is filled in with the passed parameter. If the Versionstamp is already complete, it will raise an error.
- Versionstamp.to_bytes()
Produces a serialized byte string corresponding to this versionstamp. It will have length 12 and will combine the tr_version and user_version to produce a byte string that lexicographically sorts appropriately with other Versionstamp instances. If this instance is incomplete, then the tr_version component gets filled in with dummy bytes that will cause it to sort after every complete Versionstamp’s serialized bytes.
- Versionstamp.__eq__(other)
- Versionstamp.__ne__(other)
- Versionstamp.__lt__(other)
- Versionstamp.__le__(other)
- Versionstamp.__gt__(other)
- Versionstamp.__ge__(other)
Comparison functions for Versionstamp objects. For two complete Versionstamps, the ordering is first lexicographically by tr_version and then by user_version. Incomplete Versionstamps are defined to sort after all complete Versionstamps (the idea being that for a given transaction, if a Versionstamp has been created as the result of some prior transaction’s work, then the incomplete Versionstamp, when assigned a version, will be assigned a greater version than the existing one), and for two incomplete Versionstamps, the order is by user_version only.
Subspaces
Subspaces provide a convenient way to use the tuple layer to define namespaces for different categories of data. The namespace is specified by a prefix tuple which is prepended to all tuples packed by the subspace. When unpacking a key with the subspace, the prefix tuple will be removed from the result.
As a best practice, API clients should use at least one subspace for application data.
Note
For general guidance on subspace usage, see the discussion in the Developer Guide.
- class fdb.Subspace(prefixTuple=tuple(), rawPrefix='')
Creates a subspace with the specified prefix tuple. If the raw prefix byte string is specified, then it will be prepended to all packed keys. Likewise, the raw prefix will be removed from all unpacked keys.
- Subspace.key()
Returns the key encoding the prefix used for the subspace. This is equivalent to packing the empty tuple.
- Subspace.pack(tuple=tuple())
Returns the key encoding the specified tuple in the subspace. For example, if you have a subspace with prefix tuple ('users',) and you use it to pack the tuple ('Smith',), the result is the same as if you packed the tuple ('users', 'Smith') with the tuple layer.
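A short sketch of this equivalence:
users = fdb.Subspace(('users',))
assert users.pack(('Smith',)) == fdb.tuple.pack(('users', 'Smith'))
assert users.unpack(users.pack(('Smith',))) == ('Smith',)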
- Subspace.pack_with_versionstamp(tuple)
Returns the key encoding the specified tuple in the subspace so that it may be used as the key in the fdb.Transaction.set_versionstamped_key() method. The passed tuple must contain exactly one incomplete fdb.tuple.Versionstamp instance or the method will raise an error. The behavior here is the same as if one used the fdb.tuple.pack_with_versionstamp() method to appropriately pack together this subspace and the passed tuple.
- Subspace.unpack(key)
Returns the tuple encoded by the given key, with the subspace’s prefix tuple and raw prefix removed.
- Subspace.range(tuple=tuple())
Returns a range representing all keys in the subspace that encode tuples strictly starting with the specified tuple.
The range will be returned as a Python slice object, and may be used with any FoundationDB methods that require a range:
r = subspace.range(('A', 2))
rng_itr1 = tr[r]
rng_itr2 = tr.get_range(r.start, r.stop, limit=1)
- Subspace.contains(key)
Returns true if key starts with Subspace.key(), indicating that the subspace logically contains key.
- Subspace.as_foundationdb_key()
Returns the key encoding the prefix used for the subspace, like
Subspace.key()
. This method serves to support the as_foundationdb_key() convenience interface.
- Subspace.subspace(tuple)
Returns a new subspace which is equivalent to this subspace with its prefix tuple extended by the specified tuple.
x = subspace[item]
Shorthand for x = subspace.subspace((item,)). This function can be combined with the Subspace.as_foundationdb_key() convenience to turn this:
s = fdb.Subspace(('x',))
tr[s.pack(('foo', 'bar', 1))] = ''
into this:
s = fdb.Subspace(('x',))
tr[s['foo']['bar'][1]] = ''
Directories
The FoundationDB API provides directories as a tool for managing related subspaces. Directories are a recommended approach for administering applications. Each application should create or open at least one directory to manage its subspaces.
Note
For general guidance on directory usage, see the discussion in the Developer Guide.
Directories are identified by hierarchical paths analogous to the paths in a Unix-like file system. A path is represented as a tuple of strings. Each directory has an associated subspace used to store its content. The directory layer maps each path to a short prefix used for the corresponding subspace. In effect, directories provide a level of indirection for access to subspaces.
Except where noted, directory methods interpret the provided path(s) relative to the path of the directory object. When opening a directory, a byte string layer
option may be specified as a metadata identifier.
- fdb.directory
The default instance of
DirectoryLayer
.
- class fdb.DirectoryLayer(node_subspace=Subspace(rawPrefix='\xfe'), content_subspace=Subspace(), allow_manual_prefixes=False)
Each instance defines a new root directory. The subspaces node_subspace and content_subspace control where the directory metadata and contents, respectively, are stored. The default root directory has a node_subspace with raw prefix \xFE and a content_subspace with no prefix. Specifying more restrictive values for node_subspace and content_subspace will allow using the directory layer alongside other content in a database. If allow_manual_prefixes is false, attempts to create a directory with a manual prefix under the directory layer will raise an exception. The default root directory does not allow manual prefixes.
- DirectoryLayer.create_or_open(tr, path, layer=None)
Opens the directory with path specified as a tuple of strings. path can also be a string, in which case it will be automatically wrapped in a tuple. All string values in a path will be converted to unicode. If the directory does not exist, it is created (creating parent directories if necessary).
If the byte string layer is specified and the directory is new, it is recorded as the layer; if layer is specified and the directory already exists, it is compared against the layer specified when the directory was created, and the method will raise an exception if they differ.
Returns the directory and its contents as a DirectorySubspace.
- DirectoryLayer.open(tr, path, layer=None)
Opens the directory with path specified as a tuple of strings. path can also be a string, in which case it will be automatically wrapped in a tuple. All string values in a path will be converted to unicode. The method will raise an exception if the directory does not exist.
If the byte string layer is specified, it is compared against the layer specified when the directory was created, and the method will raise an exception if they differ.
Returns the directory and its contents as a DirectorySubspace.
- DirectoryLayer.create(tr, path, layer=None, prefix=None)
Creates a directory with path specified as a tuple of strings. path can also be a string, in which case it will be automatically wrapped in a tuple. All string values in a path will be converted to unicode. Parent directories are created if necessary. The method will raise an exception if the given directory already exists.
If the byte string prefix is specified, the directory is created with the given physical prefix; otherwise a prefix is allocated automatically.
If the byte string layer is specified, it is recorded with the directory and will be checked by future calls to open.
Returns the directory and its contents as a DirectorySubspace.
- DirectoryLayer.move(tr, old_path, new_path)
Moves the directory at old_path to new_path. There is no effect on the physical prefix of the given directory or on clients that already have the directory open. The method will raise an exception if a directory does not exist at old_path, a directory already exists at new_path, or the parent directory of new_path does not exist.
Returns the directory at its new location as a DirectorySubspace.
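For example, a sketch of renaming a directory without touching its contents (paths are hypothetical):
fdb.directory.create_or_open(db, ('myapp', 'users'))
renamed = fdb.directory.move(db, ('myapp', 'users'), ('myapp', 'members'))
# The physical prefix is unchanged, so existing data remains reachable
# through the returned DirectorySubspace.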
- DirectoryLayer.remove(tr, path)
Removes the directory at path, its contents, and all subdirectories. The method will raise an exception if the directory does not exist.
Warning
Clients that have already opened the directory might still insert data into its contents after removal.
- DirectoryLayer.remove_if_exists(tr, path)
Checks if the directory at path exists and, if so, removes the directory, its contents, and all subdirectories. Returns true if the directory existed and false otherwise.
Warning
Clients that have already opened the directory might still insert data into its contents after removal.
- DirectoryLayer.list(tr, path=())
Returns a list of names of the immediate subdirectories of the directory at
path
. Each name is a unicode string representing the last component of a subdirectory’s path.
- DirectoryLayer.exists(tr, path)
Returns
true
if the directory atpath
exists andfalse
otherwise.
- DirectoryLayer.get_layer()
Returns the layer specified when the directory was created.
- DirectoryLayer.get_path()
Returns the path with which the directory was opened.
DirectorySubspace
A directory subspace represents a specific directory and its contents. It stores the path
with which it was opened and supports all DirectoryLayer
methods for operating on itself and its subdirectories. It also implements all Subspace
methods for working with the contents of that directory.
- DirectorySubspace.move_to(tr, new_path)
Moves this directory to new_path, interpreting new_path absolutely. There is no effect on the physical prefix of the given directory or on clients that already have the directory open. The method will raise an exception if a directory already exists at new_path or the parent directory of new_path does not exist.
Returns the directory at its new location as a DirectorySubspace.
Locality information
The FoundationDB API comes with a set of functions for discovering the storage locations of keys within your cluster. This information can be useful for advanced users who wish to take into account the location of keys in the design of applications or processes.
- fdb.locality.get_boundary_keys(db_or_tr, begin, end)
Returns a generator of keys k such that begin <= k < end and k is located at the start of a contiguous range stored on a single server.
The first parameter to this function may be either a Database or a Transaction. If it is passed a Transaction, the transaction will not be committed, reset, or modified in any way, nor will its transaction options (such as retry limit) be applied within the function. However, if the database is unavailable prior to the function call, any timeout set on the transaction will still trigger.
Like a Future object, the returned container issues asynchronous read operations to fetch the data in the range and may block while iterating over its values if the read has not completed.
This method is not transactional. It will return an answer no older than the Transaction or Database object it is passed, but the returned boundaries are an estimate and may not represent the exact boundary locations at any database version.
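For example, a sketch that estimates how many shards a key range spans (assumes an open database db):
boundaries = list(fdb.locality.get_boundary_keys(db, b'', b'\xff'))
print('range spans roughly %d shard(s)' % max(len(boundaries), 1))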
- fdb.locality.get_addresses_for_key(tr, key)
Returns a fdb.FutureStringArray. You must call the fdb.Future.wait() method on this object to retrieve a list of public network addresses as strings, one for each of the storage servers responsible for storing key and its associated value.
Tenant management
The FoundationDB API includes functions to manage the set of tenants in a cluster.
- fdb.tenant_management.create_tenant(db_or_tr, tenant_name)
Creates a new tenant in the cluster.
The tenant name can be either a byte string or a tuple and cannot start with the \xff byte. If a tuple is provided, the tuple will be packed using the tuple layer to generate the byte string tenant name.
If a database is provided to this function for the db_or_tr parameter, then this function will first check if the tenant already exists. If it does, it will fail with a tenant_already_exists error. Otherwise, it will create a transaction and attempt to create the tenant in a retry loop. If the tenant is created concurrently by another transaction, this function may still return successfully.
If a transaction is provided to this function for the db_or_tr parameter, then this function will not check if the tenant already exists. It is up to the user to perform that check if required. The user must also successfully commit the transaction in order for the creation to take effect.
- fdb.tenant_management.delete_tenant(db_or_tr, tenant_name)
Deletes a tenant from the cluster.
The tenant name can be either a byte string or a tuple. If a tuple is provided, the tuple will be packed using the tuple layer to generate the byte string tenant name.
It is an error to delete a tenant that still has data. To delete a non-empty tenant, first clear all of the keys in the tenant.
If a database is provided to this function for the db_or_tr parameter, then this function will first check if the tenant exists. If it does not, it will fail with a tenant_not_found error. Otherwise, it will create a transaction and attempt to delete the tenant in a retry loop. If the tenant is deleted concurrently by another transaction, this function may still return successfully.
If a transaction is provided to this function for the db_or_tr parameter, then this function will not check if the tenant exists. It is up to the user to perform that check if required. The user must also successfully commit the transaction in order for the deletion to take effect.
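For example, a sketch of a tenant’s lifecycle (this assumes the cluster has tenant support enabled and that Database.open_tenant() is available to access the tenant’s keyspace):
fdb.tenant_management.create_tenant(db, b'tenant1')

tenant = db.open_tenant(b'tenant1')
tenant[b'hello'] = b'world'   # reads and writes operate within the tenant

del tenant[b'hello']          # a tenant must be empty before deletion
fdb.tenant_management.delete_tenant(db, b'tenant1')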