Release Notes
7.0.0
Features
First release of the Redwood Storage Engine, a BTree storage engine with higher throughput and lower write amplification than SQLite. See Redwood Storage Engine Documentation for details.
Replaced broadcasting the committed version through the proxies with centralizing live committed versions in the master. (PR #3307)
Added a new API in all bindings that can be used to get a list of split points that will split the given range into (roughly) equally sized chunks. (PR #3394)
Introduced a new role called GRV proxy specialized for serving GRV requests to decrease GRV tail latency since we prioritize commit paths over GRV in the current proxy. The original proxy is renamed to Commit proxy. (PR #3549) (PR #3772)
Added support for writing backup files directly to Azure blob storage. This is not yet performance tested on large-scale clusters. (PR #3961)
Tag-based throttling now also takes the write path into account. (PR #3512)
Added the ability for ratekeeper to throttle certain types of tags based on write hot spots in addition to read hot spots. (PR #3571)
Users can now use fdbcli to make ratekeeper recommend which transaction tags should be throttled without actually throttling them. (PR #3669)
Added a new --build_flags option to binaries to print build information. (PR #3769)
Added --incremental option to backup and restore that allows specification of only recording mutation log files and not range files. Incremental restore also allows restoring to a non-empty destination database. (PR #3676)
Added a tracing framework to track request latency through each FDB component. See Documentation for details. (PR #3329)
Added the Global Configuration Framework, an eventually consistent configuration mechanism to efficiently make runtime changes to all clients and servers. (PR #4330)
Added the ability to monitor and manage an fdb cluster via read/write specific special keys through transactions. See Documentation for details. (PR #3455)
Added TLS support to fdbdecode for decoding mutation log files stored in blobs. (PR #4611)
Added initial_snapshot_interval to fdbbackup that can specify the duration of the first inconsistent snapshot written to the backup. (PR #4620)
Added inconsistent_snapshot_only to fdbbackup that ignores mutation log files and only uses range files during the restore to speed up the process. (PR #4704)
Added the Testing Storage Server (TSS), which allows FoundationDB to run an “untrusted” storage engine with identical workload to the current storage engine, with zero impact on durability or correctness, and minimal impact on performance. (Documentation) (PR #4556)
Added perpetual storage wiggle, which supports less impactful B-tree recreation and data migration. It will also be used for deploying the Testing Storage Server, which compares two storage engines’ results. See Documentation for details. (PR #4838)
Improved the efficiency with which storage servers replicate data between themselves. (PR #5017)
Added support to the exclude command for excluding based on locality match. (PR #5113)
Added the trace_partial_file_suffix network option. This option gives unfinished trace files a special suffix to indicate they’re not complete yet. When the trace file is complete, it is renamed to remove the suffix. (PR #5328)
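The suffix-then-rename behavior described for trace_partial_file_suffix can be illustrated with a minimal Python sketch. This is not FoundationDB's implementation; the ".tmp" suffix value, the file names, and both helper functions are assumptions made up for the example. It only shows why a log collector that skips files carrying the suffix never sees a half-written trace file.

```python
import os

# Hypothetical suffix value; the real value is whatever the
# trace_partial_file_suffix network option is set to.
PARTIAL_SUFFIX = ".tmp"


def write_trace_file(directory, name, lines, partial_suffix=PARTIAL_SUFFIX):
    """Write a file under a partial name, then rename it once it is complete."""
    final_path = os.path.join(directory, name)
    partial_path = final_path + partial_suffix
    with open(partial_path, "w") as f:
        for line in lines:
            f.write(line + "\n")
    # The rename removes the suffix, marking the file as finished.
    os.replace(partial_path, final_path)
    return final_path


def complete_trace_files(directory, partial_suffix=PARTIAL_SUFFIX):
    """List only files that no longer carry the partial suffix."""
    return sorted(
        n for n in os.listdir(directory) if not n.endswith(partial_suffix)
    )
```

A collector built this way would call complete_trace_files on each pass and ignore anything still being written.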
Performance
Improved Deque copy performance. (PR #3197)
Increased performance of dr_agent when copying the mutation log. The COPY_LOG_BLOCK_SIZE, COPY_LOG_BLOCKS_PER_TASK, COPY_LOG_PREFETCH_BLOCKS, COPY_LOG_READ_AHEAD_BYTES and COPY_LOG_TASK_DURATION_NANOS knobs can be set. (PR #3436)
Added multiple new microbenchmarks for PromiseStream, Reference, IRandom, and timer, as well as support for benchmarking actors. (PR #3590)
Use xxhash3 for SQLite page checksums. (PR #4075)
fdbserver now uses jemalloc on Linux instead of the system malloc. (PR #4222)
Watches have been optimized and are now significantly cheaper. (PR #4266) (PR #4382)
The Coro library has been replaced with boost::coroutine2. (PR #4242)
Reduce CPU overhead of load balancing on client processes. (PR #4561)
Used the restored key range to filter out files for faster restore. (PR #4568)
Transaction log files will be truncated by default if they are under 2GB in size. (PR #4656)
Reduced the number of connections required by the multi-version client when loading external clients. When connecting to 7.0 clusters, only one connection with version 6.2 or larger will be used. With older clusters, at most two connections with version 6.2 or larger will be used. Clients older than version 6.2 will continue to create an additional connection each. (PR #4667)
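The connection rule for the multi-version client can be read as simple arithmetic. The sketch below is one interpretation of that rule, not code from FoundationDB: the function name and the tuple representation of versions are hypothetical, and treating any cluster at version 7.0 or later the same as a 7.0 cluster is an assumption. It returns an upper bound, since the note says "at most" for older clusters.

```python
def max_connections(client_versions, cluster_version):
    """Upper bound on connections opened by the loaded client libraries.

    client_versions: list of (major, minor) tuples, one per loaded client.
    cluster_version: (major, minor) tuple of the cluster being connected to.
    """
    modern = [v for v in client_versions if v >= (6, 2)]
    legacy = [v for v in client_versions if v < (6, 2)]
    # Assumption: clusters at 7.0 or later share one connection among
    # clients at 6.2 or later; older clusters allow at most two.
    shared_cap = 1 if cluster_version >= (7, 0) else 2
    # Each client older than 6.2 still opens its own connection.
    return min(len(modern), shared_cap) + len(legacy)
```

For example, with clients 7.0, 6.3, 6.2 and 6.1 loaded, this reading gives at most two connections to a 7.0 cluster and at most three to a 6.3 cluster.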
Reliability
Backup agents now pause themselves upon a successful snapshot recovery to avoid unintentional data corruption. Operators should manually abort backup agents and clear the backup agent keyspace to avoid using the old cluster’s backup configuration. (PR #4027)
Log class processes are prioritized above transaction class processes for becoming tlogs. (PR #4509)
Improved worker recruitment logic to avoid unnecessary recoveries when processes are added or removed from a cluster. (PR #4695) (PR #4631) (PR #4509)
Fixes
List files asynchronously so many backup files on a slow disk won’t cause the backup agent to lose its lease. (PR #3094)
Unknown endpoint has been tracked incorrectly and therefore showed up too frequently in our statistics. (PR #4473)
Using the exclude failed command could leave the data distributor in a state where it cannot complete relocations. (PR #4495)
Fixed a rare crash on the cluster controller when using multi-region configurations. (PR #4547)
Fixed a memory corruption bug in the data distributor. (PR #4535)
Fixed a rare crash that could happen on the sequencer during recovery. (PR #4548)
Added a new pre-backup action when creating a backup. Backups can now either verify that the range where data is being saved is empty before the backup begins (current behavior) or clear that range. Fixes a restore_destination_not_empty failure after a backup retry due to commit_unknown_failure. (PR #4595)
When configured with usable_regions=2, a cluster would not fail over to a region which contained only storage class processes. (PR #4599)
If a restore is done using a prefix to remove and specific key ranges to restore, the key range boundaries must begin with the prefix to remove. (PR #4684)
The multi-version client API would not propagate errors that occurred when creating databases on external clients. This could result in invalid memory accesses. (PR #5220)
Fixed a race between the multi-version client connecting to a cluster and destroying the database that could cause an assertion failure. (PR #5220)
A client might not honor transaction timeouts when using the multi-version client if it cannot connect to the cluster. (Issue #5595)
Status
Added cluster.qos.throttled_tags and cluster.processes.*.roles.busiest_[read|write]_tag to report statistics on throttled tags and the busiest read or write transaction tags on each storage server. (PR #3669) (PR #3696)
Added seconds_since_last_recovered to the cluster.recovery_state section to report how long it has been since the cluster recovered to the point where it is able to accept requests. (PR #3759)
Added limiting metrics (limiting_storage_durability_lag and limiting_storage_queue) to health metrics. (PR #4067)
min_replicas_remaining is now populated for all regions, thus giving a clear picture of the data replicas that exist in the database. (PR #4515)
Added detailed metrics for batched transactions. (PR #4540)
Added commit_batching_window_size to the proxy roles section of status to record statistics about commit batching window size on each proxy. (PR #4735)
Added cluster.bounce_impact section to status to report if there will be any extra effects when bouncing the cluster, and if so, the reason for those effects. (PR #4770)
Added fetched_versions to the storage metrics section of status to report how fast a storage server is catching up in versions. (PR #4770)
Added fetches_from_logs to the storage metrics section of status to report how frequently a storage server fetches updates from transaction logs. (PR #4770)
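A monitoring script might pull the new status fields out of an already-parsed status JSON document roughly as follows. This is a sketch under stated assumptions: the helper names are hypothetical, the document is assumed to be a plain nested dict, and cluster.processes is assumed to map process IDs to objects whose roles entry is a list of role objects. Only the field paths named in the notes above are taken from the source; fields missing from the document are returned as None.

```python
def get_path(doc, dotted, default=None):
    """Follow a dotted path such as 'cluster.qos.throttled_tags' through dicts."""
    node = doc
    for part in dotted.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node


def summarize_status(status):
    """Collect the status fields introduced in this release, where present."""
    busiest = {}
    processes = get_path(status, "cluster.processes", {}) or {}
    for pid, proc in processes.items():
        for role in proc.get("roles", []):
            for key in ("busiest_read_tag", "busiest_write_tag"):
                if key in role:
                    busiest.setdefault(pid, {})[key] = role[key]
    return {
        "seconds_since_last_recovered": get_path(
            status, "cluster.recovery_state.seconds_since_last_recovered"
        ),
        "throttled_tags": get_path(status, "cluster.qos.throttled_tags"),
        "bounce_impact": get_path(status, "cluster.bounce_impact"),
        "busiest_tags": busiest,
    }
```

Returning None for absent fields keeps the same script usable against clusters that predate these additions.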
Bindings
Python: The function get_estimated_range_size_bytes will now throw an error if the begin_key or end_key is None. (PR #3394)
C: Added a function, fdb_database_reboot_worker, to reboot or suspend the specified process. (PR #4094)
C: Added a function, fdb_database_force_recovery_with_data_loss, to force the database to recover into the given datacenter. (PR #4220)
C: Added a function, fdb_database_create_snapshot, to create a snapshot of the database. (PR #4241)
C: Added fdb_database_get_main_thread_busyness function to report how busy a client’s main thread is. (PR #4504)
Java: Added Database.getMainThreadBusyness function to report how busy a client’s main thread is. (PR #4564)
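Callers of the Python get_estimated_range_size_bytes function that previously passed None for a boundary may want to validate keys up front. The wrapper below is a hedged sketch, not part of the binding: estimated_range_size and FakeTransaction are hypothetical names, FakeTransaction is a stand-in so the example runs without a cluster, and ValueError is an assumed exception type, since the note above does not say which error the binding itself raises.

```python
def estimated_range_size(tr, begin_key, end_key):
    """Reject None boundaries before delegating to the transaction object."""
    if begin_key is None or end_key is None:
        raise ValueError("begin_key and end_key must not be None")
    return tr.get_estimated_range_size_bytes(begin_key, end_key)


class FakeTransaction:
    """Hypothetical stand-in for a real transaction; returns a fixed size."""

    def get_estimated_range_size_bytes(self, begin_key, end_key):
        return 1024
```

With a real transaction object, the wrapper would return whatever the binding's method returns for the given range.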
Other Changes
Added rte_memcpy from DPDK for default usage. (PR #3089)
When fdbmonitor dies, all of its child processes are now killed. (PR #3841)
The foundationdb service installed by the RPM packages will now automatically restart fdbmonitor after 60 seconds when it fails. (PR #3841)
Capture output of forked snapshot processes in trace events. (PR #4254)
Add ErrorKind field to Severity 40 trace events. (PR #4741)
Added histograms for the storage server write path components. (PR #5021)
Committing a transaction will no longer partially reset it as of API version 700. (PR #5271)