Fixed an issue where storage servers could shut down with unknown_error. (PR #4437)
Fix backup agent stall when writing to local filesystem with slow metadata operations. (PR #4428)
Backup agent no longer uses 4k block caching layer on local output files so that write operations are larger. (PR #4428)
Fix an accounting error that could cause commits to incorrectly fail with proxy_memory_limit_exceeded. (PR #4529)
Fix an issue where symbolic links in cmake-built RPMs are broken if you unpack the RPM to a custom directory. (PR #4380)
A storage server which has fallen behind will deprioritize reads in order to catch up. This change causes some saturating workloads to experience high read latencies instead of high GRV latencies. (PR #4218)
Added a field to the processes.roles section of status to record the number of deprioritized reads on each storage server. (PR #4218)
Added a field to the workload.operations section of status to record the total number of deprioritized reads. (PR #4218)
Backup to locally mounted filesystems now appends to files in large block writes, 1MB each by default. (PR #4199)
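The idea behind large-block appends can be sketched as follows. This is an illustrative Python sketch, not FoundationDB's actual backup code: small appends accumulate in memory and are flushed to the local file in large blocks (1 MiB by default here, matching the default mentioned above), so the filesystem sees a few large writes instead of many small ones.

```python
# Hypothetical sketch: buffer small appends and flush in large blocks.
# Class and parameter names are illustrative, not FoundationDB's API.
class BlockAppender:
    def __init__(self, path, block_size=1 << 20):  # 1 MiB default
        self._file = open(path, "ab")
        self._buf = bytearray()
        self._block_size = block_size

    def append(self, data: bytes):
        self._buf += data
        # Flush only once a full block has accumulated.
        while len(self._buf) >= self._block_size:
            self._file.write(self._buf[: self._block_size])
            del self._buf[: self._block_size]

    def close(self):
        if self._buf:
            self._file.write(self._buf)  # flush the final partial block
        self._file.close()
```

Larger sequential writes amortize per-operation filesystem overhead, which is why this helps on filesystems with slow metadata operations.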
Changed the default SSL implementation from OpenSSL to BoringSSL (PR #4153)
SQLite now supports configurable disk write rate limiting. (PR #4259)
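Disk write rate limiting is typically implemented with a token bucket. The sketch below is an assumption about the general mechanism, not FoundationDB's SQLite integration or its knob names: each write consumes tokens that refill at a configured bytes-per-second rate, and a write that overdraws the bucket must wait.

```python
import time

# Illustrative token-bucket write rate limiter (names and mechanism are
# assumptions for explanation, not FoundationDB's implementation).
class WriteRateLimiter:
    def __init__(self, bytes_per_second):
        self.rate = bytes_per_second
        self.tokens = float(bytes_per_second)  # allow one second of burst
        self.last = time.monotonic()

    def throttle(self, nbytes):
        """Return the delay (seconds) to wait before issuing this write."""
        now = time.monotonic()
        # Refill tokens for the elapsed time, capped at one second's worth.
        self.tokens = min(self.rate, self.tokens + (now - self.last) * self.rate)
        self.last = now
        self.tokens -= nbytes
        if self.tokens >= 0:
            return 0.0
        return -self.tokens / self.rate  # time needed to refill the deficit
```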
If a disk operation takes more than two minutes, the system will treat the disk as failed. (PR #4243)
Fix invalid memory access on data distributor when snapshotting large clusters. (PR #4076)
Add human-readable DateTime to trace events (PR #4087)
Proxy rejects transaction batch that exceeds MVCC window (PR #4113)
Add a command in fdbcli to manually trigger the detailed teams information loggings in data distribution. (PR #4060)
Add documentation on the read and write paths. (PR #4099)
Add a histogram to expose commit batching window on Proxies. (PR #4166)
Fix double counting of range reads in TransactionMetrics. (PR #4130)
Add a trace event that can be used as an indicator of the load on the proxy. (PR #4166)
Fixed undefined behavior in configuring supported FoundationDB versions while starting up a client. (PR #3849)
Updated OpenSSL to version 1.1.1h. (PR #3809)
Attempt to detect when calling fdb_future_block_until_ready() would cause a deadlock, and throw blocked_from_network_thread if it would definitely cause a deadlock. (PR #3786)
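The shape of this deadlock check can be illustrated with plain Python threading (the names below are illustrative, not the fdb_c implementation): the library remembers which thread runs the network loop, and blocking on a future from that same thread would deadlock, because the network thread is the one that would fulfil the future.

```python
import threading

# Illustrative sketch of the deadlock check, not the actual fdb_c code.
_network_thread_id = None

def mark_network_thread():
    """Record the current thread as the (hypothetical) network thread."""
    global _network_thread_id
    _network_thread_id = threading.get_ident()

class BlockedFromNetworkThread(Exception):
    """Stand-in for the blocked_from_network_thread error."""

def block_until_ready(event: threading.Event):
    # Refuse to block if we are the thread responsible for completing futures.
    if threading.get_ident() == _network_thread_id:
        raise BlockedFromNetworkThread("would deadlock: called from network thread")
    event.wait()
```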
Mitigate an issue where a non-lockaware transaction that changes certain \xff "metadata" keys, committed concurrently with locking the database, can cause corruption. If a non-lockaware transaction manually sets its read version to a version where the database is locked and changes metadata keys, it can still cause corruption. (PR #3674)
Reset network connections between the proxies and satellite tlogs if the latencies are larger than 500ms. (PR #3686)
Added a command to fdbcli which kills a process and prevents it from rejoining the cluster for a specified duration. (PR #3550)
When configured with usable_regions=2, data distribution could temporarily lower the replication of a shard when moving it. (PR #3487)
Prevent data distribution from running out of memory by fetching the source servers for too many shards in parallel. (PR #3487)
Reset network connections between log routers and satellite tlogs if the latencies are larger than 500ms. (PR #3487)
Added per-process server request latency statistics reported in the roles section of relevant processes. These are named commit_latency_statistics on proxy roles and read_latency_statistics on storage roles. (PR #3480)
Added cluster.active_primary_dc, which indicates which datacenter is serving as the primary datacenter in multi-region setups. (PR #3320)
HTTPS requests made by backup could hang indefinitely. (PR #3027)
fdbrestore prefix options required exactly one hyphen instead of the standard two. (PR #3056)
Commits could stall on a newly elected proxy because of inaccurate compute estimates. (PR #3123)
A transaction class process with a bad disk could be repeatedly recruited as a transaction log. (PR #3268)
Fix a potential race condition that could lead to undefined behavior when connecting to a database using the multi-version client API. (PR #3265)
Added a command to fdbcli which returns the current read version of the cluster. (PR #2882)
Added a command to fdbcli which increases the current version of a cluster. (PR #2965)
Added commands to fdbcli which lock or unlock a cluster. (PR #2890)
Protect the proxies from running out of memory when bombarded with requests from clients. (PR #2812).
A process with a proxy class would not become the first proxy when placed with other stateless class processes. (PR #2819).
If a transaction log stalled on a disk operation during recruitment the cluster would become unavailable until the process died. (PR #2815).
Avoid recruiting satellite transaction logs when usable_regions=1. (PR #2813).
Prevent the cluster from having too many active generations as a safety measure against repeated failures. (PR #2814).
fdbcli status JSON could become truncated because of unprintable characters. (PR #2807).
The data distributor used too much CPU in large clusters (broken in 6.2.16). (PR #2806).
Added cluster.workload.operations.memory_errors to measure the number of requests rejected by the proxies because the memory limit has been exceeded. (PR #2812).
Added cluster.workload.operations.location_requests to measure the number of outgoing key server location responses from the proxies. (PR #2812).
Added cluster.recovery_state.active_generations to track the number of generations for which the cluster still requires transaction logs. (PR #2814).
Added a field to the processes section to record the number of TLS policy failures each process has observed. (PR #2811).
Added --debug-tls as a command line argument to fdbcli to help diagnose TLS issues. (PR #2810).
When configuring a cluster to usable_regions=2, data distribution would not react to machine failures while copying data to the remote region. (PR #2774).
When a cluster is configured with usable_regions=2, data distribution could push a cluster into saturation by relocating too many shards simultaneously. (PR #2776).
Do not allow the cluster controller to mark any process as failed within 30 seconds of startup. (PR #2780).
Backup could not establish TLS connections (broken in 6.2.16). (PR #2775).
Certificates were not refreshed automatically (broken in 6.2.16). (PR #2781).
Improved the efficiency of establishing large numbers of network connections. (PR #2777).
Add support for setting knobs to modify the behavior of fdbcli. (PR #2773).
Setting invalid knobs in backup and DR binaries is now a warning instead of an error and will not result in the application being terminated. (PR #2773).
Restored the ability to set TLS configuration using environment variables (broken in 6.2.16). (PR #2755).
Reduced tail commit latencies by improving commit pipelining on the proxies. (PR #2589).
Data distribution does a better job balancing data when disks are more than 70% full. (PR #2722).
Reverse range reads could read too much data from disk, resulting in poor performance relative to forward range reads. (PR #2650).
Switched from LibreSSL to OpenSSL to improve the speed of establishing connections. (PR #2646).
The cluster controller does a better job avoiding multiple recoveries when first recruited. (PR #2698).
Storage servers could fail to advance their version correctly in response to empty commits. (PR #2617).
Status could not label more than 5 processes as proxies. (PR #2653).
BUGGIFY_ALL_COORDINATION knobs could not be set at runtime. (PR #2661).
Backup container filename parsing unnecessarily consulted the local filesystem, which could error when permission is denied. (PR #2693).
Rebalancing data movement could stop doing work even though the data in the cluster was not well balanced. (PR #2703).
Data movement uses available space rather than free space when deciding how full a process is. (PR #2708).
Fetching status attempts to reuse its connection with the cluster controller. (PR #2583).
TLS throttling could block legitimate connections. (PR #2575).
Throttle TLS connect attempts from misconfigured clients. (PR #2529).
Reduced master recovery times in large clusters. (PR #2430).
Improved performance while a remote region is catching up. (PR #2527).
The data distribution algorithm does a better job preventing hot shards while recovering from machine failures. (PR #2526).
Improved the reliability of an fdbcli command. (PR #2512).
The --traceclock parameter to fdbserver incorrectly had no effect. (PR #2420).
Clients could throw an internal error during commit if client buggification was enabled. (PR #2427).
Backup and DR agent transactions which update and clean up status had an unnecessarily high conflict rate. (PR #2483).
The slow task profiler used an unsafe call to get a timestamp in its signal handler that could lead to rare crashes. (PR #2515).
Clients could hang indefinitely on reads if all storage servers holding a keyrange were removed from a cluster since the last time the client read a key in the range. (PR #2377).
In rare scenarios, status could falsely report no replicas remain of some data. (PR #2380).
Latency band tracking could fail to configure correctly after a recovery or upon process startup. (PR #2371).
backup_agent crashed on startup. (PR #2356).
Small clusters using specific sets of process classes could cause the data distributor to be continuously killed and re-recruited. (PR #2344).
The data distributor and ratekeeper could be recruited on non-optimal processes. (PR #2344).
fdbcli could take a long time before being executed by a busy process. (PR #2339).
Committing transactions larger than 1 MB could cause the proxy to stall for up to a second. (PR #2350).
Transaction timeouts would use memory for the entire duration of the timeout, regardless of whether the transaction had been destroyed. (PR #2353).
A new transaction log spilling implementation is now the default. Write bandwidth and latency will no longer degrade during storage server or remote region failures. (PR #1731).
Storage servers will locally throttle incoming read traffic when they are falling behind. (PR #1447).
Use CRC32 checksum for SQLite pages. (PR #1582).
Added a 96-byte fast allocator, so storage queue nodes use less memory. (PR #1336).
Improved network performance when sending large packets. (PR #1684).
Spilled data can be consumed from transaction logs more quickly and with less overhead. (PR #1584).
Clients no longer talk to the cluster controller for failure monitoring information. (PR #1640).
Reduced the number of connection monitoring messages between clients and servers. (PR #1768).
Close connections which have been idle for a long period of time. (PR #1768).
Each client connects to exactly one coordinator, and at most five proxies. (PR #1909).
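The connection-reduction idea can be sketched as a simple selection step. The selection logic below is an assumption for illustration (FoundationDB's actual choice may weigh latency or load): the client picks one coordinator and a subset of at most five proxies rather than connecting to all of them.

```python
import random

# Hypothetical sketch of per-client connection selection; the real
# selection policy in FoundationDB may differ.
def pick_connections(coordinators, proxies, max_proxies=5, seed=None):
    rng = random.Random(seed)
    coordinator = rng.choice(coordinators)          # exactly one coordinator
    chosen = rng.sample(proxies, min(max_proxies, len(proxies)))  # <= 5 proxies
    return coordinator, chosen
```

With many clients, bounding per-client connections keeps the total connection count linear in the number of clients rather than clients times servers.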
Ratekeeper will throttle traffic when too many storage servers are not making versions durable fast enough. (PR #1784).
Storage servers recovering a memory storage engine will abort recovery if the cluster is already healthy. (PR #1713).
Improved how the data distribution algorithm balances data across teams of storage servers. (PR #1785).
Lowered the priority for data distribution team removal, to avoid prioritizing team removal work over splitting shards. (PR #1853).
Made the storage cache eviction policy configurable, and added an LRU policy. (PR #1506).
Improved the speed of recoveries on large clusters at log_version >= 4. (PR #1729).
Log routers will prefer to peek from satellites at log_version >= 4. (PR #1795).
In clusters using a region configuration, clients will read from the remote region if all of the servers in the primary region are overloaded. [6.2.3] (PR #2019).
Significantly improved the rate at which the transaction logs in a remote region can pull data from the primary region. [6.2.4] (PR #2101).
Raised the data distribution priority of splitting shards because delaying splits can cause hot write shards. [6.2.6] (PR #2234).
During an upgrade, the multi-version client now persists database default options and transaction options that aren’t reset on retry (e.g. transaction timeout). In order for these options to function correctly during an upgrade, a 6.2 or later client should be used as the primary client. (PR #1767).
If a cluster is upgraded during an onError call, the cluster could return a cluster_version_changed error. (PR #1734).
Data distribution will now pick a random destination when merging shards in the \xff keyspace. This avoids an issue with backup where the write-heavy mutation log shards could concentrate on a single process that has less data than everybody else. (PR #1916).
The -i argument for an fdbserver process now sets locality_machineid in addition to locality_zoneid. (PR #1928).
File descriptors opened by clients and servers set close-on-exec, if available on the platform. (PR #1581).
fdbrestore commands other than start required a default cluster file to be found but did not actually use it. (PR #1912).
Unneeded network connections were not being closed because peer reference counts were handled improperly. (PR #1768).
In very rare scenarios, master recovery would restart because system metadata was loaded incorrectly. (PR #1919).
Ratekeeper will aggressively throttle when unable to fetch the list of storage servers for a considerable period of time. (PR #1858).
Proxies could become overloaded when all storage servers on a team fail. [6.2.1] (PR #1976).
Proxies could start too few transactions if they didn’t receive get read version requests frequently enough. [6.2.3] (PR #1999).
fdbcli could fail with an unknown error if the file did not contain a valid JSON object. (PR #2017).
Configuring regions would fail with an internal error if the cluster contained storage servers that didn’t set a datacenter ID. (PR #2017).
Clients no longer prefer reading from servers with the same zone ID, because it could create hot shards. [6.2.3] (PR #2019).
Data distribution could fail to start if any storage servers had misconfigured locality information. This problem could persist even after the offending storage servers were removed or fixed. [6.2.5] (PR #2110).
Data distribution was running at too high of a priority, which sometimes caused other roles on the same process to stall. [6.2.5] (PR #2170).
Loading a 6.1 or newer fdb_c library as a secondary client using the multi-version client could lead to an infinite recursion when run with API versions older than 610. [6.2.5] (PR #2169)
Using C API functions that were removed in 6.1 when using API version 610 or above now results in a compilation error. [6.2.5] (PR #2169)
Coordinator changes could fail to complete if the database wasn’t allowing any transactions to start. [6.2.6] (PR #2191)
Status would report incorrect fault tolerance metrics when a remote region was configured and the primary region lost a storage replica. [6.2.6] (PR #2230)
The cluster would not change to a new set of satellite transaction logs when they become available in a better satellite location. [6.2.6] (PR #2241).
The existence of resolver class processes prevented stateless class processes from being recruited as proxies or resolvers. [6.2.6] (PR #2241).
The cluster controller could become saturated in clusters with large numbers of connected clients using TLS. [6.2.6] (PR #2252).
Backup and DR would not share a mutation stream if they were started on different versions of FoundationDB. Either backup or DR must be restarted to resolve this issue. [6.2.6] (PR #2202).
Don’t track batch priority GRV requests in latency bands. [6.2.7] (PR #2279).
Transaction log processes used twice their normal memory when switching spill types. [6.2.7] (PR #2256).
When dropping a remote region from the configuration after processes in the region have failed, data distribution would create teams from the dead servers for one minute. [6.2.7] (PR #2286).
Added a field to the processes section to record the fraction of time the run loop is busy. (PR #1760).
Added the cluster.page_cache section to status. In this section, added two new statistics (including log_hit_rate) that indicate the fraction of recent page reads that were served by cache. (PR #1823).
Added transaction start counts by priority to cluster.workload.transactions. The new counters include started_batch_priority. (PR #1836).
Replaced cluster.datacenter_version_difference with cluster.datacenter_lag, which has subfields including seconds. (PR #1800).
Added a field to the roles section to record the throttling rate of the local ratekeeper. (PR #1712).
fdbcli status now reports the configured zone count. The fault tolerance is now reported in terms of the number of zones unless machine IDs are being used as zone IDs. (PR #1924).
connected_clients is now only a sample of the connected clients, rather than a complete list. (PR #1902).
Added the supported_versions section, which provides a sample of connected clients which cannot connect to any higher protocol version. (PR #1902).
Clients which connect without specifying their supported versions are tracked as an Unknown version in the supported_versions section. [6.2.2] (PR #1990).
Added coordinator to the list of roles that can be reported for a process. [6.2.3] (PR #2006).
cluster.qossection, each with subfields
seconds. These report the durability lag values being used by ratekeeper to potentially limit the transaction rate. [6.2.3] (PR #2003).
cluster.qossection, each with subfields
seconds. These are meant to replace
limiting_version_lag_storage_server, which are now deprecated. [6.2.3] (PR #2003).
Added a field to the cluster.data section to record the size of the system keyspace. [6.2.5] (PR #2170).
API version updated to 620. See the API version upgrade guide for upgrade details.
Add a transaction size limit as both a database option and a transaction option. (PR #1725).
Added a new API to get the approximate transaction size before commit, e.g., fdb_transaction_get_approximate_size in the C binding. (PR #1756).
fdb_future_get_version has been renamed to fdb_future_get_int64. (PR #1756).
C: Applications linking to libfdb_c can now use find_package(FoundationDB-Client ...) (for cmake) to get the proper flags for compiling and linking. (PR #1636).
Go: The Go bindings now require Go version 1.11 or later.
Go: Finalizers could run too early leading to undefined behavior. (PR #1451).
Added a transaction option to control the field length of keys and values in debug transaction logging in order to avoid truncation. (PR #1844).
Added a transaction option to control whether get_addresses_for_key includes a port in the address. This will be deprecated in API version 630, and addresses will include ports by default. [6.2.4] (PR #2060).
Versionstamp comparisons didn't work in Python 3. [6.2.4] (PR #2089).
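The root cause is a Python 3 language change: the `__cmp__` fallback was removed, so a class must define rich comparison methods to be orderable. A minimal sketch of the fix, using an illustrative class rather than the real fdb.tuple.Versionstamp, looks like this:

```python
import functools

# Illustrative Versionstamp-like class (not the real fdb.tuple.Versionstamp):
# under Python 3, ordering requires rich comparison methods such as
# __eq__ and __lt__; total_ordering derives the rest.
@functools.total_ordering
class Versionstamp:
    def __init__(self, tr_version: bytes, user_version: int = 0):
        self.tr_version = tr_version
        self.user_version = user_version

    def _key(self):
        # Byte-wise transaction version first, then the user version.
        return (self.tr_version, self.user_version)

    def __eq__(self, other):
        return isinstance(other, Versionstamp) and self._key() == other._key()

    def __lt__(self, other):
        return self._key() < other._key()
```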
Added the primitives for FDB backups based on disk snapshots. This provides the ability to take a cluster-level backup based on disk-level snapshots of the storage servers, tlogs, and coordinators. (PR #1733).
FoundationDB now uses the FlatBuffers serialization format for all network messages. (PR #1090).
Clients will throw transaction_too_old when attempting to read if setVersion was called with a version smaller than the smallest read version obtained from the cluster. This is a protection against reading from the wrong cluster in multi-cluster scenarios. (PR #1413).
Trace files are now ordered lexicographically. This means that the filename format for trace files has changed. (PR #1828).
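Why lexicographic ordering matters can be shown with a toy filename scheme (the format below is illustrative, not the actual trace filename format): zero-padding the rolled-file sequence number makes string sort order match chronological order, so plain `sorted()` on filenames yields the files in time order.

```python
# Hypothetical trace filename scheme for illustration only.
def trace_filename(process, seq, width=3):
    # Zero-padded sequence numbers sort lexicographically in numeric order:
    # "001" < "002" < "010", whereas unpadded "10" would sort before "2".
    return f"trace.{process}.{seq:0{width}d}.xml"

files = [trace_filename("127.0.0.1", n) for n in (2, 10, 1)]
assert sorted(files) == [trace_filename("127.0.0.1", n) for n in (1, 2, 10)]
```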
Improved TransactionMetrics log events by adding a random UID to distinguish multiple open connections, a flag to identify internal vs. client connections, and logging of rates and roughness in addition to total count for several metrics. (PR #1808).
FoundationDB can now be built with clang and libc++ on Linux. (PR #1666).
Added experimental framework to run C and Java clients in simulator. (PR #1678).
Added new network options for client buggify which will randomly throw expected exceptions in the client. This is intended to be used for client testing. (PR #1417).
Added a parameter for fdbserver processes to control the amount of memory dedicated to caching pages read from disk. (PR #1889).
Added MakoWorkload, used as a benchmark for performance testing of FDB. (PR #1586).
fdbserver now accepts a comma-separated list of public and listen addresses. (PR #1721).
CAUSAL_READ_RISKY has been enhanced to further reduce the chance of causally inconsistent reads. Existing users of CAUSAL_READ_RISKY may see increased GRV latency if proxies are distantly located from logs. (PR #1841).
CAUSAL_READ_RISKY can be turned on for all transactions using a database option. (PR #1841).
Added a no_wait option to the fdbcli exclude command to avoid blocking. (PR #1852).
Idle clusters will fsync much less frequently. (PR #1697).
CMake is now the official build system. The Makefile based build system is deprecated.
The incompatible client list in status (cluster.incompatible_connections) may now spuriously include clients that use the multi-version API to try connecting to the cluster at multiple versions.
Fixes only impacting 6.2.0+
Clients could crash when closing connections with incompatible servers. [6.2.1] (PR #1976).
Do not close idle network connections with incompatible servers. [6.2.1] (PR #1976).
max_protocol_clients were incorrectly added to the connected_clients list. [6.2.2] (PR #1990).
Ratekeeper ignores the (default 5 second) MVCC window when controlling on durability lag. [6.2.3] (PR #2012).
The macOS client was not compatible with a Linux server. [6.2.3] (PR #2045).
Incompatible clients would continually reconnect with coordinators. [6.2.3] (PR #2048).
Connections were being closed as idle when there were still unreliable requests waiting for a response. [6.2.3] (PR #2048).
The cluster controller would saturate its CPU for a few seconds when sending configuration information to all of the worker processes. [6.2.4] (PR #2086).
The data distributor would build all possible team combinations if it was tracking an unhealthy server with less than 10 teams. [6.2.4] (PR #2099).
The cluster controller could crash if a coordinator was unreachable when compiling cluster status. [6.2.4] (PR #2065).
A storage server could crash if it took longer than 10 minutes to fetch a key range from another server. [6.2.5] (PR #2170).
Excluding or including servers would restart the data distributor. [6.2.5] (PR #2170).
The data distributor could read invalid memory when estimating database size. [6.2.6] (PR #2225).
Status could incorrectly report that backup and DR were not sharing a mutation stream. [6.2.7] (PR #2274).