Python Java


Create an multimap data structure with multiset values.


Support efficient operations on multimaps, including random addition, removal, and retrieval of indexed values.


Multimaps are a generalization of dictionaries (a.k.a. maps or associative arrays) in which each index can have multiple values. Multimaps can be further generalized by allowing a value to be present more than once for a given index, so that each index is associated with a multiset. These structures have a simple and efficient representation as FoundationDB’s key-value pairs.


We store all values of a given index using adjacent key-value pairs. This allows all values of an index to be retrieved with a single range read.


We store all values in the multimap within a subspace, which takes care of packing our keys into byte strings.

multi = fdb.Subspace(('M',))

Because we need to store multiple values per index, we’ll store them within keys, with each (index, value) pair in its own key. To implement the multiset we’ll record the number of occurrences of each value. This is done by storing a positive integer with the key using an atomic addition. Each addition of a given value for an index will increment the count by 1:

tr.add(multi[index][value], struct.pack('<q', 1))

By using a read-free atomic addition, FoundationDB guarantees that the addition operation will not conflict. As a result, values can be frequently added by multiple clients.

Subtracting values, on the other hand, requires a read to ensure that the value count does not fall below 0. (Hence, unlike additions, subtractions will be subject to conflicts.) We’ll just delete the key if a subtraction reduces the count to 0 in order to keep the representation sparse.


Negative value counts

We can generalize the representation further by allowing the count to be an arbitrary integer rather than restricting it to a positive integer. This extension may be useful for applications that record a deficit or debt of some resource. In this case, the code becomes even simpler and more efficient: we can simply remove the read and test from subtraction, making it conflict-free like the addition operation.


Here’s a simple implementation of multimaps with multisets as described:

import struct

multi = fdb.Subspace(('M',))

# Multimaps with multiset values
def multi_add(tr, index, value):
    tr.add(multi[index][value], struct.pack('<q', 1))

def multi_subtract(tr, index, value):
    v = tr[multi[index][value]]
    if v.present() and struct.unpack('<q', str(v))[0] > 1:
        tr.add(multi[index][value], struct.pack('<q', -1))
        del tr[multi[index][value]]

def multi_get(tr, index):
    return [multi.unpack(k)[1] for k, v in tr[multi[index].range()]]

def multi_get_counts(tr, index):
    return {multi.unpack(k)[1]:struct.unpack('<q', v)[0]
            for k, v in tr[multi[index].range()]}

def multi_is_element(tr, index, value):
    return tr[multi[index][value]].present()