Training statistics

class pfl.stats.TensorLike(*args, **kwargs)

A protocol for tensor-like objects, e.g. NumPy, TensorFlow and Torch tensors.

class pfl.stats.TrainingStatistics

Base class for statistics that can be used for training a model. Statistics from different parts of the data can be combined with “+”.

The values can have different types depending on the subclass.

Statistics can be converted to vector space, summed, and then converted back, which should be equivalent to summing the actual statistics.

Example:
metadata1, private_vectors1 = stats1.get_weights()
metadata2, private_vectors2 = stats2.get_weights()
metadata_sum = metadata1 + metadata2
private_vectors_sum = [
    v1 + v2 for v1, v2 in zip(private_vectors1,
                              private_vectors2)
]
stats_sum = stats1.from_weights(metadata_sum,
                                private_vectors_sum)
assert stats_sum == (stats1 + stats2)

abstract property num_parameters: int

Get the total number of parameters for all statistics in this object.

abstract get_weights()

Get a vector representation of this statistic, with dtype=float32, usually for the purpose of sending it over the wire. The returned representation may be split into smaller vectors/matrices so that it can be manipulated more easily.

Summing vectors returned by get_weights from two different statistics objects and thereafter converting back using the method from_weights must be equivalent to summing the two original objects.

Return type:

Tuple[TypeVar(Tensor, bound= TensorLike), List[TypeVar(Tensor, bound= TensorLike)]]

Returns:

A tuple (metadata, weights). metadata is a vector of data which is not privacy-sensitive, e.g. the weight. weights is a list of matrices with data that is considered privacy-sensitive and is additive with noise. The matrices may be weights for different layers in the context of neural networks.

abstract from_weights(metadata, weights)

Create a new statistics object of this class from a vector representation. The input to this method has the same format as the output of get_weights.

Note that this is a method on an object of this class, since it is possible that runtime attributes that do not change with addition are not serialized.

Parameters:
  • metadata (TypeVar(Tensor, bound= TensorLike)) – A vector of data which is not privacy-sensitive. The contents depend on the implementation. Can include e.g. the weight.

  • weights (List[TypeVar(Tensor, bound= TensorLike)]) – A list of matrices with data that is considered privacy-sensitive and is additive with noise. The matrices may be weights for different layers in the context of neural networks.

Return type:

TypeVar(StatisticsType, bound= TrainingStatistics)
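The round-trip contract between get_weights and from_weights can be illustrated with a toy class. This is only a minimal sketch under stated assumptions, not pfl's implementation: ToyStatistics is a hypothetical stand-in, plain Python lists replace tensors, and the metadata is assumed to hold nothing but the weight.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ToyStatistics:
    """Hypothetical stand-in for a statistics class; not part of pfl."""
    weight: float
    values: List[List[float]]  # one inner list per "layer"

    def __add__(self, other: "ToyStatistics") -> "ToyStatistics":
        summed = [[a + b for a, b in zip(v1, v2)]
                  for v1, v2 in zip(self.values, other.values)]
        return ToyStatistics(self.weight + other.weight, summed)

    def get_weights(self) -> Tuple[List[float], List[List[float]]]:
        # metadata: non-sensitive data (assumed here to be only the weight);
        # second element: the privacy-sensitive vectors.
        return [self.weight], [list(v) for v in self.values]

    def from_weights(self, metadata, weights) -> "ToyStatistics":
        return ToyStatistics(metadata[0], [list(v) for v in weights])


stats1 = ToyStatistics(1.0, [[1.0, 2.0], [3.0]])
stats2 = ToyStatistics(2.0, [[10.0, 20.0], [30.0]])

metadata1, vectors1 = stats1.get_weights()
metadata2, vectors2 = stats2.get_weights()
metadata_sum = [m1 + m2 for m1, m2 in zip(metadata1, metadata2)]
vectors_sum = [[a + b for a, b in zip(v1, v2)]
               for v1, v2 in zip(vectors1, vectors2)]

# Summing in vector space and converting back matches summing the objects.
roundtrip = stats1.from_weights(metadata_sum, vectors_sum)
assert roundtrip == stats1 + stats2
```

Because from_weights is called on an existing instance, a real subclass could copy over runtime attributes that are not part of the serialized vectors; the toy class has none.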

apply(fn, *args, **kwargs)

Apply a function to the weights from get_weights, and put the result into a new statistics object with the same metadata.

Example:
stats_p1 = stats.apply(lambda weights: [w + 1 for w in weights])

Parameters:
  • fn (Callable[..., Iterable[ndarray]]) – A callable (weights, *args, **kwargs) -> weights, where weights is a list of tensors and args and kwargs are any additional arguments.

  • args – Additional arguments to fn when applying it.

  • kwargs – Additional keyword arguments to fn when applying it.

Return type:

TypeVar(StatisticsType, bound= TrainingStatistics)

Returns:

A statistics object with its data transformed by fn.
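Since fn receives the whole list of weights at once, apply suits transformations that need information across all tensors. The following is a minimal sketch of such a function, using plain lists in place of tensors; clip_by_global_norm is a hypothetical helper written for this illustration, not a pfl function.

```python
import math
from typing import List


def clip_by_global_norm(weights: List[List[float]],
                        max_norm: float) -> List[List[float]]:
    """Scale all vectors by one common factor so that their joint
    L2 norm is at most max_norm."""
    global_norm = math.sqrt(sum(x * x for w in weights for x in w))
    scale = min(1.0, max_norm / global_norm) if global_norm > 0 else 1.0
    return [[x * scale for x in w] for w in weights]


weights = [[3.0], [4.0]]  # joint L2 norm is 5.0
clipped = clip_by_global_norm(weights, max_norm=1.0)

# With a statistics object, the extra keyword argument would be forwarded
# to fn (assumed usage, following the signature documented above):
# clipped_stats = stats.apply(clip_by_global_norm, max_norm=1.0)
```

A per-tensor function could not compute the joint norm, which is why this kind of transformation is passed to apply rather than applied to each weight separately.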

apply_elementwise(fn, *args, **kwargs)

Apply a function to each weight from get_weights individually, and put the result into a new statistics object with the same metadata.

Example:
# Equivalent to the example in `apply`.
stats_p1 = stats.apply_elementwise(lambda w: w + 1)

Parameters:
  • fn (Callable[..., ndarray]) – A callable (weight, *args, **kwargs) -> weight, where weight is a tensor from the statistic’s weights and args and kwargs are any additional arguments.

  • args – Additional arguments to fn when applying it.

  • kwargs – Additional keyword arguments to fn when applying it.

Return type:

TypeVar(StatisticsType, bound= TrainingStatistics)

Returns:

A statistics object with its data transformed by fn.
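The relationship between the two methods can be sketched as follows. The free functions apply and apply_elementwise below are hypothetical stand-ins that operate directly on a list of weights (plain lists instead of tensors); they only mirror the documented calling conventions and are not pfl's code.

```python
from typing import Callable, Iterable, List

Vector = List[float]


def apply(weights: List[Vector], fn: Callable[..., Iterable[Vector]],
          *args, **kwargs) -> List[Vector]:
    # fn receives the whole list of weights at once.
    return list(fn(weights, *args, **kwargs))


def apply_elementwise(weights: List[Vector], fn: Callable[..., Vector],
                      *args, **kwargs) -> List[Vector]:
    # fn receives one weight at a time; extra arguments are forwarded.
    return [fn(w, *args, **kwargs) for w in weights]


def add_constant(w: Vector, c: float) -> Vector:
    return [x + c for x in w]


weights = [[1.0, 2.0], [3.0]]
via_apply = apply(weights, lambda ws: [add_constant(w, 1.0) for w in ws])
via_elementwise = apply_elementwise(weights, add_constant, 1.0)
assert via_apply == via_elementwise
```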

class pfl.stats.WeightedStatistics(weight)

Statistics for training a model that can be weighted and summed. The weight will generally be the number of samples or the number of clients that the statistics are over. Using the method average produces a weighted average of the summed statistics.

The statistics can be re-weighted with the reweight method; the current weight is available through the weight property.

In mathematical terms, the statistics are in a vector space; and with the weights, they are in an expectation semiring.

Parameters:

weight (float) – The weight of the statistics object.

property weight: float

Get the weight of this object.

abstract reweight(new_weight)

Reweight the statistics by dividing the values by the current weight and multiplying them by new_weight. This means that the ratio statistics/weight remains the same but the weight changes to new_weight.
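As a worked illustration of this arithmetic, here is a minimal sketch on a flat list of values. The free function reweight is a hypothetical stand-in for the method and returns new values instead of modifying an object.

```python
from typing import List, Tuple


def reweight(values: List[float], weight: float,
             new_weight: float) -> Tuple[List[float], float]:
    """Divide by the current weight, multiply by new_weight."""
    return [v / weight * new_weight for v in values], new_weight


values, weight = [2.0, 6.0], 2.0
new_values, new_weight = reweight(values, weight, 4.0)

# The ratio statistics/weight is the same before and after.
ratio_before = [v / weight for v in values]
ratio_after = [v / new_weight for v in new_values]
assert ratio_before == ratio_after
```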

average()

Divide (in-place) each individual statistic by its weight. The new weight will be 1.

Return type:

None
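The following sketch shows how summing and then averaging yields a weighted average. It assumes each contribution is stored as a weighted sum alongside its weight, uses plain lists in place of tensors, and does not call pfl.

```python
# Two contributions, each stored as a sum together with its weight
# (assumed here to be a sample count).
sum_a, weight_a = [1.0, 2.0], 1.0   # one sample
sum_b, weight_b = [9.0, 6.0], 3.0   # three samples

# Adding statistics adds both the values and the weights.
total = [a + b for a, b in zip(sum_a, sum_b)]
total_weight = weight_a + weight_b

# Averaging divides each value by the weight; the new weight is 1.
averaged = [x / total_weight for x in total]
averaged_weight = 1.0
```

Here total is [10.0, 8.0] with weight 4.0, so the averaged values are [2.5, 2.0].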

class pfl.stats.MappedVectorStatistics(name_to_stats=None, weight=1.0)

Statistics consisting of a number of tensors keyed by strings. Commonly used to represent neural network model updates. When two statistics objects are added, the tensors for each key are added together, and so are the weights.

Parameters:
  • name_to_stats (Optional[Dict[str, TypeVar(Tensor, bound= TensorLike)]]) – A dictionary, mapping identifiers of individual statistics to the tensors.

  • weight (float) – The weight of the statistics. Does not have any effect on the raw data directly.
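The addition semantics described above can be sketched with plain dictionaries. This is an illustration under stated assumptions rather than pfl's code: add_mapped is a hypothetical helper, lists stand in for tensors, the key names are made up, and the sketch requires both inputs to have the same keys.

```python
from typing import Dict, List, Tuple

Mapped = Dict[str, List[float]]


def add_mapped(a: Mapped, weight_a: float,
               b: Mapped, weight_b: float) -> Tuple[Mapped, float]:
    # Assumption of this sketch: both statistics share the same keys.
    if a.keys() != b.keys():
        raise ValueError("both statistics must have the same keys")
    summed = {name: [x + y for x, y in zip(a[name], b[name])]
              for name in a}
    return summed, weight_a + weight_b


update_a = {"dense/kernel": [1.0, 2.0], "dense/bias": [0.5]}
update_b = {"dense/kernel": [3.0, 4.0], "dense/bias": [1.5]}
summed, total_weight = add_mapped(update_a, 1.0, update_b, 2.0)
```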

property num_parameters: int

Get the total number of parameters for all statistics in this object.

reweight(new_weight)

Reweight the statistics by dividing the values by the current weight and multiplying them by new_weight. This means that the ratio statistics/weight remains the same but the weight changes to new_weight.

get_weights()

Get a vector representation of this statistic, with dtype=float32, usually for the purpose of sending it over the wire. The returned representation may be split into smaller vectors/matrices so that it can be manipulated more easily.

Summing vectors returned by get_weights from two different statistics objects and thereafter converting back using the method from_weights must be equivalent to summing the two original objects.

Return type:

Tuple[TypeVar(Tensor, bound= TensorLike), List[TypeVar(Tensor, bound= TensorLike)]]

Returns:

A tuple (metadata, weights). metadata is a vector of data which is not privacy-sensitive, e.g. the weight. weights is a list of matrices with data that is considered privacy-sensitive and is additive with noise. The matrices may be weights for different layers in the context of neural networks.

from_weights(metadata, weights)

Create a new statistics object of this class from a vector representation. The input to this method has the same format as the output of get_weights.

Note that this is a method on an object of this class, since it is possible that runtime attributes that do not change with addition are not serialized.

Parameters:
  • metadata (TypeVar(Tensor, bound= TensorLike)) – A vector of data which is not privacy-sensitive. The contents depend on the implementation. Can include e.g. the weight.

  • weights (List[TypeVar(Tensor, bound= TensorLike)]) – A list of matrices with data that is considered privacy-sensitive and is additive with noise. The matrices may be weights for different layers in the context of neural networks.

Return type:

TypeVar(MappedVectorStatisticsType, bound= MappedVectorStatistics)

class pfl.stats.ElementWeightedMappedVectorStatistics(name_to_stats=None, weights=None)

Statistics consisting of a number of tensors keyed by strings with weights as a number of tensors keyed by the same set of strings. Each element in the statistics has a weight tensor with the same shape and the same key as the element. Commonly used to represent neural network model updates.

Parameters:
  • name_to_stats (Optional[Dict[str, TypeVar(Tensor, bound= TensorLike)]]) – A dictionary, mapping identifiers of individual statistics to the tensors.

  • weights (Optional[Dict[str, TypeVar(Tensor, bound= TensorLike)]]) – The dictionary with weights of the statistics. Does not have any effect on the raw data directly. Adding two Statistics will add their weights as well.
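To make the per-element weighting concrete, here is a minimal sketch with plain dictionaries of lists. The data and key name are hypothetical, and the code assumes every summed weight is nonzero; how pfl itself treats zero weights is not covered here.

```python
from typing import Dict, List

Mapped = Dict[str, List[float]]


def add_elementwise(a: Mapped, b: Mapped) -> Mapped:
    """Add two dictionaries of vectors key by key, element by element."""
    return {name: [x + y for x, y in zip(a[name], b[name])] for name in a}


# Contribution b carries no weight for the second element of "kernel".
values_a, weights_a = {"kernel": [1.0, 4.0]}, {"kernel": [1.0, 1.0]}
values_b, weights_b = {"kernel": [3.0, 0.0]}, {"kernel": [1.0, 0.0]}

# Adding statistics adds the values and the weight tensors.
values_sum = add_elementwise(values_a, values_b)
weights_sum = add_elementwise(weights_a, weights_b)

# Averaging divides each element by its own weight.
averaged = {name: [v / w for v, w in zip(values_sum[name], weights_sum[name])]
            for name in values_sum}
```

The first element is averaged over two contributions and the second over one, giving {"kernel": [2.0, 4.0]}.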

property weight: float

Get the weight of this object.

reweight(new_weight)

Reweight the statistics by dividing the values by the current weight and multiplying them by new_weight. This means that the ratio statistics/weight remains the same but the weight changes to new_weight.

get_weights()

Get a vector representation of this statistic, with dtype=float32, usually for the purpose of sending it over the wire. The returned representation may be split into smaller vectors/matrices so that it can be manipulated more easily.

Summing vectors returned by get_weights from two different statistics objects and thereafter converting back using the method from_weights must be equivalent to summing the two original objects.

Return type:

Tuple[TypeVar(Tensor, bound= TensorLike), List[TypeVar(Tensor, bound= TensorLike)]]

Returns:

A tuple (metadata, weights). metadata is a vector of data which is not privacy-sensitive, e.g. the weight. weights is a list of matrices with data that is considered privacy-sensitive and is additive with noise. The matrices may be weights for different layers in the context of neural networks.

from_weights(metadata, statistics_list)

Create a new statistics object of this class from a vector representation. The input to this method has the same format as the output of get_weights.

Note that this is a method on an object of this class, since it is possible that runtime attributes that do not change with addition are not serialized.

Parameters:
  • metadata (TypeVar(Tensor, bound= TensorLike)) – A vector of data which is not privacy-sensitive. The contents depend on the implementation. Can include e.g. the weight.

  • statistics_list – A list of matrices with data that is considered privacy-sensitive and is additive with noise. The matrices may be weights for different layers in the context of neural networks.

Return type:

ElementWeightedMappedVectorStatistics

average()

Divide (in-place) each individual statistic by its weight. The new weight will be 1.

Return type:

None