Data Introspection

Data introspectors observe intermediate model responses and process data in batches when .introspect() is called.

Dataset Report

The DatasetReport bundles the Familiarity, Duplicates, and Dimension Reduction introspectors (described below) into a single interactive interface with several visualization options.

Familiarity

Familiarity quantifies how familiar a data point is to a specific dataset or subset. It fits a probability distribution to the activations of the specified layer(s), then evaluates the probability of each data sample under that distribution.
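The fit-then-score idea can be illustrated with a minimal standard-library sketch. It is not the introspector's implementation: the choice of an independent (diagonal) Gaussian, the function names `fit_diagonal_gaussian` and `log_likelihood`, and the toy activation values are all assumptions made for illustration.

```python
import math
from statistics import fmean, pvariance


def fit_diagonal_gaussian(activations):
    """Fit an independent Gaussian to each activation dimension (an assumed model)."""
    columns = list(zip(*activations))
    means = [fmean(col) for col in columns]
    # A small floor keeps the variance positive for constant dimensions.
    variances = [max(pvariance(col), 1e-6) for col in columns]
    return means, variances


def log_likelihood(sample, means, variances):
    """Log-density of one activation vector under the fitted Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
        for x, mu, var in zip(sample, means, variances)
    )


# Hypothetical layer activations for a reference dataset (one row per sample).
reference = [[0.9, 1.1], [1.0, 0.9], [1.1, 1.0], [1.0, 1.0]]
means, variances = fit_diagonal_gaussian(reference)

familiar_score = log_likelihood([1.0, 1.0], means, variances)
unfamiliar_score = log_likelihood([5.0, -3.0], means, variances)
```

In this sketch a sample close to the reference activations receives a higher log-likelihood than one far from them, so the score can be read as a familiarity measure relative to the reference set.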

Duplicates

Finds near-duplicate data. Uses approximate nearest neighbor search to build a distance matrix for all samples, then clusters the closest samples.
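The distance-then-cluster step can be sketched as follows. Note the substitution: this example uses exact brute-force pairwise distances rather than an approximate nearest neighbor index, and it groups samples with a simple union-find over a distance threshold. The function name `find_duplicate_clusters`, the `threshold` parameter, and the sample vectors are hypothetical.

```python
import math
from itertools import combinations


def find_duplicate_clusters(samples, threshold):
    """Group sample indices whose Euclidean distance falls below `threshold`."""
    parent = list(range(len(samples)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Exact pairwise comparison (O(n^2)); an ANN index would replace this loop at scale.
    for i, j in combinations(range(len(samples)), 2):
        if math.dist(samples[i], samples[j]) < threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(samples)):
        clusters.setdefault(find(i), []).append(i)
    # Keep only groups with at least two members, i.e. actual near-duplicates.
    return sorted(c for c in clusters.values() if len(c) > 1)


# Hypothetical activation vectors: samples 0/1 and 2/3 are near-duplicates.
embeddings = [[0.0, 0.0], [0.01, 0.0], [5.0, 5.0], [5.0, 5.02], [9.0, 1.0]]
duplicate_clusters = find_duplicate_clusters(embeddings, threshold=0.1)
```

Brute-force comparison is quadratic in the number of samples, which is why an approximate nearest neighbor search is the practical choice for large datasets.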

Dimension Reduction

Projects high-dimensional activation data into a lower-dimensional space, usually for consumption by another introspector or for 2D or 3D visualization.
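As one illustration of such a projection, the sketch below implements principal component analysis (PCA) with power iteration using only the standard library. PCA is an assumed example technique here, not necessarily the algorithm the introspector uses, and the function name `pca_project`, its parameters, and the data are hypothetical.

```python
import math
import random


def pca_project(data, n_components=2, n_iter=200, seed=0):
    """Project rows of `data` onto their top principal components."""
    n, d = len(data), len(data[0])
    means = [sum(row[k] for row in data) / n for k in range(d)]
    centered = [[row[k] - means[k] for k in range(d)] for row in data]
    # Covariance matrix of the centered data (d x d).
    cov = [[sum(r[a] * r[b] for r in centered) / n for b in range(d)]
           for a in range(d)]

    rng = random.Random(seed)
    components = []
    for _ in range(n_components):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(n_iter):
            # Power iteration step: multiply by the covariance matrix.
            v = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            # Remove directions already found so the next component is orthogonal.
            for c in components:
                dot = sum(x * y for x, y in zip(v, c))
                v = [x - dot * y for x, y in zip(v, c)]
            norm = math.sqrt(sum(x * x for x in v)) or 1.0
            v = [x / norm for x in v]
        components.append(v)

    # Coordinates of each centered sample along each component.
    return [[sum(x * y for x, y in zip(row, c)) for c in components]
            for row in centered]


# Hypothetical 3-D activations that vary mostly along a single direction.
points = [
    [1.0, 2.0, 0.1],
    [2.0, 4.1, 0.0],
    [3.0, 5.9, 0.2],
    [4.0, 8.0, 0.1],
    [5.0, 10.1, 0.0],
]
projected = pca_project(points, n_components=2)
```

Each row of `projected` is a 2D coordinate suitable for a scatter plot, with the first coordinate capturing the direction of greatest variance in the original data.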