coreai_opt.palettization.ModuleKMeansPalettizerConfig¶
- class coreai_opt.palettization.ModuleKMeansPalettizerConfig[source]¶
Bases:
WeightOnlyModuleValidationMixin,ModuleCompressionConfig[OpKMeansPalettizerConfig, PalettizationSpec]Configuration for palettizing a specific module using K-means clustering.
This class manages palettization settings for an entire module, including:
Operation-level configurations (default, by type, by name)
Module-level state (parameter) palettization
The operation configurations follow a hierarchical precedence:
op_name_config (most specific - applies to operations matching a name pattern)
op_type_config (applies to operations of a specific type)
op_state_spec (least specific - applies to all operations not otherwise configured)
Module-level state settings treat the module as an opaque entity, setting palettization settings for specified tensors and ignoring op specific palettization capabilities. Module-level settings also don’t check whether the operation receiving the palettized tensor is a registered operation or not. Module-level settings will override any op specific settings.
- op_state_spec¶
Palettization specifications for operation state tensors (parameters, buffers, constants) applied to all registered operations/patterns within this module that don’t have a more specific configuration. Keys can be string names (e.g. “weight”, “bias”) or “*” to refer to all state inputs. Values are PalettizationSpec objects or None defining how to palettize each state tensor. None value represents disabling palettization. Default: 4-bit palettization for “weight” and “in_proj_weight” state tensors via
default_weight_palettization_spec().- Type:
dict[str, PalettizationSpec | None] | None
- op_type_config¶
Operation type-specific configurations. Keys are operation type names (e.g., “aten.linear.default”, “aten.conv2d.default”). Values are OpKMeansPalettizerConfig objects or None, defining how to palettize operations of that type. None value represents disabling palettization. Default: {} (empty dict, no type-specific configs)
- Type:
dict[str, OpKMeansPalettizerConfig | None] | None
- op_name_config¶
Operation name-specific configurations. Keys are operation name patterns (supports regex matching). Values are OpKMeansPalettizerConfig objects or None, defining how to palettize operations matching those names. None value represents disabling palettization. Default: {} (empty dict, no name-specific configs)
- Type:
dict[str, OpKMeansPalettizerConfig | None] | None
- module_state_spec¶
Palettization specifications for module state tensors (parameters, buffers, and constants). Module state settings will override op state settings for the same state tensors. Keys can be string names (e.g. “weight”, “bias”) or “*” to refer to all state inputs. Values are PalettizationSpec objects or None. None value represents disabling palettization. Default: {} (empty dict, no specific module state settings)
- Type:
dict[str, PalettizationSpec | None] | None
- enable_fast_kmeans_mode¶
When True, enables optimizations for faster K-means clustering by rounding the weights before clustering if data is in float16 range. If weight dtype is float32, weights are cast to float16 and then rounded. This is not supported with
cluster_dim > 1. Default: True.- Type:
bool
- rounding_precision¶
Number of decimal places to round to during fast K-means clustering. Higher values preserve more precision but may reduce speed benefits. Only used when enable_fast_kmeans_mode is True. Default: 4.
- Type:
int
Example
>>> config = ModuleKMeansPalettizerConfig() # Uses defaults >>> # Or with custom settings: >>> from coreai_opt.palettization.spec import PalettizationSpec >>> config = ModuleKMeansPalettizerConfig( ... op_state_spec={"weight": PalettizationSpec(n_bits=2)}, ... enable_fast_kmeans_mode=False, # Disable for maximum precision ... rounding_precision=6 # Higher precision when fast mode is enabled ... )