coreai_opt.quantization.spec.PerBlockGranularity¶

class coreai_opt.quantization.spec.PerBlockGranularity[source]¶

Bases: QuantizationGranularity

Per-block quantization granularity.

This applies quantization to blocks of values within the tensor. Supports two modes:

Single-axis mode: Quantize blocks along one specific axis (no blocking for axis>=2)
- axis: The axis to create blocks (0 or 1)
- block_size: Integer specifying block size for that axis
Multi-axis mode: Create blocks across multiple axes simultaneously
- axis: Must be None
- block_size: Tuple specifying block size for each axis (-1 means no blocking)

In single-axis mode, when axis is None and block_size is an integer, Quantizer.prepare() automatically resolves the axis based on the module type for weight quantization.

Weight tensor shape (input)	axis	block_size	Weight shape of each block (output)
[C_out, C_in]	1	32	[1, 32]
[C_out, C_in]	None	(4, 8)	[4, 8]
[C_out, C_in, KH, KW]	0	16	[16, 1, KH, KW]
[C_out, C_in, KH, KW]	None	(4, 16, 3, -1)	[4, 16, 3, KW]