coreai_opt.quantization.spec.PerBlockGranularity¶
- class coreai_opt.quantization.spec.PerBlockGranularity[source]¶
Bases:
QuantizationGranularityPer-block quantization granularity.
This applies quantization to blocks of values within the tensor. Supports two modes:
Single-axis mode: Quantize blocks along one specific axis (no blocking for axis>=2)
axis: The axis to create blocks (0 or 1)block_size: Integer specifying block size for that axis
Multi-axis mode: Create blocks across multiple axes simultaneously
axis: Must be Noneblock_size: Tuple specifying block size for each axis (-1 means no blocking)
In single-axis mode, when
axisisNoneandblock_sizeis an integer,Quantizer.prepare()automatically resolves the axis based on the module type for weight quantization.Weight tensor shape (input)
axis
block_size
Weight shape of each block (output)
[C_out, C_in]
1
32
[1, 32]
[C_out, C_in]
None
(4, 8)
[4, 8]
[C_out, C_in, KH, KW]
0
16
[16, 1, KH, KW]
[C_out, C_in, KH, KW]
None
(4, 16, 3, -1)
[4, 16, 3, KW]