Utilities

coremltools.optimize.coreml.get_weights_metadata(*args, **kwargs)[source]

Utility function to get the weights metadata as a dictionary that maps each weight’s name to its corresponding CoreMLWeightMetaData.

CoreMLWeightMetaData contains the following attributes:

  1. val: The weight data.

  2. sparsity: the fraction of elements whose absolute value is <= 1e-12.

  3. unique_values: the number of unique values in the weight.

  4. child_ops: metadata for the child ops that the weight feeds into.

Parameters:
mlmodel: MLModel

Model from which the weight metadata is retrieved.

weight_threshold: int
  • The size threshold above which weights are returned. That is, a weight tensor is included in the resulting dictionary only if its total number of elements is greater than weight_threshold. For example, if weight_threshold = 1024 and a weight tensor has shape [10, 20, 1, 1], hence 200 elements, it will not be returned by the get_weights_metadata API.

  • If not provided, weight_threshold defaults to 2048, so only weights with more than 2048 elements are returned.

Returns:
dict[str, CoreMLWeightMetaData]

A dict that maps each weight’s name to its metadata.

Examples

In this example, there are two weights whose sizes are greater than 2048. A weight named conv_1_weight feeds into a conv op named conv_1, while another weight named linear_1_weight feeds into a linear op named linear_1. You can access the metadata via weight_metadata_dict["conv_1_weight"], and so on.

import coremltools as ct

mlmodel = ct.models.MLModel("my_model.mlpackage")
weight_metadata_dict = ct.optimize.coreml.get_weights_metadata(
    mlmodel, weight_threshold=2048
)

# get the weight names with size >= 25600
large_weights = []
for k, v in weight_metadata_dict.items():
    if v.val.size >= 25600:
        large_weights.append(k)

# get the weight names with sparsity >= 50%
sparse_weights = []
for k, v in weight_metadata_dict.items():
    if v.sparsity >= 0.5:
        sparse_weights.append(k)

# get the weight names with unique elements <= 16
palettized_weights = []
for k, v in weight_metadata_dict.items():
    if v.unique_values <= 16:
        palettized_weights.append(k)

# print out the dictionary
print(weight_metadata_dict)

The output from the above example would be:

conv_1_weight
[
    val: np.ndarray(shape=(32, 64, 2, 2), dtype=float32)
    sparsity: 0.5
    unique_values: 4097
    child_ops: [
        conv(name=conv_1, weight=conv_1_weight, ...)
    ]
]
linear_1_weight
[
    val: np.ndarray(shape=(128, 64), dtype=float32)
    sparsity: 0.2501220703125
    unique_values: 4
    child_ops: [
        linear(name=linear_1, weight=linear_1_weight, ...)
    ]
]
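
The weight names gathered this way can feed directly into the op_name_configs of an OptimizationConfig (documented below). The following is a minimal sketch, continuing from the example above, that prunes only the weights collected in sparse_weights:

from coremltools.optimize.coreml import (
    OpThresholdPrunerConfig,
    OptimizationConfig,
    prune_weights,
)

# Prune only the weights found above to be at least 50% sparse;
# weights not named in op_name_configs are left untouched.
config = OptimizationConfig(
    op_name_configs={
        name: OpThresholdPrunerConfig(threshold=1e-12) for name in sparse_weights
    }
)
pruned_mlmodel = prune_weights(mlmodel, config=config)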
coremltools.optimize.coreml.decompress_weights(*args, **kwargs)[source]

Utility function to convert sparse, palettized, or affine-quantized weights back to the float format. That is, it converts any of the following three ops to mb.const:

  1. constexpr_affine_dequantize

  2. constexpr_lut_to_dense

  3. constexpr_sparse_to_dense

Parameters:
mlmodel: MLModel

The model to be decompressed.

Returns:
model: MLModel

The MLModel with no constexpr ops included.

Examples

import coremltools as ct

model = ct.models.MLModel("my_compressed_model.mlpackage")
decompressed_model = ct.optimize.coreml.decompress_weights(model)
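
The returned model behaves like any other MLModel; for example, it can be saved back to disk (the output path below is illustrative):

decompressed_model.save("my_decompressed_model.mlpackage")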
class coremltools.optimize.coreml.CoreMLWeightMetaData(val: ndarray, sparsity: float | None = NOTHING, unique_values: int | None = NOTHING, child_ops: List[CoreMLOpMetaData] | None = None)[source]

A container class that stores weight meta data.

The class has the following attributes:

Parameters:
val: numpy.ndarray

The weight data.

sparsity: float

The fraction of elements whose absolute value is <= 1e-12.

unique_values: int

Number of unique values in the weight.

child_ops: list[CoreMLOpMetaData]

A list of CoreMLOpMetaData containing information about the child ops that the weight feeds into.

The attributes can be accessed as follows: child_ops[idx].op_type gives the operation type of the idx-th child op, and child_ops[idx].name gives its name.

Other op-dependent attributes can also be accessed. For instance, if the idx-th child op is a conv layer, child_ops[idx].weight returns its weight name.

For more details, please refer to the CoreMLOpMetaData docstring.
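
For instance, with the weight_metadata_dict from the get_weights_metadata example above, the child ops of a weight can be inspected as follows (a minimal sketch):

metadata = weight_metadata_dict["conv_1_weight"]
for child_op in metadata.child_ops:
    print(child_op.op_type, child_op.name)
    if child_op.op_type == "conv":
        # Op-dependent attribute: the name of the conv op's weight constant.
        print(child_op.weight)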

Examples

import numpy as np
from coremltools.optimize.coreml import CoreMLWeightMetaData

data = np.array([[1.0, 0.0], [0.0, 6.0]], dtype=np.float32)
meta_data = CoreMLWeightMetaData(data)
print(meta_data)

Outputs:

[
    val: np.ndarray(shape=(2, 2), dtype=float32)
    sparsity: 0.5
    unique_values: 3
]
class coremltools.optimize.coreml.CoreMLOpMetaData(op_type: str, name: str, params_name_mapping: Dict[str, str])[source]

A container class that stores op meta data.

The class has the following attributes:

Parameters:
op_type: str

The type of the op. For instance: conv, linear, and so on.

name: str

The name of the op.

params_name_mapping: dict[str, str]

A dict that maps the op’s constant parameters to their corresponding weight names. For instance, given a conv op with the params_name_mapping,

{
    "weight": "conv_1_weight",
    "bias": "conv_1_bias",
}

means that the weight and bias of this op are named conv_1_weight and conv_1_bias, respectively.

class coremltools.optimize.coreml.OptimizationConfig(global_config: OpCompressorConfig | None = None, op_type_configs: OpCompressorConfig | None = None, op_name_configs: OpCompressorConfig | None = None, is_deprecated: bool = False, op_selector: Callable | None = None)[source]

A configuration wrapper that enables fine-grained control when compressing a model, providing the following levels of configuration: global, op type, and op name.

  1. global_config: The default configuration applied to all ops / consts.

  2. op_type_configs: Configurations applied to specific op types. These override global_config.

  3. op_name_configs: Configurations applied to specific constants or op instances. These override global_config and op_type_configs.

The following is an example that constructs an optimization config for weight palettization.

from coremltools.optimize.coreml import OpPalettizerConfig, OptimizationConfig

# The default global configuration is 8-bit palettization with kmeans
global_config = OpPalettizerConfig(mode="kmeans", nbits=8)

# Use 2-bit palettization for convolution layers, and skip compression for linear layers
op_type_configs = {
    "conv": OpPalettizerConfig(mode="kmeans", nbits=2),
    "linear": None,
}

# Use 4-bit palettization with a different mode for a convolution layer named "conv_1"
op_name_configs = {
    "conv_1": OpPalettizerConfig(mode="uniform", nbits=4),
}

# Combine the configurations across all three levels into an OptimizationConfig object
config = OptimizationConfig(
    global_config=global_config,
    op_type_configs=op_type_configs,
    op_name_configs=op_name_configs,
)
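
The resulting config can then be passed to a compression API such as palettize_weights. A minimal sketch, assuming mlmodel is an MLModel that has already been loaded:

import coremltools as ct

compressed_mlmodel = ct.optimize.coreml.palettize_weights(mlmodel, config)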
Parameters:
global_config: OpCompressorConfig

Config to be applied globally to all supported ops.

op_type_configs: dict[str, OpCompressorConfig]

Op type level configs applied to a specific op class.

  • The keys of the dictionary are the string of the op type, and the values are the corresponding OpCompressorConfig.

  • An op type will not be compressed if the value is set to None.

op_name_configs: dict[str, OpCompressorConfig]

Op instance level configs applied to a specific constant or op.

  • The keys of the dictionary are the name of a constant or an op instance, and the values are the corresponding OpCompressorConfig.

  • An op instance will not be compressed if the value is set to None.

  • You can use coremltools.optimize.coreml.get_weights_metadata to get the name of the constants / op instances in the model.

classmethod from_dict(config_dict: Dict[str, Any]) OptimizationConfig[source]

Construct an OptimizationConfig instance from a nested dictionary. The dictionary should contain only the following four str keys (all optional):

  • "config_type": Specify the configuration class type.

  • "global_config": Parameters for global_config.

  • "op_type_configs": A nested dictionary for op_type_configs.

  • "op_name_config": A nested dictionary for op_name_configs.

The following is a nested dictionary that creates an optimization config for weight palettization:

config_dict = {
    "config_type": "OpPalettizerConfig",
    "global_config": {
        "mode": "kmeans",
        "nbits": 4,
    },
    "op_type_configs": {
        "conv": {
            "mode": "uniform",
            "nbits": 1,
        }
    },
    "op_name_configs": {
        "conv_1": {
            "mode": "unique",
        }
    },
}

Note that you can override config_type at a lower level. For instance, to apply threshold-based pruning to the model while applying magnitude pruning to the convolution layers, use a nested dictionary like the following:

config_dict = {
    "config_type": "OpThresholdPrunerConfig",
    "global_config": {
        "threshold": 0.01,
    },
    "op_type_configs": {
        "conv": {
            "config_type": "OpMagnitudePrunerConfig",
            "n_m_ratio": [3, 4],
        }
    },
}
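
Either dictionary can then be converted into a config object:

from coremltools.optimize.coreml import OptimizationConfig

config = OptimizationConfig.from_dict(config_dict)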
Parameters:
config_dict: dict[str, Any]

A dictionary that represents the configuration structure.

classmethod from_yaml(yml: IO | str) OptimizationConfig[source]

Construct an OptimizationConfig instance from a YAML file. The YAML file should contain only the following four str keys (all optional):

  • "config_type": Specify the configuration class type.

  • "global_config": Parameters for global_config.

  • "op_type_configs": A nested dictionary for op_type_configs.

  • "op_name_config": A nested dictionary for op_name_configs.

The following is a YAML file that creates an optimization config for weight palettization:

config_type: OpPalettizerConfig
global_config:
    mode: kmeans
    nbits: 4
op_type_configs:
    conv:
        mode: uniform
        nbits: 1
op_name_configs:
    conv_1:
        mode: unique

Note that you can override config_type at a lower level. For instance, to apply threshold-based pruning to the model while applying magnitude pruning to the convolution layers, use a YAML file like the following:

config_type: OpThresholdPrunerConfig
global_config:
    threshold: 0.01
op_type_configs:
    conv:
        config_type: OpMagnitudePrunerConfig
        n_m_ratio: [3, 4]
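
Assuming the YAML above is saved to a file such as pruning_config.yaml (an illustrative path), the config can be loaded with:

from coremltools.optimize.coreml import OptimizationConfig

config = OptimizationConfig.from_yaml("pruning_config.yaml")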
Parameters:
yml: str, IO

A YAML file object or the path to the file.

set_global(op_config: OpCompressorConfig)[source]

Sets the global config that will be applied to all constant ops.

from coremltools.optimize.coreml import OpPalettizerConfig, OptimizationConfig

config = OptimizationConfig()
global_config = OpPalettizerConfig(mode="kmeans", nbits=8)
config.set_global(global_config)
Parameters:
op_config: OpCompressorConfig

Config to be applied globally to all supported ops.

set_op_name(op_name: str, op_config: OpCompressorConfig)[source]

Sets the compression config at the level of constant / op instance by name.

from coremltools.optimize.coreml import OpPalettizerConfig, OptimizationConfig

config = OptimizationConfig()
op_config = OpPalettizerConfig(mode="kmeans", nbits=2)
config.set_op_name("conv_1", op_config)

Note that to get the name of a constant or an op instance, refer to the coremltools.optimize.coreml.get_weights_metadata API.

Parameters:
op_name: str

The name of a constant or an op instance.

op_config: OpCompressorConfig

Op instance level config applied to a specific constant or op with name op_name.

set_op_type(op_type: str, op_config: OpCompressorConfig)[source]

Sets the compression config at the level of op type.

from coremltools.optimize.coreml import OpPalettizerConfig, OptimizationConfig

config = OptimizationConfig()
conv_config = OpPalettizerConfig(mode="kmeans", nbits=2)
config.set_op_type("conv", conv_config)
Parameters:
op_type: str

The type of an op. For instance, "conv", "linear".

op_config: OpCompressorConfig

Op type level config applied to a specific op class op_type.