Neural Networks

A neural network is defined through a collection of layers and represents a directed acyclic graph (DAG). Each layer has a name, a layer type, a list of input names, a list of output names, and a collection of parameters specific to the layer type.

The graph structure and connectivity of the neural network is inferred from the input and output names. A neural network starts with the layer whose input name is equal to the value specified in Model.description.input.name, and ends with the layer whose output name is equal to the value specified in Model.description.output.name. Layers must have unique input and output names, and a layer may not have input or output names that refer to layers that are not yet defined.

CoreML supports sequential data that can be 1- or 3-dimensional. 3-dimensional data typically represents an image feature map, whose shape is denoted by [C, H, W], which corresponds to the channel, height, and width, respectively. 1-dimensional data is a set of features whose shape is denoted by [C], and is equivalent to 3-dimensional data with the shape [C, 1, 1].

For the purposes of this specification, batch dimension is ignored. Thus, a sequence of 3-dimensional data is to be understood as a 4-dimensional array, whose shape is denoted by [Seq_length, C, H, W], and a sequence of 1-dimensional data is to be understood as a 2-dimensional array, whose shape is denoted by [Seq_length, C], which is equivalent to a 4-dimensional array with the shape [Seq_length, C, 1, 1]. This axes order is important to remember while setting parameters for layers such as “reshape” and “permute”.

At runtime, all data blobs are internally represented as 5-dimensional blobs with the shape [Seq_length, Batch, C, H, W].

A layer may process input data differently if operating over a sequence; details of this behavior are documented in the layer's message. Otherwise, sequential data is processed like a batch: the inputs in the sequence are processed independently and in parallel.

The network input shape specified by Model.description.input.type must be compatible with the expected input shape of the network's first layer.

All data blobs, as well as weight parameters, are stored using row-major ordering, i.e. the last dimension is the fastest moving one.

NeuralNetwork

A neural network.

message NeuralNetwork {
    repeated NeuralNetworkLayer layers = 1;
    repeated NeuralNetworkPreprocessing preprocessing = 2;
}

NeuralNetworkImageScaler

A neural network preprocessor that performs a scalar multiplication of an image followed by addition of scalar biases to the channels.

Input: X
An image in BGR or RGB format with shape [3, H, W] or in grayscale format with shape [1, H, W].
Output: Y
An image with format and shape corresponding to the input.

If the input image is in BGR format:

Y[0, :, :] = channelScale * X[0, :, :] + blueBias
Y[1, :, :] = channelScale * X[1, :, :] + greenBias
Y[2, :, :] = channelScale * X[2, :, :] + redBias

If the input image is in RGB format:

Y[0, :, :] = channelScale * X[0, :, :] + redBias
Y[1, :, :] = channelScale * X[1, :, :] + greenBias
Y[2, :, :] = channelScale * X[2, :, :] + blueBias

If the input image is in grayscale format:

Y[0, :, :] = channelScale * X[0, :, :] + grayBias
message NeuralNetworkImageScaler {
    float channelScale = 10;
    float blueBias = 20;
    float greenBias = 21;
    float redBias = 22;
    float grayBias = 30;
}
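
For reference, the computation above can be written as a short NumPy sketch; this is illustrative only, assuming an RGB input with shape [3, H, W] and argument names that mirror the message fields.

import numpy as np

def apply_image_scaler(x, channelScale, redBias, greenBias, blueBias):
    # x: an RGB image with shape [3, H, W]
    y = channelScale * x.astype(np.float32)
    y[0, :, :] += redBias    # channel 0 is red for RGB inputs
    y[1, :, :] += greenBias
    y[2, :, :] += blueBias
    return y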

NeuralNetworkMeanImage

A neural network preprocessor that subtracts the provided mean image from the input image. The mean image is subtracted from the input named NeuralNetworkPreprocessing.featureName.

message NeuralNetworkMeanImage {
    repeated float meanImage = 1;
}

NeuralNetworkPreprocessing

Preprocessing parameters for image inputs.

message NeuralNetworkPreprocessing {
    string featureName = 1;
    oneof preprocessor {
      NeuralNetworkImageScaler scaler = 10;
      NeuralNetworkMeanImage meanImage = 11;
    }
}

ActivationReLU

A rectified linear unit (ReLU) activation function.

This function has the following formula:

f(x) = \text{max}(0, x)

message ActivationReLU {
}

ActivationLeakyReLU

A leaky rectified linear unit (ReLU) activation function.

This function has the following formula:

f(x) = \begin{cases}
        x      & \text{if } x \geq 0 \\
        \alpha x & \text{if } x < 0
       \end{cases}

message ActivationLeakyReLU {
    float alpha = 1; //negative slope value for leakyReLU
}

ActivationTanh

A hyperbolic tangent activation function.

This function has the following formula:

f(x) = \dfrac{1 - e^{-2x}}{1 + e^{-2x}}

message ActivationTanh {
}

ActivationScaledTanh

A scaled hyperbolic tangent activation function.

This function has the following formula:

f(x) = \alpha \tanh(\beta x)

message ActivationScaledTanh {
    float alpha = 1;
    float beta = 2;
}

ActivationSigmoid

A sigmoid activation function.

This function has the following formula:

f(x) = \dfrac{1}{1 + e^{-x}}

message ActivationSigmoid {
}

ActivationLinear

A linear activation function.

This function has the following formula:

f(x) = \alpha x + \beta

message ActivationLinear {
    float alpha = 1;
    float beta = 2;
}

ActivationSigmoidHard

A hard sigmoid activation function.

This function has the following formula:

f(x) = \text{min}(\text{max}(\alpha x + \beta, 0), 1)

message ActivationSigmoidHard {
    float alpha = 1;
    float beta = 2;
}

ActivationPReLU

A parameterized rectified linear unit (PReLU) activation function, which takes [C] or [C,H,W] as an input and applies different parameters in each channel dimension (shared across the H and W components).

This function has the following formula:

f(x_i) = \begin{cases}
             x_i          & \text{if } x_i \geq 0 \\
             \alpha_i x_i & \text{if } x_i < 0
         \end{cases} \;,\;i=1,...,C

message ActivationPReLU {
    // parameter of length C or 1.
    // If length is 1, same value is used for all channels
    WeightParams alpha = 1;
}

ActivationELU

An exponential linear unit (ELU) activation function.

This function has the following formula:

f(x) = \begin{cases}
        x              & \text{if } x \geq 0 \\
        \alpha (e^x - 1) & \text{if } x < 0
       \end{cases}

message ActivationELU {
    float alpha = 1;
}

ActivationThresholdedReLU

A thresholded rectified linear unit (ReLU) activation function.

This function has the following formula:

f(x) = \begin{cases}
        x & \text{if } x \geq \alpha \\
        0 & \text{if } x < \alpha
       \end{cases}

message ActivationThresholdedReLU {
    float alpha = 1;
}

ActivationSoftsign

A softsign activation function.

This function has the following formula:

f(x) = \dfrac{x}{1 + |x|}

message ActivationSoftsign {
}

ActivationSoftplus

A softplus activation function.

This function has the following formula:

f(x) = \text{log}(1 + e^x)

message ActivationSoftplus {
}

ActivationParametricSoftplus

A parametric softplus activation function, which takes [C] or [C,H,W] as an input and applies different parameters in each channel dimension (shared across the H and W components).

This function has the following formula:

f(x_i) = \alpha_i \text{log}(1 + e^{\beta_i x_i}) \;,\;i=1,...,C

message ActivationParametricSoftplus {
    // If length is 1, same value is used for all channels
    WeightParams alpha = 1; //parameter of length C or 1
    WeightParams beta = 2;  //parameter of length C or 1
}

ActivationParams

message ActivationParams {
    oneof NonlinearityType {
        ActivationLinear linear = 5;

        ActivationReLU ReLU = 10;
        ActivationLeakyReLU leakyReLU = 15;
        ActivationThresholdedReLU thresholdedReLU = 20;
        ActivationPReLU PReLU = 25;

        ActivationTanh tanh = 30;
        ActivationScaledTanh scaledTanh = 31;

        ActivationSigmoid sigmoid = 40;
        ActivationSigmoidHard sigmoidHard = 41;

        ActivationELU ELU = 50;

        ActivationSoftsign softsign = 60;
        ActivationSoftplus softplus = 70;
        ActivationParametricSoftplus parametricSoftplus = 71;
    }
}

NeuralNetworkLayer

A single neural network layer.

message NeuralNetworkLayer {
    string name = 1;  //descriptive name of the layer
    repeated string input = 2;
    repeated string output = 3;

    oneof layer {
        // start at 100 here
        ConvolutionLayerParams convolution = 100;

        PoolingLayerParams pooling = 120;

        ActivationParams activation = 130;

        InnerProductLayerParams innerProduct = 140;
        EmbeddingLayerParams embedding = 150;

        //normalization related layers
        BatchnormLayerParams batchnorm = 160;
        MeanVarianceNormalizeLayerParams mvn = 165;
        L2NormalizeLayerParams l2normalize = 170;
        SoftmaxLayerParams softmax = 175;
        LRNLayerParams lrn = 180;

        CropLayerParams crop = 190;
        PaddingLayerParams padding = 200;
        UpsampleLayerParams upsample = 210;

        UnaryFunctionLayerParams unary = 220;

        //elementwise operations
        AddLayerParams add = 230;
        MultiplyLayerParams multiply = 231;

        AverageLayerParams average = 240;
        ScaleLayerParams scale = 245;

        BiasLayerParams bias = 250;
        MaxLayerParams max = 260;
        MinLayerParams min = 261;

        DotProductLayerParams dot = 270;
        ReduceLayerParams reduce = 280;
        LoadConstantLayerParams loadConstant = 290;

        //data reorganization
        ReshapeLayerParams reshape = 300;
        FlattenLayerParams flatten = 301;
        PermuteLayerParams permute = 310;
        ConcatLayerParams concat = 320;
        SplitLayerParams split = 330;
        SequenceRepeatLayerParams sequenceRepeat = 340;

        ReorganizeDataLayerParams reorganizeData = 345;
        SliceLayerParams slice = 350;

        //Recurrent Layers
        SimpleRecurrentLayerParams simpleRecurrent = 400;
        GRULayerParams gru = 410;
        UniDirectionalLSTMLayerParams uniDirectionalLSTM = 420;
        BiDirectionalLSTMLayerParams biDirectionalLSTM = 430;

        // Custom (user-implemented) Layer
        CustomLayerParams custom = 500;
    }
}

BorderAmounts

Specifies the amount of spatial border to be either padded or cropped.

For padding:

H_out = borderAmounts[0].startEdgeSize + H_in + borderAmounts[0].endEdgeSize
W_out = borderAmounts[1].startEdgeSize + W_in + borderAmounts[1].endEdgeSize

topPaddingAmount == Height startEdgeSize
bottomPaddingAmount == Height endEdgeSize
leftPaddingAmount == Width startEdgeSize
rightPaddingAmount == Width endEdgeSize

For cropping:

H_out = (-borderAmounts[0].startEdgeSize) + H_in + (-borderAmounts[0].endEdgeSize)
W_out = (-borderAmounts[1].startEdgeSize) + W_in + (-borderAmounts[1].endEdgeSize)

topCropAmount == Height startEdgeSize
bottomCropAmount == Height endEdgeSize
leftCropAmount == Width startEdgeSize
rightCropAmount == Width endEdgeSize
message BorderAmounts {
    message EdgeSizes {
        uint64 startEdgeSize = 1;

        uint64 endEdgeSize = 2;
    }

    repeated EdgeSizes borderAmounts = 10;
}

BorderAmounts.EdgeSizes

message EdgeSizes {
    uint64 startEdgeSize = 1;

    uint64 endEdgeSize = 2;
}

ValidPadding

Specifies the type of padding to be used with Convolution/Deconvolution and Pooling layers. After padding, the input spatial shape [H_in, W_in] is transformed into the output spatial shape [H_out, W_out].

topPaddingAmount == Height startEdgeSize == borderAmounts[0].startEdgeSize
bottomPaddingAmount == Height endEdgeSize == borderAmounts[0].endEdgeSize
leftPaddingAmount == Width startEdgeSize == borderAmounts[1].startEdgeSize
rightPaddingAmount == Width endEdgeSize == borderAmounts[1].endEdgeSize

With Convolution or Pooling:

H_out = int_division_round_down((H_in + topPaddingAmount + bottomPaddingAmount - KernelSize[0]),stride[0]) + 1

which is the same as:

H_out = int_division_round_up((H_in + topPaddingAmount + bottomPaddingAmount - KernelSize[0] + 1),stride[0])

With Deconvolution:

H_out = (H_in-1) * stride[0] + kernelSize[0] - (topPaddingAmount + bottomPaddingAmount)

The equivalent expressions hold true for W_out as well.

By default, the values of paddingAmounts are set to 0, which results in a “true” valid padding. If non-zero values are provided for paddingAmounts, “valid” convolution/pooling is performed within the spatially expanded input.

message ValidPadding {
    BorderAmounts paddingAmounts = 1;
}

SamePadding

Specifies the type of padding to be used with Convolution/Deconvolution and Pooling layers. After padding, the input spatial shape [H_in, W_in] is transformed into the output spatial shape [H_out, W_out]. With Convolution or Pooling:

H_out = int_division_round_up(H_in,stride[0])
W_out = int_division_round_up(W_in,stride[1])

This is achieved by using the following padding amounts:

totalPaddingHeight = max(0,(H_out-1) * stride[0] + KernelSize[0] - H_in)
totalPaddingWidth = max(0,(W_out-1) * stride[1] + KernelSize[1] - W_in)

There are two modes of asymmetry: BOTTOM_RIGHT_HEAVY, and TOP_LEFT_HEAVY.

If the mode is BOTTOM_RIGHT_HEAVY:

topPaddingAmount = floor(totalPaddingHeight / 2)
bottomPaddingAmount = totalPaddingHeight - topPaddingAmount
leftPaddingAmount = floor(totalPaddingWidth / 2)
rightPaddingAmount = totalPaddingWidth - leftPaddingAmount

If the mode is TOP_LEFT_HEAVY:

bottomPaddingAmount = floor(totalPaddingHeight / 2)
topPaddingAmount = totalPaddingHeight - bottomPaddingAmount
rightPaddingAmount = floor(totalPaddingWidth / 2)
leftPaddingAmount = totalPaddingWidth - rightPaddingAmount

With Deconvolution:

H_out = H_in * stride[0]
W_out = W_in * stride[1]
message SamePadding {
    enum SamePaddingMode {
        BOTTOM_RIGHT_HEAVY = 0;
        TOP_LEFT_HEAVY = 1;
    }
    SamePaddingMode asymmetryMode = 1;
}
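
To make the two schemes concrete, here is a small Python sketch that evaluates the formulas above for a single spatial dimension; the helper names are ours, not part of the specification.

import math

def valid_out_size(in_size, kernel, stride, pad_start=0, pad_end=0):
    # "valid" padding: pad (default 0), then fit the kernel entirely inside the input
    return (in_size + pad_start + pad_end - kernel) // stride + 1

def same_out_size(in_size, stride):
    # "same" padding: the output size depends only on the stride
    return math.ceil(in_size / stride)

def same_total_padding(in_size, kernel, stride):
    # total padding needed along one dimension to achieve the "same" output size
    return max(0, (same_out_size(in_size, stride) - 1) * stride + kernel - in_size)

# Example: H_in = 224, 3x3 kernel, stride 2
# valid_out_size(224, 3, 2) == 111, same_out_size(224, 2) == 112, same_total_padding(224, 3, 2) == 1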

WeightParams

Weights for layer parameters. Weights are stored as repeated floating point numbers using row-major ordering and can represent 1-, 2-, 3-, or 4-dimensional data.

message WeightParams {
    repeated float floatValue = 1;

    bytes float16Value = 2;

    bytes rawValue  = 30;

    QuantizationParams quantization = 40;

}

QuantizationParams

Quantization parameters.

message QuantizationParams {
    uint64 numberOfBits = 1;
    oneof QuantizationType {
        LinearQuantizationParams linearQuantization = 101;
        LookUpTableQuantizationParams lookupTableQuantization = 102;
    }
}

LinearQuantizationParams

message LinearQuantizationParams {
    repeated float scale = 1;
    repeated float bias = 2;
}

LookUpTableQuantizationParams

message LookUpTableQuantizationParams {
    // Must contain (2^numberOfBits) elements.
    repeated float floatValue = 1;
}
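
The specification does not spell out the decoding formulas here; the NumPy sketch below shows the usual interpretation of these fields (an affine map for linear quantization, a table lookup for the LUT case) and should be read as an assumption, not a normative definition.

import numpy as np

def dequantize_linear(q, scale, bias):
    # q: integer codes; scale and bias broadcast over q (scalar or per-channel)
    return np.asarray(scale) * q + np.asarray(bias)

def dequantize_lut(q, floatValue):
    # floatValue: lookup table of length 2**numberOfBits; q holds indices into it
    return np.asarray(floatValue)[q]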

ConvolutionLayerParams

A layer that performs spatial convolution or deconvolution.

y = ConvolutionLayer(x)

Requires 1 input and produces 1 output.

Input
A blob with shape [inputChannels,inputHeight,inputWidth] or [C_in, H_in, W_in].
Output
A blob with shape [outputChannels,outputHeight,outputWidth] or [C_out, H_out, W_out].

If dilationFactor is not 1, effective kernel size is modified as follows:

KernelSize[0] <-- (kernelSize[0]-1) * dilationFactor[0] + 1
KernelSize[1] <-- (kernelSize[1]-1) * dilationFactor[1] + 1

Type of padding can be valid or same. Output spatial dimensions depend on the type of padding. For details, refer to the descriptions of the messages “ValidPadding” and “SamePadding”. Padded values are all zeros.

For Deconvolution, ConvolutionPaddingType (valid or same) is ignored when outputShape is set.

message ConvolutionLayerParams {
    uint64 outputChannels = 1;

    uint64 kernelChannels = 2;

    uint64 nGroups = 10;

    repeated uint64 kernelSize = 20;

    repeated uint64 stride = 30;

    repeated uint64 dilationFactor = 40;

    oneof ConvolutionPaddingType {
        ValidPadding valid = 50;
        SamePadding same = 51;
    }

    bool isDeconvolution = 60;

    bool hasBias = 70;

    WeightParams weights = 90;
    WeightParams bias = 91;

    repeated uint64 outputShape = 100;
}
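
For example, the effective kernel size under dilation (per the formula above) can be computed as:

def effective_kernel_size(kernelSize, dilationFactor):
    # KernelSize[i] <- (kernelSize[i] - 1) * dilationFactor[i] + 1
    return [(k - 1) * d + 1 for k, d in zip(kernelSize, dilationFactor)]

# effective_kernel_size([3, 3], [2, 2]) == [5, 5]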

InnerProductLayerParams

A layer that performs a matrix-vector product. This is equivalent to a fully-connected (dense) layer.

y = InnerProductLayer(x)

Requires 1 input and produces 1 output.

Input
A blob with shape [C_in] or [C_in, 1, 1], where C_in is equal to inputChannels.
Output
A blob with shape [C_out], where C_out is equal to outputChannels.
message InnerProductLayerParams {
    uint64 inputChannels = 1;
    uint64 outputChannels = 2;

    bool hasBias = 10;

    WeightParams weights = 20;
    WeightParams bias = 21;
}

EmbeddingLayerParams

A layer that performs a matrix lookup and optionally adds a bias.

y = EmbeddingLayer(x)

Requires 1 input and produces 1 output.

Input
A sequence of integers with shape [1] or [1, 1, 1] (equivalent to [Seq_length, 1, 1, 1]). Input values must be in the range [0, inputDim - 1].
Output
A sequence of 1-dimensional features of size outputChannels (equivalent to [Seq_length, outputChannels, 1, 1]).
message EmbeddingLayerParams {
    uint64 inputDim = 1;
    uint64 outputChannels = 2;

    bool hasBias = 10;

    WeightParams weights = 20;
    WeightParams bias = 21;
}

BatchnormLayerParams

A layer that performs batch normalization, which is performed along the channel axis, and repeated along the other axes, if present.

y = BatchnormLayer(x)

Requires 1 input and produces 1 output.

This operation is described by the following formula:

y_i = \gamma_i \dfrac{ (x_i - \mu_i)}{\sqrt{\sigma_i^2 + \epsilon}} + \beta_i \;,\;i=1,...,C

Input
A blob with shape [C] or [C, H, W].
Output
A blob with the same shape as the input.
message BatchnormLayerParams {
    uint64 channels = 1;

    bool computeMeanVar = 5;
    bool instanceNormalization = 6;

    float epsilon = 10;

    WeightParams gamma = 15;
    WeightParams beta = 16;
    WeightParams mean = 17;
    WeightParams variance = 18;
}
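
A minimal NumPy sketch of the formula above, with per-channel parameters broadcast over H and W (illustrative only; the epsilon default here is an arbitrary example value):

import numpy as np

def batchnorm(x, gamma, beta, mean, variance, epsilon=1e-5):
    # x: [C, H, W]; gamma, beta, mean, variance: [C]
    c = x.shape[0]
    scale = gamma.reshape(c, 1, 1) / np.sqrt(variance.reshape(c, 1, 1) + epsilon)
    return scale * (x - mean.reshape(c, 1, 1)) + beta.reshape(c, 1, 1)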

PoolingLayerParams

A spatial pooling layer.

y = PoolingLayer(x)

Requires 1 input and produces 1 output.

Input
A blob with shape [C, H_in, W_in].
Output
A blob with shape [C, H_out, W_out].

Padding options are similar to ConvolutionLayerParams with the additional option of ValidCompletePadding (includeLastPixel), which ensures that the last application of the kernel always includes the last pixel of the input image, if there is padding.

H_out = int_division_round_up((H_in + 2 * paddingAmounts[0] - kernelSize[0]), stride[0]) + 1
if (paddingAmounts[0] > 0 or paddingAmounts[1] > 0) {
    if ((H_out - 1) * stride[0] >= H_in + paddingAmounts[0]) {
        H_out = H_out - 1
    }
}

The equivalent expressions hold true for W_out as well. Only symmetric padding is supported with this option.

message PoolingLayerParams {
    enum PoolingType{
        MAX = 0;
        AVERAGE = 1;
        L2 = 2;
    }
    PoolingType type = 1;

    repeated uint64 kernelSize = 10;

    repeated uint64 stride = 20;

    message ValidCompletePadding {
        repeated uint64 paddingAmounts = 10;
    }

    oneof PoolingPaddingType {
        ValidPadding valid = 30;
        SamePadding same = 31;
        ValidCompletePadding includeLastPixel = 32;
    }

    bool avgPoolExcludePadding = 50;

    bool globalPooling = 60;
}

PoolingLayerParams.ValidCompletePadding

message ValidCompletePadding {
    repeated uint64 paddingAmounts = 10;
}

PaddingLayerParams

A layer that performs padding along spatial dimensions.

y = PaddingLayer(x)

Requires 1 input and produces 1 output.

Input
A blob with shape [C, H_in, W_in].
Output
A blob with shape [C, H_out, W_out].

Output dimensions are calculated as follows:

H_out = H_in + topPaddingAmount + bottomPaddingAmount
W_out = W_in + leftPaddingAmount + rightPaddingAmount

topPaddingAmount == Height startEdgeSize == borderAmounts[0].startEdgeSize
bottomPaddingAmount == Height endEdgeSize == borderAmounts[0].endEdgeSize
leftPaddingAmount == Width startEdgeSize == borderAmounts[1].startEdgeSize
rightPaddingAmount == Width endEdgeSize == borderAmounts[1].endEdgeSize

There are three types of padding:

  • PaddingConstant, which fills a constant value at the border.
  • PaddingReflection, which reflects the values at the border.
  • PaddingReplication, which replicates the values at the border.

Given the following input:

[1, 3, 4]  :  1   2   3   4
              5   6   7   8
              9   10  11  12

Here is the output of applying the padding (top=2, left=2, bottom=0, right=0) with each of the supported types:

  • PaddingConstant (value = 0):

    [1, 5, 6]  :  0   0   0  0   0   0
                  0   0   0  0   0   0
                  0   0   1  2   3   4
                  0   0   5  6   7   8
                  0   0   9  10  11  12
    
  • PaddingReflection:

    [1, 5, 6]  :  11  10  9  10  11  12
                  7   6   5  6   7   8
                  3   2   1  2   3   4
                  7   6   5  6   7   8
                  11  10  9  10  11  12
    
  • PaddingReplication:

    [1, 5, 6]  :  1   1   1  2   3   4
                  1   1   1  2   3   4
                  1   1   1  2   3   4
                  5   5   5  6   7   8
                  9   9   9  10  11  12
    
message PaddingLayerParams {
    message PaddingConstant {
        float value = 1;
    }

    message PaddingReflection {
    }

    message PaddingReplication {
    }

    oneof PaddingType {
        PaddingConstant constant = 1;
        PaddingReflection reflection = 2;
        PaddingReplication replication = 3;
    }

    BorderAmounts paddingAmounts = 10;
}
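
For intuition, the three padding types behave like NumPy's constant, reflect, and edge modes; the sketch below reproduces the worked example above (top = 2, left = 2).

import numpy as np

x = np.arange(1, 13, dtype=np.float32).reshape(1, 3, 4)  # the [1, 3, 4] input above
pad = ((0, 0), (2, 0), (2, 0))                            # (before, after) amounts per axis [C, H, W]

constant    = np.pad(x, pad, mode="constant", constant_values=0)  # PaddingConstant (value = 0)
reflection  = np.pad(x, pad, mode="reflect")                      # PaddingReflection
replication = np.pad(x, pad, mode="edge")                         # PaddingReplication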

PaddingLayerParams.PaddingConstant

Fill a constant value in the padded region.

message PaddingConstant {
    float value = 1;
}

PaddingLayerParams.PaddingReflection

Reflect the values at the border for padding.

message PaddingReflection {
}

PaddingLayerParams.PaddingReplication

Replicate the values at the border for padding.

message PaddingReplication {
}

ConcatLayerParams

A layer that concatenates along the channel axis (default) or sequence axis.

y = ConcatLayer(x1,x2,....)

Requires more than 1 input and produces 1 output.

The input and output formats are dependent on sequenceConcat.

If sequenceConcat == true:

Input
Sequences of length Seq_i of blobs with shape [C, H, W].
Output
A sequence of length summation(Seq_i) of blobs with shape [C, H, W].

If sequenceConcat == false:

Input
A blob with shape [C_i, H, W], where i = 1, 2, ....
Output
A blob with shape [summation(C_i), H, W].
message ConcatLayerParams {
    bool sequenceConcat = 100;
}

LRNLayerParams

A layer that performs local response normalization (LRN).

y = LRNLayer(x)

Requires 1 input and produces 1 output.

Input
A blob with shape [C, H, W]
Output
A blob with the same shape as the input.

This layer is described by the following formula:

x_i \leftarrow  \dfrac{x_i}{\left ( k + \dfrac{\alpha}{C} \sum_j x_j^2 \right )^\beta}

where the summation is done over a (localSize, 1, 1) neighborhood — that is, over a window “across” channels in 1x1 spatial neighborhoods.

message LRNLayerParams {
    float alpha = 1;
    float beta = 2;
    uint64 localSize = 3;
    float k = 4;
}

SoftmaxLayerParams

Softmax Normalization Layer

A layer that performs softmax normalization. Normalization is done along the channel axis.

y = SoftmaxLayer(x)

Requires 1 input and produces 1 output.

Input
A blob with shape [C] or [C, H, W].
Output
A blob with the same shape as the input.

This layer is described by the following formula:

x_i \leftarrow \dfrac{e^{x_i}}{\sum_j{e^{x_j}}}

message SoftmaxLayerParams {
}

SplitLayerParams

A layer that uniformly splits the input across the channel dimension to produce a specified number of outputs.

(y1,y2,...yN) = SplitLayer(x), where N = nOutputs

Requires 1 input and produces multiple outputs.

Input
A blob with shape [C] or [C, H, W]
Output
nOutputs blobs with shapes [C/nOutputs] or [C/nOutputs, H, W]
message SplitLayerParams {
    uint64 nOutputs = 1;
}

AddLayerParams

A layer that performs elementwise addition.

y = AddLayer(x1,x2,...)

Requires 1 or more inputs and produces 1 output.

Input
One or more blobs with broadcastable shapes [1], [C], [1, H, W], or [C, H, W].
Output
A blob with shape equal to the input blob.

If only one input is provided, scalar addition is performed:

y = x + \alpha

message AddLayerParams {
    float alpha = 1;
}

MultiplyLayerParams

A layer that performs elementwise multiplication.

y = MultiplyLayer(x1,x2,...)

Requires 1 or more inputs and produces 1 output.

Input
One or more blobs with broadcastable shapes [1], [C], [1, H, W], or [C, H, W].
Output
A blob with shape equal to the first input blob.

If only one input is provided, scalar multiplication is performed:

y = \alpha x

message MultiplyLayerParams {
    float alpha = 1;
}

UnaryFunctionLayerParams

A layer that applies a unary function.

y = UnaryFunctionLayer(x)

Requires 1 input and produces 1 output.

Input
A blob with shape [C] or [C, H, W].
Output
A blob with the same shape as the input.

The input is first modified by shifting and scaling:

x \leftarrow \text{scale} \cdot x + \text{shift}

message UnaryFunctionLayerParams {
    enum Operation{
        SQRT = 0;
        RSQRT = 1;
        INVERSE = 2;
        POWER = 3;
        EXP = 4;
        LOG = 5;
        ABS = 6;
        THRESHOLD = 7;
    }
    Operation type = 1;

    float alpha = 2;

    float epsilon = 3;

    float shift = 4;

    float scale = 5;
}

UpsampleLayerParams

A layer that scales up spatial dimensions. It supports two modes: nearest neighbour (default) and bilinear.

y = UpsampleLayer(x)

Requires 1 input and produces 1 output.

Input
A blob with shape [C, H, W].
Output
A blob with shape [C, scalingFactor[0] * H, scalingFactor[1] * W]
message UpsampleLayerParams {
    repeated uint64 scalingFactor = 1;

    enum InterpolationMode {
        NN = 0;
        BILINEAR = 1;
    }

    InterpolationMode mode = 5;

}

BiasLayerParams

A layer that performs elementwise addition of a bias, which is broadcast to match the input shape.

y = BiasLayer(x)

Requires 1 input and produces 1 output.

Input
A blob with shape [C, H, W].
Output
A blob with the same shape as the input.
message BiasLayerParams {
    repeated uint64 shape = 1;

    WeightParams bias = 2;
}

ScaleLayerParams

A layer that performs elementwise multiplication by a scale factor and optionally adds a bias; both the scale and bias are broadcast to match the input shape.

y = ScaleLayer(x)

Requires 1 input and produces 1 output.

Input
A blob with shape [C, H, W].
Output
A blob with the same shape as the input.
message ScaleLayerParams {
    repeated uint64 shapeScale = 1;

    WeightParams scale = 2;

    bool hasBias = 3;

    repeated uint64 shapeBias = 4;

    WeightParams bias = 5;
}

LoadConstantLayerParams

A layer that loads data as a parameter and provides it as an output.

y = LoadConstantLayer()

Takes no input. Produces 1 output.

Input
None
Output
A blob with shape [C, H, W]
message LoadConstantLayerParams {
    repeated uint64 shape = 1;

    WeightParams data = 2;
}

L2NormalizeLayerParams

A layer that performs L2 normalization, i.e. divides by the square root of the sum of squares of all elements of the input.

y = L2NormalizeLayer(x)

Requires 1 input and produces 1 output.

Input
A blob with shape [C] or [C, H, W].
Output
A blob with the same shape as the input.

This layer is described by the following formula:

x_i \leftarrow \dfrac{x_i}{\sqrt{\sum{x_i^2} + \epsilon}}

message L2NormalizeLayerParams {
    float epsilon = 1;
}

FlattenLayerParams

A layer that flattens the input.

y = FlattenLayer(x)

Requires 1 input and produces 1 output.

Input
A blob with shape [C, H, W].
Output
A blob with shape [C * H * W, 1, 1]

There are two flatten orders: CHANNEL_FIRST and CHANNEL_LAST. CHANNEL_FIRST does not require data to be rearranged, because the internal storage already uses row-major ordering. CHANNEL_LAST requires data to be rearranged.

message FlattenLayerParams {
    enum FlattenOrder {
        CHANNEL_FIRST = 0;
        CHANNEL_LAST = 1;
    }
    FlattenOrder mode = 1;
}

ReshapeLayerParams

A layer that recasts the input into a new shape.

y = ReshapeLayer(x)

Requires 1 input and produces 1 output.

Input
A blob with shape [C, H, W] or [Seq, C, H, W].
Output
A blob with shape [C_out, H_out, W_out] or [Seq_out, C_out, H_out, W_out].

There are two reshape orders: CHANNEL_FIRST and CHANNEL_LAST. CHANNEL_FIRST is equivalent to flattening the input to [C * H * W, 1, 1] in channel-first order and then reshaping it to the target shape; no data rearrangement is required. CHANNEL_LAST is equivalent to flattening the input to [H * W * C, 1, 1] in channel-last order, reshaping it to [H_out, W_out, C_out] (it is now in “H_out-major” order), and then permuting it to [C_out, H_out, W_out]; both the flattening and the permutation require the data to be rearranged.

message ReshapeLayerParams {
    repeated int64 targetShape = 1;

    enum ReshapeOrder {
        CHANNEL_FIRST = 0;
        CHANNEL_LAST = 1;
    }
    ReshapeOrder mode = 2;
}

PermuteLayerParams

A layer that rearranges the dimensions and data of an input.

y = PermuteLayer(x)

Requires 1 input and produces 1 output.

Input
A sequence of 3-dimensional blobs. InputShape = [Seq, C, H, W].
Output
A sequence of 3-dimensional blobs, possibly of a different length. Shape: [InputShape[axis[0]], InputShape[axis[1]], InputShape[axis[2]], InputShape[axis[3]]]. Hence the output is a sequence of length InputShape[axis[0]].

Examples:

  • If axis is set to [0, 3, 1, 2], then the output has shape [W, C, H] and the same sequence length as the input.
  • If axis is set to [3, 1, 2, 0], and the input is a sequence of data with length Seq and shape [C, 1, 1], then the output is a unit sequence of data with shape [C, 1, Seq].
  • If axis is set to [0, 3, 2, 1], the output is a reverse of the input: [C, H, W] -> [W, H, C].
  • If axis is not set, or is set to [0, 1, 2, 3], the output is the same as the input.
message PermuteLayerParams {
    repeated uint64 axis = 1;
}
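
The same rearrangement can be mirrored with NumPy's transpose on a [Seq, C, H, W] array; the sketch below corresponds to the first example above.

import numpy as np

x = np.zeros((10, 3, 32, 32))             # [Seq, C, H, W]
y = np.transpose(x, axes=(0, 3, 1, 2))    # axis = [0, 3, 1, 2]
# y.shape == (10, 32, 3, 32): a sequence of length 10 of blobs with shape [W, C, H]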

ReorganizeDataLayerParams

A layer that reorganizes data in the input in specific ways.

y = ReorganizeDataLayer(x)

Requires 1 input and produces 1 output.

Input
A blob with shape [C, H, W].
Output
A blob with shape [C_out, H_out, W_out].
mode == SPACE_TO_DEPTH
[C_out, H_out, W_out] : [C * blockSize * blockSize, H/blockSize, W/blockSize]. blockSize must divide H and W. Data is moved from the spatial dimensions to the channel dimension. Input is spatially divided into non-overlapping blocks of size blockSize X blockSize and data from each block is moved into the channel dimension.
mode == DEPTH_TO_SPACE
[C_out, H_out, W_out] : [C/(blockSize * blockSize), H * blockSize, W * blockSize]. Square of blockSize must divide C. Reverse of SPACE_TO_DEPTH. Data is moved from the channel dimension to the spatial dimensions.
message ReorganizeDataLayerParams {

    enum ReorganizationType {
        SPACE_TO_DEPTH = 0;
        DEPTH_TO_SPACE = 1;
    }
    ReorganizationType mode = 1;
    uint64 blockSize = 2;
}
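
A NumPy sketch of SPACE_TO_DEPTH on a [C, H, W] array follows; the exact ordering of the resulting channels is an assumption made for illustration and may differ from the runtime's ordering.

import numpy as np

def space_to_depth(x, blockSize):
    # x: [C, H, W]; H and W must be divisible by blockSize
    c, h, w = x.shape
    x = x.reshape(c, h // blockSize, blockSize, w // blockSize, blockSize)
    x = x.transpose(0, 2, 4, 1, 3)  # move each block's pixels next to the channel axis
    return x.reshape(c * blockSize * blockSize, h // blockSize, w // blockSize)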

SliceLayerParams

A layer that slices the input data along a given axis.

y = SliceLayer(x)

Requires 1 input and produces 1 output.

Input
A blob with shape [Seq, C, H, W].
Output
A blob with shape [Seq_out, C_out, H_out, W_out].

Sliced section is taken from the interval [startIndex, endIndex), i.e. startIndex is inclusive while endIndex is exclusive. stride must be positive and represents the step size for slicing. startIndex must be non-negative. Negative indexing is supported for endIndex: -1 denotes N, -2 denotes N-1 and so on, where N is the length of the dimension to be sliced.

message SliceLayerParams {

    int64 startIndex = 1;
    int64 endIndex = 2;
    uint64 stride = 3;

    enum SliceAxis {
        CHANNEL_AXIS = 0;
        HEIGHT_AXIS = 1;
        WIDTH_AXIS = 2;
    }
    SliceAxis axis = 4;

}
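
Note that the indexing convention differs slightly from Python's (here endIndex == -1 denotes N rather than N - 1); a small sketch along the width axis:

import numpy as np

def slice_width(x, startIndex, endIndex, stride):
    # x: [C, H, W]; negative endIndex: -1 denotes N, -2 denotes N - 1, ...
    n = x.shape[2]
    end = endIndex if endIndex >= 0 else n + endIndex + 1
    return x[:, :, startIndex:end:stride]

x = np.arange(24).reshape(2, 3, 4)
y = slice_width(x, startIndex=1, endIndex=-1, stride=1)  # keeps width indices 1, 2, 3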

ReduceLayerParams

A layer that reduces the input using a specified operation.

y = ReduceLayer(x)

Requires 1 input and produces 1 output.

Input
A blob with shape [C, H, W].
Output
A blob whose shape depends on the value of axis, the dimension(s) along which reduction is performed:
  • axis == CHW : [1, 1, 1] (default)
  • axis == HW : [C, 1, 1]
  • axis == C : [1, H, W]
  • axis == H : [C, 1, W]
  • axis == W : [C, H, 1]
message ReduceLayerParams {
    enum ReduceOperation {
        SUM = 0;
        AVG = 1;
        PROD = 2;
        LOGSUM = 3;
        SUMSQUARE = 4;
        L1 = 5;
        L2 = 6;
        MAX = 7;
        MIN = 8;
        ARGMAX = 9;
    }
    ReduceOperation mode = 1;

    float epsilon = 2;

    enum ReduceAxis {
        CHW = 0;
        HW = 1;
        C = 2;
        H = 3;
        W = 4;
    }

    ReduceAxis axis = 3;

}
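
The output shapes listed above correspond to NumPy reductions with keepdims=True; for example, with mode == SUM:

import numpy as np

x = np.random.rand(3, 4, 5)                      # [C, H, W]

sum_chw = x.sum(axis=(0, 1, 2), keepdims=True)   # axis == CHW -> shape (1, 1, 1)
sum_hw  = x.sum(axis=(1, 2), keepdims=True)      # axis == HW  -> shape (3, 1, 1)
sum_c   = x.sum(axis=0, keepdims=True)           # axis == C   -> shape (1, 4, 5)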

CropLayerParams

A layer that crops the spatial dimensions of an input. If two inputs are provided, the shape of the second input is used as the reference shape.

y = CropLayer(x1) or y = CropLayer(x1,x2)

Requires 1 or 2 inputs and produces 1 output.

Input
  • 1 input case: A blob with shape [C, H_in, W_in].
  • 2 input case: 1st blob with shape [C, H_in, W_in], 2nd blob with shape [C, H_out, W_out].
Output
A blob with shape [C, H_out, W_out].

If one input is used, output is computed as follows:

y = x1[:, topCropAmount:H_in - bottomCropAmount, leftCropAmount:W_in - rightCropAmount]

topCropAmount == Height startEdgeSize == borderAmounts[0].startEdgeSize
bottomCropAmount == Height endEdgeSize == borderAmounts[0].endEdgeSize
leftCropAmount == Width startEdgeSize == borderAmounts[1].startEdgeSize
rightCropAmount == Width endEdgeSize == borderAmounts[1].endEdgeSize

H_out = H_in - topCropAmount - bottomCropAmount
W_out = W_in - leftCropAmount - rightCropAmount

If two inputs are used, output is computed as follows:

y = x1[:, offset[0]:offset[0] + H_out, offset[1]:offset[1] + W_out]
message CropLayerParams {
    BorderAmounts cropAmounts = 1;

    repeated uint64 offset = 5;
}

AverageLayerParams

A layer that computes the elementwise average of the inputs.

y = AverageLayer(x1,x2,...)

Requires multiple inputs and produces 1 output.

Input
Multiple blobs with broadcastable shapes [1], [C], [1, H, W], or [C, H, W].
Output
A blob with the same shape as each input.
message AverageLayerParams {
}

MaxLayerParams

A layer that computes the elementwise maximum over the inputs.

y = MaxLayer(x1,x2,...)

Requires multiple inputs and produces 1 output.

Input
Multiple blobs, each with shape [C] or [C, H, W].
Output
A blob with the same shape as each input.
message MaxLayerParams {
}

MinLayerParams

A layer that computes the elementwise minimum over the inputs.

y = MinLayer(x1,x2,...)

Requires multiple inputs and produces 1 output.

Input
Multiple blobs, each with shape [C] or [C, H, W].
Output
A blob with the same shape as each input.
message MinLayerParams {
}

DotProductLayerParams

A layer that computes the dot product of two vectors.

y = DotProductLayer(x1,x2)

Requires 2 inputs and produces 1 output.

Input
Two blobs with shape [C].
Output
A scalar.
message DotProductLayerParams {
    bool cosineSimilarity = 1;
}

MeanVarianceNormalizeLayerParams

A layer that performs mean variance normalization.

y = MeanVarianceNormalizeLayer(x)

Requires 1 input and produces 1 output.

Input
A blob with shape [C] or [C, H, W].
Output
A blob with the same shape as the input.

If acrossChannels == true, normalization is performed over the entire flattened input, i.e. across [C, H, W].

If acrossChannels == false, normalization is performed within each channel, across the spatial dimensions.

message MeanVarianceNormalizeLayerParams {
    bool acrossChannels = 1;

    bool normalizeVariance = 2;

    float epsilon = 3;
}

SequenceRepeatLayerParams

A layer that repeats a sequence.

y = SequenceRepeatLayer(x)

Requires 1 input and produces 1 output.

Input
A sequence of blobs, i.e. shape is either [Seq, C] or [Seq, C, H, W].
Output
A sequence of length nRepetitions * Seq with shape corresponding to the input, i.e. shape is either [nRepetitions * Seq, C] or [nRepetitions * Seq, C, H, W].
message SequenceRepeatLayerParams {
    uint64 nRepetitions = 1;
}

SimpleRecurrentLayerParams

A simple recurrent layer.

y_t = SimpleRecurrentLayer(x_t, y_{t-1})
Input
A sequence of vectors of size inputVectorSize with shape [Seq, inputVectorSize].
Output
A vector of size outputVectorSize. It is either the final output or a sequence of outputs at all time steps.
  • Output Shape: [1,outputVectorSize] , if sequenceOutput == false
  • Output Shape: [Seq,outputVectorSize] , if sequenceOutput == true

This layer is described by the following equation:

\boldsymbol{y_t} = f(\mathrm{clip}(W \boldsymbol{x_t} + R \boldsymbol{y_{t-1}} + b))

  • W is a 2-dimensional weight matrix ([outputVectorSize, inputVectorSize], row-major)
  • R is a 2-dimensional recursion matrix ([outputVectorSize, outputVectorSize], row-major)
  • b is a 1-dimensional bias vector ([outputVectorSize])
  • f() is an activation
  • clip() is a function that constrains values to the range [-50.0, 50.0]
message SimpleRecurrentLayerParams {
    uint64 inputVectorSize = 1;
    uint64 outputVectorSize = 2;

    ActivationParams activation = 10;

    // If false, the output is just the result after the final state update.
    // If true, the output is a sequence, containing outputs at all time steps.
    bool sequenceOutput = 15;

    bool hasBiasVector = 20;

    WeightParams weightMatrix = 30;
    WeightParams recursionMatrix = 31;
    WeightParams biasVector = 32;

    bool reverseInput = 100;
    // If true, then the node processes the input sequence from right to left
}
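
One step of the recurrence above, written as a NumPy sketch; tanh is used for f() purely as an example, since the actual activation comes from the activation field.

import numpy as np

def simple_rnn_step(x_t, y_prev, W, R, b, f=np.tanh):
    # W: [outputVectorSize, inputVectorSize], R: [outputVectorSize, outputVectorSize], b: [outputVectorSize]
    return f(np.clip(W @ x_t + R @ y_prev + b, -50.0, 50.0))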

GRULayerParams

A gated recurrent unit (GRU) layer.

y_t = GRULayer(x_t, y_{t-1})
Input
A sequence of vectors of size inputVectorSize with shape [Seq, inputVectorSize].
Output
A vector of size outputVectorSize. It is either the final output or a sequence of outputs at all time steps.
  • Output Shape: [1,outputVectorSize] , if sequenceOutput == false
  • Output Shape: [Seq,outputVectorSize] , if sequenceOutput == true

This layer is described by the following equations:

Update Gate

\boldsymbol{z_t} = f(\mathrm{clip}(W_z \boldsymbol{x_t} + R_z \boldsymbol{y_{t-1}} + b_z))

Reset Gate

\boldsymbol{r_t} = f(\mathrm{clip}(W_r \boldsymbol{x_t} + R_r \boldsymbol{y_{t-1}} + b_r))

Cell Memory State

\boldsymbol{c_t} = \boldsymbol{y_{t-1}} \odot \boldsymbol{r_t}

Output Gate

\boldsymbol{o_t} = g(\mathrm{clip}(W_o \boldsymbol{x_t} + R_o \boldsymbol{c_t} + b_o))

Output

\boldsymbol{y_t} = (1 - \boldsymbol{z_t}) \odot \boldsymbol{o_t} + \boldsymbol{z_t} \odot \boldsymbol{y_{t-1}}

  • W_z, W_r, W_o are 2-dimensional input weight matrices ([outputVectorSize, inputVectorSize], row-major)
  • R_z, R_r, R_o are 2-dimensional recursion matrices ([outputVectorSize, outputVectorSize], row-major)
  • b_z, b_r, b_o are 1-dimensional bias vectors ([outputVectorSize])
  • f(), g() are activations
  • clip() is a function that constrains values to the range [-50.0, 50.0]
  • ⊙ denotes the elementwise product of matrices
message GRULayerParams {
    uint64 inputVectorSize = 1;
    uint64 outputVectorSize = 2;

    repeated ActivationParams activations = 10;

    bool sequenceOutput = 15;

    bool hasBiasVectors = 20;

    WeightParams updateGateWeightMatrix = 30;
    WeightParams resetGateWeightMatrix = 31;
    WeightParams outputGateWeightMatrix = 32;

    WeightParams updateGateRecursionMatrix = 50;
    WeightParams resetGateRecursionMatrix = 51;
    WeightParams outputGateRecursionMatrix = 52;

    WeightParams updateGateBiasVector = 70;
    WeightParams resetGateBiasVector = 71;
    WeightParams outputGateBiasVector = 72;

    bool reverseInput = 100;
}
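
One step of the equations above as a NumPy sketch; sigmoid for f() and tanh for g() are illustrative choices only, since the actual activations come from the activations field.

import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def gru_step(x_t, y_prev, Wz, Wr, Wo, Rz, Rr, Ro, bz, br, bo, f=sigmoid, g=np.tanh):
    clip = lambda v: np.clip(v, -50.0, 50.0)
    z_t = f(clip(Wz @ x_t + Rz @ y_prev + bz))  # update gate
    r_t = f(clip(Wr @ x_t + Rr @ y_prev + br))  # reset gate
    c_t = y_prev * r_t                          # cell memory state
    o_t = g(clip(Wo @ x_t + Ro @ c_t + bo))     # output gate
    return (1.0 - z_t) * o_t + z_t * y_prev     # output y_t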

LSTMParams

Long short-term memory (LSTM) parameters.

This is described by the following equations:

Input Gate

\boldsymbol{i_t} = f(\mathrm{clip}(W_i \boldsymbol{x_t} + R_i \boldsymbol{y_{t-1}} + p_i \odot c_{t-1} + b_i))

Forget Gate

\boldsymbol{f_t} = f(\mathrm{clip}(W_f \boldsymbol{x_t} + R_f \boldsymbol{y_{t-1}} + p_f \odot c_{t-1} + b_f))

Block Input

\boldsymbol{z_t} = g(\mathrm{clip}(W_z \boldsymbol{x_t} + R_z \boldsymbol{y_{t-1}} + b_z))

Cell Memory State

\boldsymbol{c_t} = \boldsymbol{c_{t-1}} \odot \boldsymbol{f_t} + \boldsymbol{i_t} \odot \boldsymbol{z_t}

Output Gate

\boldsymbol{o_t} = f(\mathrm{clip}(W_o \boldsymbol{x_t} + R_o \boldsymbol{y_{t-1}} + p_o \odot c_t + b_o))

Output

\boldsymbol{y_t} = h(\boldsymbol{c_t}) \odot \boldsymbol{o_t}

  • W_i, W_f, W_z, W_o are 2-dimensional input weight matrices ([outputVectorSize, inputVectorSize], row-major)
  • R_i, R_f, R_z, R_o are 2-dimensional recursion matrices ([outputVectorSize, outputVectorSize], row-major)
  • b_i, b_f, b_z, b_o are 1-dimensional bias vectors ([outputVectorSize])
  • p_i, p_f, p_o are 1-dimensional peephole vectors ([outputVectorSize])
  • f(), g(), h() are activations
  • clip() is a function that constrains values to the range [-50.0, 50.0]
  • ⊙ denotes the elementwise product of matrices
message LSTMParams {
    bool sequenceOutput = 10;

    bool hasBiasVectors = 20;

    bool forgetBias = 30;

    bool hasPeepholeVectors = 40;

    bool coupledInputAndForgetGate = 50;

    float cellClipThreshold = 60;
}
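
One step of the LSTM equations above as a NumPy sketch; the gate weights, recursion matrices, biases, and peephole vectors are passed as dictionaries keyed by gate name, and sigmoid/tanh are illustrative activation choices only.

import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def lstm_step(x_t, y_prev, c_prev, W, R, b, p, f=sigmoid, g=np.tanh, h=np.tanh):
    # W, R, b, p: dicts keyed by 'i', 'f', 'z', 'o'; p holds the peephole vectors
    clip = lambda v: np.clip(v, -50.0, 50.0)
    i_t = f(clip(W['i'] @ x_t + R['i'] @ y_prev + p['i'] * c_prev + b['i']))  # input gate
    f_t = f(clip(W['f'] @ x_t + R['f'] @ y_prev + p['f'] * c_prev + b['f']))  # forget gate
    z_t = g(clip(W['z'] @ x_t + R['z'] @ y_prev + b['z']))                    # block input
    c_t = c_prev * f_t + i_t * z_t                                            # cell memory state
    o_t = f(clip(W['o'] @ x_t + R['o'] @ y_prev + p['o'] * c_t + b['o']))     # output gate
    return h(c_t) * o_t, c_t                                                  # output y_t and new cell state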

LSTMWeightParams

Weights for long short-term memory (LSTM) layers

message LSTMWeightParams {
    WeightParams inputGateWeightMatrix = 1;
    WeightParams forgetGateWeightMatrix = 2;
    WeightParams blockInputWeightMatrix = 3;
    WeightParams outputGateWeightMatrix = 4;

    WeightParams inputGateRecursionMatrix = 20;
    WeightParams forgetGateRecursionMatrix = 21;
    WeightParams blockInputRecursionMatrix = 22;
    WeightParams outputGateRecursionMatrix = 23;

    //biases:
    WeightParams inputGateBiasVector = 40;
    WeightParams forgetGateBiasVector = 41;
    WeightParams blockInputBiasVector = 42;
    WeightParams outputGateBiasVector = 43;

    //peepholes:
    WeightParams inputGatePeepholeVector = 60;
    WeightParams forgetGatePeepholeVector = 61;
    WeightParams outputGatePeepholeVector = 62;
}

UniDirectionalLSTMLayerParams

A unidirectional long short-term memory (LSTM) layer.

(y_t, c_t) = UniDirectionalLSTMLayer(x_t, y_{t-1}, c_{t-1})
Input
A sequence of vectors of size inputVectorSize with shape [Seq, inputVectorSize].
Output
A vector of size outputVectorSize. It is either the final output or a sequence of outputs at all time steps.
  • Output Shape: [1,outputVectorSize] , if sequenceOutput == false
  • Output Shape: [Seq,outputVectorSize] , if sequenceOutput == true
message UniDirectionalLSTMLayerParams {
    uint64 inputVectorSize = 1;
    uint64 outputVectorSize = 2;

    repeated ActivationParams activations = 10;

    LSTMParams params = 15;

    LSTMWeightParams weightParams = 20;

    bool reverseInput = 100;
}

BiDirectionalLSTMLayerParams

Bidirectional long short-term memory (LSTM) layer

(y_t, c_t, y_t_reverse, c_t_reverse) = BiDirectionalLSTMLayer(x_t, y_{t-1}, c_{t-1}, y_{t-1}_reverse, c_{t-1}_reverse)
Input
A sequence of vectors of size inputVectorSize with shape [Seq, inputVectorSize].
Output
A vector of size 2 * outputVectorSize. It is either the final output or a sequence of outputs at all time steps.
  • Output Shape: [1, 2 * outputVectorSize] , if sequenceOutput == false
  • Output Shape: [Seq, 2 * outputVectorSize] , if sequenceOutput == true

The first LSTM operates on the input sequence in the forward direction. The second LSTM operates on the input sequence in the reverse direction.

Example: given the input sequence [x_1, x_2, x_3], where x_i are vectors at time index i:

The forward LSTM output is [yf_1, yf_2, yf_3],

where yf_i are vectors of size outputVectorSize:

  • yf_1 is the output at the end of sequence {x_1}
  • yf_2 is the output at the end of sequence {x_1, x_2}
  • yf_3 is the output at the end of sequence {x_1, x_2, x_3}

The backward LSTM output: [yb_1, yb_2, yb_3],

where yb_i are vectors of size outputVectorSize:

  • yb_1 is the output at the end of sequence {x_3}
  • yb_2 is the output at the end of sequence {x_3, x_2}
  • yb_3 is the output at the end of sequence {x_3, x_2, x_1}

Output of the bidirectional layer:

  • if sequenceOutput = True : { [yf_1, yb_3], [yf_2, yb_2], [yf_3, yb_1] }
  • if sequenceOutput = False : { [yf_3, yb_3] }
message BiDirectionalLSTMLayerParams {
    uint64 inputVectorSize = 1;
    uint64 outputVectorSize = 2;

    repeated ActivationParams activationsForwardLSTM = 10;
    repeated ActivationParams activationsBackwardLSTM = 11;

    LSTMParams params = 15;

    repeated LSTMWeightParams weightParams = 20;
}

CustomLayerParams

message CustomLayerParams {

    message CustomLayerParamValue {
        oneof value {
            double doubleValue = 10;
            string stringValue = 20;
            int32 intValue = 30;
            int64 longValue = 40;
            bool boolValue = 50;
        }
    }

    string className = 10; // The name of the class (conforming to MLCustomLayer) corresponding to this layer
    repeated WeightParams weights = 20; // Any weights -- these are serialized in binary format and memmapped at runtime
    map<string, CustomLayerParamValue> parameters = 30; // these may be handled as strings, so this should not be large
    string description = 40; // An (optional) description of the layer provided by the model creator. This information is displayed when viewing the model, but does not affect the model's execution on device.

}

CustomLayerParams.CustomLayerParamValue

message CustomLayerParamValue {
    oneof value {
        double doubleValue = 10;
        string stringValue = 20;
        int32 intValue = 30;
        int64 longValue = 40;
        bool boolValue = 50;
    }
}

CustomLayerParams.ParametersEntry

message ParametersEntry {
    string key = 1;
    CustomLayerParamValue value = 2;
}

NeuralNetworkClassifier

A neural network specialized as a classifier.

message NeuralNetworkClassifier {
    repeated NeuralNetworkLayer layers = 1;
    repeated NeuralNetworkPreprocessing preprocessing = 2;

    oneof ClassLabels {
        StringVector stringClassLabels = 100;
        Int64Vector int64ClassLabels = 101;
    }

    string labelProbabilityLayerName = 200;
}

NeuralNetworkRegressor

A neural network specialized as a regressor.

message NeuralNetworkRegressor {
    repeated NeuralNetworkLayer layers = 1;
    repeated NeuralNetworkPreprocessing preprocessing = 2;

}

FlattenLayerParams.FlattenOrder

enum FlattenOrder {
    CHANNEL_FIRST = 0;
    CHANNEL_LAST = 1;
}

PoolingLayerParams.PoolingType

enum PoolingType{
    MAX = 0;
    AVERAGE = 1;
    L2 = 2;
}

ReduceLayerParams.ReduceAxis

enum ReduceAxis {
    CHW = 0;
    HW = 1;
    C = 2;
    H = 3;
    W = 4;
}

ReduceLayerParams.ReduceOperation

enum ReduceOperation {
    SUM = 0;
    AVG = 1;
    PROD = 2;
    LOGSUM = 3;
    SUMSQUARE = 4;
    L1 = 5;
    L2 = 6;
    MAX = 7;
    MIN = 8;
    ARGMAX = 9;
}

ReorganizeDataLayerParams.ReorganizationType

enum ReorganizationType {
    SPACE_TO_DEPTH = 0;
    DEPTH_TO_SPACE = 1;
}

ReshapeLayerParams.ReshapeOrder

enum ReshapeOrder {
    CHANNEL_FIRST = 0;
    CHANNEL_LAST = 1;
}

SamePadding.SamePaddingMode

enum SamePaddingMode {
    BOTTOM_RIGHT_HEAVY = 0;
    TOP_LEFT_HEAVY = 1;
}

SliceLayerParams.SliceAxis

enum SliceAxis {
    CHANNEL_AXIS = 0;
    HEIGHT_AXIS = 1;
    WIDTH_AXIS = 2;
}

UnaryFunctionLayerParams.Operation

A unary operator.

The following functions are supported:

SQRT

f(x) = \sqrt{x}

RSQRT

f(x) = \dfrac{1}{\sqrt{x + \epsilon}}

INVERSE

f(x) = \dfrac{1}{x + \epsilon}

POWER

f(x) = x^\alpha

EXP

f(x) = e^x

LOG

f(x) = \log x

ABS

f(x) = |x|

THRESHOLD

f(x) = \text{max}(\alpha, x)

enum Operation{
    SQRT = 0;
    RSQRT = 1;
    INVERSE = 2;
    POWER = 3;
    EXP = 4;
    LOG = 5;
    ABS = 6;
    THRESHOLD = 7;
}

UpsampleLayerParams.InterpolationMode

enum InterpolationMode {
    NN = 0;
    BILINEAR = 1;
}