Neural Networks¶
A neural network is defined through a collection of layers and represents a directed acyclic graph (DAG). Each layer has a name, a layer type, a list of input names, a list of output names, and a collection of parameters specific to the layer type.
The graph structure and connectivity of the neural network
is inferred from the input and output names.
A neural network starts with the layer
whose input name is equal to the value specified in
Model.description.input.name
,
and ends with the layer
whose output name is equal to the value specified in
Model.description.output.name
.
Layers must have unique input and output names,
and a layer may not have input or output names that
refer to layers that are not yet defined.
CoreML supports sequential data that can be 1- or 3-dimensional.
3-dimensional data typically represents an image feature map,
whose shape is denoted by [C, H, W]
,
which corresponds to the channel, height, and width, respectively.
1-dimensional data is a set of features
whose shape is denoted by [C]
,
and is equivalent to 3-dimensional data
with the shape [C, 1, 1]
.
For the purposes of this specification,
batch dimension is ignored.
Thus, a sequence of 3-dimensional data
is to be understood as a 4-dimensional array,
whose shape is denoted by [Seq_length, C, H, W]
,
and a sequence of 1-dimensional data
is to be understood as a 2-dimensional array,
whose shape is denoted by [Seq_length, C]
,
which is equivalent to a 4-dimensional array
with the shape [Seq_length, C, 1, 1]
. This axes order is important to
remember while setting parameters for layers such as “reshape” and “permute”.
At runtime, all data blobs are internally represented
as 5-dimensional blobs
with the shape [Seq_length, Batch, C, H, W]
.
A layer may process input data differently if operating over a sequence; details of this behavior is documented in the layer’s message. Otherwise, sequential data is processed like a batch — that is, the sequence of inputs are processed independently and in parallel.
The network input shape specified by Model.description.input.type
must be compatible with the expected input shape
of the network input layer, i.e. the last dimension is the fastest moving one.
All data blobs, as well as weight parameters, are stored using row-major ordering, i.e. the last dimension is the fastest moving one.
NeuralNetwork¶
A neural network.
message NeuralNetwork {
repeated NeuralNetworkLayer layers = 1;
repeated NeuralNetworkPreprocessing preprocessing = 2;
}
NeuralNetworkImageScaler¶
A neural network preprocessor that performs a scalar multiplication of an image followed by addition of scalar biases to the channels.
- Input: X
- An image in BGR or RGB format with shape
[3, H, W]
or in grayscale format with shape[1, H, W]
. - Output: Y
- An image with format and shape corresponding to the input.
If the input image is in BGR format:
Y[0, :, :] = channelScale * X[0, :, :] + blueBias
Y[1, :, :] = channelScale * X[1, :, :] + greenBias
Y[2, :, :] = channelScale * X[2, :, :] + redBias
If the input image is in RGB format:
Y[0, :, :] = channelScale * X[0, :, :] + redBias
Y[1, :, :] = channelScale * X[1, :, :] + greenBias
Y[2, :, :] = channelScale * X[2, :, :] + blueBias
If the input image is in grayscale format:
Y[0, :, :] = channelScale * X[0, :, :] + grayBias
message NeuralNetworkImageScaler {
float channelScale = 10;
float blueBias = 20;
float greenBias = 21;
float redBias = 22;
float grayBias = 30;
}
NeuralNetworkMeanImage¶
A neural network preprocessor that
subtracts the provided mean image from the input image.
The mean image is subtracted from the input named
NeuralNetworkPreprocessing.featureName
.
message NeuralNetworkMeanImage {
repeated float meanImage = 1;
}
NeuralNetworkPreprocessing¶
Preprocessing parameters for image inputs.
message NeuralNetworkPreprocessing {
string featureName = 1;
oneof preprocessor {
NeuralNetworkImageScaler scaler = 10;
NeuralNetworkMeanImage meanImage = 11;
}
}
ActivationReLU¶
A rectified linear unit (ReLU) activation function.
This function has the following formula:
message ActivationReLU {
}
ActivationLeakyReLU¶
A leaky rectified linear unit (ReLU) activation function.
This function has the following formula:
message ActivationLeakyReLU {
float alpha = 1; //negative slope value for leakyReLU
}
ActivationTanh¶
A hyperbolic tangent activation function.
This function has the following formula:
message ActivationTanh {
}
ActivationScaledTanh¶
A scaled hyperbolic tangent activation function.
This function has the following formula:
message ActivationScaledTanh {
float alpha = 1;
float beta = 2;
}
ActivationSigmoid¶
A sigmoid activation function.
This function has the following formula:
message ActivationSigmoid {
}
ActivationLinear¶
A linear activation function.
This function has the following formula:
message ActivationLinear {
float alpha = 1;
float beta = 2;
}
ActivationSigmoidHard¶
A hard sigmoid activation function.
This function has the following formula:
message ActivationSigmoidHard {
float alpha = 1;
float beta = 2;
}
ActivationPReLU¶
A parameterized rectified linear unit (PReLU) activation function,
which takes [C]
or [C,H,W]
as an input and
applies different parameters in each channel dimension
(shared across the H
and W
components).
This function has the following formula:
message ActivationPReLU {
// parameter of length C or 1.
// If length is 1, same value is used for all channels
WeightParams alpha = 1;
}
ActivationELU¶
An exponential linear unit (ELU) activation function.
This function has the following formula:
message ActivationELU {
float alpha = 1;
}
ActivationThresholdedReLU¶
A thresholded rectified linear unit (ReLU) activation function.
This function has the following formula:
message ActivationThresholdedReLU {
float alpha = 1;
}
ActivationSoftsign¶
A softsign activation function.
This function has the following formula:
message ActivationSoftsign {
}
ActivationSoftplus¶
A softplus activation function.
This function has the following formula:
message ActivationSoftplus {
}
ActivationParametricSoftplus¶
A parametric softplus activation function,
which takes [C]
or [C,H,W]
as an input and
applies different parameters in each channel dimension
(shared across the H
and W
components).
This function has the following formula:
message ActivationParametricSoftplus {
// If length is 1, same value is used for all channels
WeightParams alpha = 1; //parameter of length C or 1
WeightParams beta = 2; //parameter of length C or 1
}
ActivationParams¶
message ActivationParams {
oneof NonlinearityType {
ActivationLinear linear = 5;
ActivationReLU ReLU = 10;
ActivationLeakyReLU leakyReLU = 15;
ActivationThresholdedReLU thresholdedReLU = 20;
ActivationPReLU PReLU = 25;
ActivationTanh tanh = 30;
ActivationScaledTanh scaledTanh = 31;
ActivationSigmoid sigmoid = 40;
ActivationSigmoidHard sigmoidHard = 41;
ActivationELU ELU = 50;
ActivationSoftsign softsign = 60;
ActivationSoftplus softplus = 70;
ActivationParametricSoftplus parametricSoftplus = 71;
}
}
NeuralNetworkLayer¶
A single neural network layer.
message NeuralNetworkLayer {
string name = 1; //descriptive name of the layer
repeated string input = 2;
repeated string output = 3;
oneof layer {
// start at 100 here
ConvolutionLayerParams convolution = 100;
PoolingLayerParams pooling = 120;
ActivationParams activation = 130;
InnerProductLayerParams innerProduct = 140;
EmbeddingLayerParams embedding = 150;
//normalization related layers
BatchnormLayerParams batchnorm = 160;
MeanVarianceNormalizeLayerParams mvn = 165;
L2NormalizeLayerParams l2normalize = 170;
SoftmaxLayerParams softmax = 175;
LRNLayerParams lrn = 180;
CropLayerParams crop = 190;
PaddingLayerParams padding = 200;
UpsampleLayerParams upsample = 210;
ResizeBilinearLayerParams resizeBilinear = 211;
CropResizeLayerParams cropResize = 212;
UnaryFunctionLayerParams unary = 220;
//elementwise operations
AddLayerParams add = 230;
MultiplyLayerParams multiply = 231;
AverageLayerParams average = 240;
ScaleLayerParams scale = 245;
BiasLayerParams bias = 250;
MaxLayerParams max = 260;
MinLayerParams min = 261;
DotProductLayerParams dot = 270;
ReduceLayerParams reduce = 280;
LoadConstantLayerParams loadConstant = 290;
//data reorganization
ReshapeLayerParams reshape = 300;
FlattenLayerParams flatten = 301;
PermuteLayerParams permute = 310;
ConcatLayerParams concat = 320;
SplitLayerParams split = 330;
SequenceRepeatLayerParams sequenceRepeat = 340;
ReorganizeDataLayerParams reorganizeData = 345;
SliceLayerParams slice = 350;
//Recurrent Layers
SimpleRecurrentLayerParams simpleRecurrent = 400;
GRULayerParams gru = 410;
UniDirectionalLSTMLayerParams uniDirectionalLSTM = 420;
BiDirectionalLSTMLayerParams biDirectionalLSTM = 430;
// Custom (user-implemented) Layer
CustomLayerParams custom = 500;
}
}
BorderAmounts¶
Specifies the amount of spatial border to be either padded or cropped.
For padding:
H_out = borderAmounts[0].startEdgeSize + H_in + borderAmounts[0].endEdgeSize
W_out = borderAmounts[1].startEdgeSize + W_in + borderAmounts[1].endEdgeSize
topPaddingAmount == Height startEdgeSize
bottomPaddingAmount == Height endEdgeSize
leftPaddingAmount == Width startEdgeSize
rightPaddingAmount == Width endEdgeSize
For cropping:
H_out = (-borderAmounts[0].startEdgeSize) + H_in + (-borderAmounts[0].endEdgeSize)
W_out = (-borderAmounts[1].startEdgeSize) + W_in + (-borderAmounts[1].endEdgeSize)
topCropAmount == Height startEdgeSize
bottomCropAmount == Height endEdgeSize
leftCropAmount == Width startEdgeSize
rightCropAmount == Width endEdgeSize
message BorderAmounts {
message EdgeSizes {
uint64 startEdgeSize = 1;
uint64 endEdgeSize = 2;
}
repeated EdgeSizes borderAmounts = 10;
}
BorderAmounts.EdgeSizes¶
message EdgeSizes {
uint64 startEdgeSize = 1;
uint64 endEdgeSize = 2;
}
ValidPadding¶
Specifies the type of padding to be used with Convolution/Deconvolution and Pooling layers.
After padding, input spatial shape: [H_in, W_in]
, gets modified to the
output spatial shape [H_out, W_out]
.
topPaddingAmount == Height startEdgeSize == borderAmounts[0].startEdgeSize
bottomPaddingAmount == Height endEdgeSize == borderAmounts[0].endEdgeSize
leftPaddingAmount == Width startEdgeSize == borderAmounts[1].startEdgeSize
rightPaddingAmount == Width endEdgeSize == borderAmounts[1].endEdgeSize
With Convolution or Pooling:
H_out = int_division_round_down((H_in + topPaddingAmount + bottomPaddingAmount - KernelSize[0]),stride[0]) + 1
which is same as:
H_out = int_division_round_up((H_in + topPaddingAmount + bottomPaddingAmount - KernelSize[0] + 1),stride[0])
With Deconvolution:
H_out = (H_in-1) * stride[0] + kernelSize[0] - (topPaddingAmount + bottomPaddingAmount)
The equivalent expressions hold true for W_out
as well.
By default, the values of paddingAmounts
are set to 0
,
which results in a “true” valid padding.
If non-zero values are provided for paddingAmounts
,
“valid” convolution/pooling is performed within the spatially expanded input.
message ValidPadding {
BorderAmounts paddingAmounts = 1;
}
SamePadding¶
Specifies the type of padding to be used with Convolution/Deconvolution and pooling layers.
After padding, input spatial shape: [H_in, W_in]
, gets modified to the
output spatial shape [H_out, W_out]
.
With Convolution or pooling:
H_out = int_division_round_up(H_in,stride[0])
W_out = int_division_round_up(W_in,stride[1])
This is achieved by using the following padding amounts:
totalPaddingHeight = max(0,(H_out-1) * stride[0] + KernelSize[0] - Hin)
totalPaddingWidth = max(0,(W_out-1) * stride[1] + KernelSize[1] - Win)
There are two modes of asymmetry:
BOTTOM_RIGHT_HEAVY
, and TOP_LEFT_HEAVY
.
If the mode is BOTTOM_RIGHT_HEAVY
:
topPaddingAmount = floor(totalPaddingHeight / 2)
bottomPaddingAmount = totalPaddingHeight - topPaddingAmount
leftPaddingAmount = floor(totalPaddingWidth / 2)
rightPaddingAmount = totalPaddingWidth - leftPaddingAmount
If the mode is TOP_LEFT_HEAVY
:
bottomPaddingAmount = floor(totalPaddingHeight / 2)
topPaddingAmount = totalPaddingHeight - bottomPaddingAmount
rightPaddingAmount = floor(totalPaddingWidth / 2)
leftPaddingAmount = totalPaddingWidth - rightPaddingAmount
With Deconvolution:
H_out = H_in * stride[0]
W_out = W_in * stride[1]
message SamePadding {
enum SamePaddingMode {
BOTTOM_RIGHT_HEAVY = 0;
TOP_LEFT_HEAVY = 1;
}
SamePaddingMode asymmetryMode = 1;
}
SamplingMode¶
Specifies how grid points are sampled from an interval.
Without the loss of generality, assume the interval to be [0, X-1] from which N points are to be sampled.
Here X may correspond to an input image’s height or width.
All the methods can be expressed in terms of numpy’s linspace function, along with the constraint that grid points have to lie in the interval [0, X-1].
Note: numpy.linspace(start = start, end = end, num = N, endpoint = True) corresponds to sampling
N points uniformly from the interval [start, end], endpoints included.
The methods vary in how the start
and end
values are computed.
message SamplingMode {
enum Method {
STRICT_ALIGN_ENDPOINTS_MODE = 0;
ALIGN_ENDPOINTS_MODE = 1;
UPSAMPLE_MODE = 2;
ROI_ALIGN_MODE = 3;
}
Method samplingMethod = 1;
}
BoxCoordinatesMode¶
Specifies the convention used to specify four bounding box coordinates for an image of size (Height, Width). The (0,0) coordinate corresponds to the top-left corner of the image.
message BoxCoordinatesMode {
enum Coordinates {
CORNERS_HEIGHT_FIRST = 0;
CORNERS_WIDTH_FIRST = 1;
CENTER_SIZE_HEIGHT_FIRST = 2;
CENTER_SIZE_WIDTH_FIRST = 3;
}
Coordinates boxMode = 1;
}
WeightParams¶
Weights for layer parameters. Weights are stored as repeated floating point numbers using row-major ordering and can represent 1-, 2-, 3-, or 4-dimensional data.
message WeightParams {
repeated float floatValue = 1;
bytes float16Value = 2;
bytes rawValue = 30;
QuantizationParams quantization = 40;
}
QuantizationParams¶
Quantization parameters.
message QuantizationParams {
uint64 numberOfBits = 1;
oneof QuantizationType {
LinearQuantizationParams linearQuantization = 101;
LookUpTableQuantizationParams lookupTableQuantization = 102;
}
}
LinearQuantizationParams¶
message LinearQuantizationParams {
repeated float scale = 1;
repeated float bias = 2;
}
LookUpTableQuantizationParams¶
message LookUpTableQuantizationParams {
repeated float floatValue = 1;
}
ConvolutionLayerParams¶
A layer that performs spatial convolution or deconvolution.
y = ConvolutionLayer(x)
Requires 1 input and produces 1 output.
- Input
- A blob with shape
[inputChannels,inputHeight,inputWidth]
or[C_in, H_in, W_in]
. - Output
- A blob with shape
[outputChannels,outputHeight,outputWidth]
or[C_out, H_out, W_out]
.
If dilationFactor
is not 1, effective kernel size is
modified as follows:
KernelSize[0] <-- (kernelSize[0]-1) * dilationFactor[0] + 1
KernelSize[1] <-- (kernelSize[1]-1) * dilationFactor[1] + 1
Type of padding can be valid
or same
. Output spatial dimensions depend on the
the type of padding. For details, refer to the descriptions of the messages “ValidPadding”
and “SamePadding”. Padded values are all zeros.
For Deconvolution, ConvolutionPaddingType
(valid
or same
) is ignored when outputShape
is set.
message ConvolutionLayerParams {
uint64 outputChannels = 1;
uint64 kernelChannels = 2;
uint64 nGroups = 10;
repeated uint64 kernelSize = 20;
repeated uint64 stride = 30;
repeated uint64 dilationFactor = 40;
oneof ConvolutionPaddingType {
ValidPadding valid = 50;
SamePadding same = 51;
}
bool isDeconvolution = 60;
bool hasBias = 70;
WeightParams weights = 90;
WeightParams bias = 91;
repeated uint64 outputShape = 100;
}
InnerProductLayerParams¶
A layer that performs a matrix vector product. This is equivalent to a fully-connected, or dense layer.
y = InnerProductLayer(x)
Requires 1 input and produces 1 output.
- Input
- A blob with shape
[C_in]
or[C_in, 1, 1]
, whereC_in
is equal toinputChannels
. - Output
- A blob with shape
[C_out]
, whereC_out
is equal tooutputChannels
.
message InnerProductLayerParams {
uint64 inputChannels = 1;
uint64 outputChannels = 2;
bool hasBias = 10;
WeightParams weights = 20;
WeightParams bias = 21;
}
EmbeddingLayerParams¶
A layer that performs a matrix lookup and optionally adds a bias.
y = EmbeddingLayer(x)
Requires 1 input and produces 1 output.
- Input
- A sequence of integers with shape
[1]
or[1, 1, 1]
, (equivalent to[Seq_length, 1, 1, 1]
). Input values must be in the range[0, inputDim - 1]
. - Output
- A sequence of 1-dimensional features of size
outputChannels
(equivalent to[Seq_length, outputChannels, 1, 1]
).
message EmbeddingLayerParams {
uint64 inputDim = 1;
uint64 outputChannels = 2;
bool hasBias = 10;
WeightParams weights = 20;
WeightParams bias = 21;
}
BatchnormLayerParams¶
A layer that performs batch normalization, which is performed along the channel axis, and repeated along the other axes, if present.
y = BatchnormLayer(x)
Requires 1 input and produces 1 output.
This operation is described by the following formula:
- Input
- A blob with shape
[C]
or[C, H, W]
. - Output
- A blob with the same shape as the input.
message BatchnormLayerParams {
uint64 channels = 1;
bool computeMeanVar = 5;
bool instanceNormalization = 6;
float epsilon = 10;
WeightParams gamma=15;
WeightParams beta=16;
WeightParams mean=17;
WeightParams variance=18;
}
PoolingLayerParams¶
A spatial pooling layer.
y = PoolingLayer(x)
Requires 1 input and produces 1 output.
- Input
- A blob with shape
[C, H_in, W_in]
. - Output
- A blob with shape
[C, H_out, W_out]
.
Padding options are similar to ConvolutionLayerParams
with the additional option of ValidCompletePadding
(includeLastPixel
),
which ensures that the last application of the kernel
always includes the last pixel of the input image, if there is padding.
H_out = int_division_round_up((H_in + 2 * paddingAmounts[0] - kernelSize[0]),Stride[0]) + 1)
if (paddingAmounts[0] > 0 or paddingAmounts[1] > 0)
if ((H_out - 1) * Stride >= H_in + paddingAmounts[0]) {
H_out = H_out - 1
}
}
The equivalent expressions hold true for W_out
as well.
Only symmetric padding is supported with this option.
message PoolingLayerParams {
enum PoolingType{
MAX = 0;
AVERAGE = 1;
L2 = 2;
}
PoolingType type = 1;
repeated uint64 kernelSize = 10;
repeated uint64 stride = 20;
message ValidCompletePadding {
repeated uint64 paddingAmounts = 10;
}
oneof PoolingPaddingType {
ValidPadding valid = 30;
SamePadding same = 31;
ValidCompletePadding includeLastPixel = 32;
}
bool avgPoolExcludePadding = 50;
bool globalPooling = 60;
}
PoolingLayerParams.ValidCompletePadding¶
message ValidCompletePadding {
repeated uint64 paddingAmounts = 10;
}
PaddingLayerParams¶
A layer that performs padding along spatial dimensions.
y = PaddingLayer(x)
Requires 1 input and produces 1 output.
- Input
- A blob with shape
[C, H_in, W_in]
. - Output
- A blob with shape
[C, H_out, W_out]
.
Output dimensions are calculated as follows:
H_out = H_in + topPaddingAmount + bottomPaddingAmount
W_out = W_in + leftPaddingAmount + rightPaddingAmount
topPaddingAmount == Height startEdgeSize == borderAmounts[0].startEdgeSize
bottomPaddingAmount == Height endEdgeSize == borderAmounts[0].endEdgeSize
leftPaddingAmount == Width startEdgeSize == borderAmounts[1].startEdgeSize
rightPaddingAmount == Width endEdgeSize == borderAmounts[1].endEdgeSize
There are three types of padding:
PaddingConstant
, which fills a constant value at the border.PaddingReflection
, which reflects the values at the border.PaddingReplication
, which replicates the values at the border.
Given the following input:
[1, 3, 4] : 1 2 3 4
5 6 7 8
9 10 11 12
Here is the output of applying the padding
(top=2, left=2, bottom=0, right=0)
with each of the supported types:
PaddingConstant
(value = 0
): .. code:[1, 5, 6] : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 2 3 4 0 0 5 6 7 8 0 0 9 10 11 12
PaddingReflection
: .. code:[1, 5, 6] : 11 10 9 10 11 12 7 6 5 6 7 8 3 2 1 2 3 4 7 6 5 6 7 8 11 10 9 10 11 12
PaddingReplication
: .. code:[1, 5, 6] : 1 1 1 2 3 4 1 1 1 2 3 4 1 1 1 2 3 4 5 5 5 6 7 8 9 9 9 10 11 12
message PaddingLayerParams {
message PaddingConstant {
float value = 1;
}
message PaddingReflection {
}
message PaddingReplication {
}
oneof PaddingType {
PaddingConstant constant = 1;
PaddingReflection reflection = 2;
PaddingReplication replication = 3;
}
BorderAmounts paddingAmounts = 10;
}
PaddingLayerParams.PaddingConstant¶
Fill a constant value in the padded region.
message PaddingConstant {
float value = 1;
}
PaddingLayerParams.PaddingReflection¶
Reflect the values at the border for padding.
message PaddingReflection {
}
PaddingLayerParams.PaddingReplication¶
Replicate the values at the border for padding.
message PaddingReplication {
}
ConcatLayerParams¶
A layer that concatenates along the channel axis (default) or sequence axis.
y = ConcatLayer(x1,x2,....)
Requires more than 1 input and produces 1 output.
The input and output formats are dependent on sequenceConcat
.
If sequenceConcat == true
:
- Input
- Sequences of length
Seq_i
of blobs with shape[C, H, W]
. - Output
- A Sequence of length
summation(Seq_i)
of blobs with shape[C, H, W]
.
If sequenceConcat == false
:
- Input
- A blob with shape
[C_i, H, W]
, wherei = 1, 2, ...
. - Output
- A blob with shape
[summation(C_i), H, W]
.
message ConcatLayerParams {
bool sequenceConcat = 100;
}
LRNLayerParams¶
A layer that performs local response normalization (LRN).
y = LRNLayer(x)
Requires 1 input and produces 1 output.
- Input
- A blob with shape
[C, H, W]
- Output
- A blob with the same shape as the input.
This layer is described by the following formula:
where the summation is done over a (localSize, 1, 1)
neighborhood —
that is, over a window “across” channels in 1x1 spatial neighborhoods.
message LRNLayerParams {
float alpha = 1;
float beta = 2;
uint64 localSize = 3;
float k = 4;
}
SoftmaxLayerParams¶
Softmax Normalization Layer
A layer that performs softmax normalization. Normalization is done along the channel axis.
y = SoftmaxLayer(x)
Requires 1 input and produces 1 output.
- Input
- A blob with shape
[C]
or[C, H, W]
. - Output
- A blob with the same shape as the input.
This layer is described by the following formula:
message SoftmaxLayerParams {
}
SplitLayerParams¶
A layer that uniformly splits across the channel dimension to produce a specified number of outputs.
(y1,y2,...yN) = SplitLayer(x), where N = nOutputs
Requires 1 input and produces multiple outputs.
- Input
- A blob with shape
[C]
or[C, H, W]
- Output
nOutputs
blobs with shapes[C/nOutputs]
or[C/nOutputs, H, W]
message SplitLayerParams {
uint64 nOutputs=1;
}
AddLayerParams¶
A layer that performs elementwise addition.
y = AddLayer(x1,x2,...)
Requires 1 or more than 1 input and produces 1 output.
- Input
- One or more blobs with broadcastable shapes
[1]
,[C]
,[1, H, W]
, or[C, H, W]
. - Output
- A blob with shape equal to the input blob.
If only one input is provided, scalar addition is performed:
message AddLayerParams {
float alpha = 1;
}
MultiplyLayerParams¶
A layer that performs elementwise multiplication.
y = MultiplyLayer(x1,x2,...)
Requires 1 or more than 1 input and produces 1 output.
- Input
- One or more blobs with broadcastable shapes
[1]
,[C]
,[1, H, W]
, or[C, H, W]
. - Output
- A blob with shape equal to the first input blob.
If only one input is provided, scalar multiplication is performed:
message MultiplyLayerParams {
float alpha = 1;
}
UnaryFunctionLayerParams¶
A layer that applies a unary function.
y = UnaryFunctionLayer(x)
Requires 1 input and produces 1 output.
- Input
- A blob with shape
[C]
or[C, H, W]
. - Output
- A blob with the same shape as the input.
The input is first modified by shifting and scaling:
message UnaryFunctionLayerParams {
enum Operation{
SQRT = 0;
RSQRT = 1;
INVERSE = 2;
POWER = 3;
EXP = 4;
LOG = 5;
ABS = 6;
THRESHOLD = 7;
}
Operation type = 1;
float alpha = 2;
float epsilon = 3;
float shift = 4;
float scale = 5;
}
UpsampleLayerParams¶
A layer that scales up spatial dimensions. It supports two modes: nearest neighbour (default) and bilinear.
y = UpsampleLayer(x)
Requires 1 input and produces 1 output.
- Input
- A blob with shape
[C, H, W]
. - Output
- A blob with shape
[C, scalingFactor[0] * H, scalingFactor[1] * W]
message UpsampleLayerParams {
repeated uint64 scalingFactor = 1;
enum InterpolationMode {
NN = 0;
BILINEAR = 1;
}
InterpolationMode mode = 5;
}
ResizeBilinearLayerParams¶
A layer that resizes the input to a pre-specified spatial size using bilinear interpolation.
y = ResizeBilinearLayer(x)
Requires 1 input and produces 1 output.
- Input
- A blob with shape
[C, H_in, W_in]
. - Output
- A blob with shape
[C, H_out, W_out]
.
message ResizeBilinearLayerParams {
repeated uint64 targetSize = 1;
SamplingMode mode = 2;
}
CropResizeLayerParams¶
A layer that extracts cropped spatial patches or RoIs (regions of interest) from the input and resizes them to a pre-specified size using bilinear interpolation. Note that RoI Align layer can be implemented with this layer followed by a pooling layer.
y = CropResizeLayer(x)
Requires 2 inputs and produces 1 output.
- Input
- A blob with shape
[C, H_in, W_in]
. - This represents an image feature map.
Note that the full batched input shape is
[Seq = 1, Batch = Batch, C, H_in, W_in]
- A second input with shape
[N, 4, 1, 1]
or[N, 5, 1, 1]
. - This represents the bounding box coordinates for N patches/RoIs.
Note that the full batched shape is
[Seq = N, Batch = 1, C = 4 or 5, 1, 1]
That is, sequence dimension = N, channel dimension = 4 or 5.
N: number of patches/RoIs to be extracted If RoI shape =
[N, 4, 1, 1]
The channel axis corresponds to the four coordinates specifying the bounding box. All the N RoIs are extracted from all the batches of the input.- If RoI shape =
[N, 5, 1, 1]
- The first element of the channel axis specifies the input batch id from which to extract the RoI and must be in the interval
[0, Batch - 1]
. That is, n-th RoI is extracted from the RoI[n,0,0,0]-th input batch id. The last four elements of the channel axis specify the bounding box coordinates.
- A blob with shape
- Output
- A blob with shape
[N, C, H_out, W_out]
. - This represents the output image feature map for each input RoI.
Note that the full batched output shape is:
-
[Seq = N, Batch = Batch, C, H_out, W_out]
if input RoI shape is[N, 4, 1, 1]
-[Seq = N, Batch = 1, C, H_out, W_out]
if input RoI shape is[N, 5, 1, 1]
- A blob with shape
message CropResizeLayerParams {
repeated uint64 targetSize = 1;
bool normalizedCoordinates = 2;
SamplingMode mode = 3;
BoxCoordinatesMode boxIndicesMode = 4;
float spatialScale = 5;
}
BiasLayerParams¶
A layer that performs elementwise addition of a bias, which is broadcasted to match the input shape.
y = BiasLayer(x)
Requires 1 input and produces 1 output.
- Input
- A blob with shape
[C, H, W]
. - Output
- A blob with the same shape as the input.
message BiasLayerParams {
repeated uint64 shape = 1;
WeightParams bias = 2;
}
ScaleLayerParams¶
A layer that performs elmentwise multiplication by a scale factor and optionally adds a bias; both the scale and bias are broadcasted to match the input shape.
y = ScaleLayer(x)
Requires 1 input and produces 1 output.
- Input
- A blob with shape
[C, H, W]
. - Output
- A blob with the same shape as the input.
message ScaleLayerParams {
repeated uint64 shapeScale = 1;
WeightParams scale = 2;
bool hasBias = 3;
repeated uint64 shapeBias = 4;
WeightParams bias = 5;
}
LoadConstantLayerParams¶
A layer that loads data as a parameter and provides it as an output.
y = LoadConstantLayer()
Takes no input. Produces 1 output.
- Input
- None
- Output:
- A blob with shape
[C, H, W]
message LoadConstantLayerParams {
repeated uint64 shape = 1;
WeightParams data = 2;
}
L2NormalizeLayerParams¶
A layer that performs L2 normalization, i.e. divides by the the square root of the sum of squares of all elements of input.
y = L2NormalizeLayer(x)
Requires 1 input and produces 1 output.
- Input
- A blob with shape
[C]
or[C, H, W]
. - Output
- A blob with the same shape as the input.
This layer is described by the following formula:
message L2NormalizeLayerParams {
float epsilon = 1;
}
FlattenLayerParams¶
A layer that flattens the input.
y = FlattenLayer(x)
Requires 1 input and produces 1 output.
- Input
- A blob with shape
[C, H, W]
. - Output
- A blob with shape
[C * H * W, 1, 1]
There are two flatten orders: CHANNEL_FIRST
and CHANNEL_LAST
.
CHANNEL_FIRST
does not require data to be rearranged,
because row major ordering is used by internal storage.
CHANNEL_LAST
requires data to be rearranged.
message FlattenLayerParams {
enum FlattenOrder {
CHANNEL_FIRST = 0;
CHANNEL_LAST = 1;
}
FlattenOrder mode = 1;
}
ReshapeLayerParams¶
A layer that recasts the input into a new shape.
y = ReshapeLayer(x)
Requires 1 input and produces 1 output.
- Input
- A blob with shape
[C, H, W]
or[Seq, C, H, W]
. - Output
- A blob with shape
[C_out, H_out, W_out]
or[Seq_out, C_out, H_out, W_out]
.
There are two reshape orders: CHANNEL_FIRST
and CHANNEL_LAST
.
CHANNEL_FIRST
is equivalent to
flattening the input to [C * H * W, 1, 1]
in channel first order
and then reshaping it to the target shape;
no data rearrangement is required.
CHANNEL_LAST
is equivalent to
flattening the input to [H * W * C, 1, 1]
in channel last order,
reshaping it to [H_out, W_out, C_out]
(it is now in “H_out-major”” order),
and then permuting it to [C_out, H_out, W_out]
;
both the flattening and permuting requires the data to be rearranged.
message ReshapeLayerParams {
repeated int64 targetShape = 1;
enum ReshapeOrder {
CHANNEL_FIRST = 0;
CHANNEL_LAST = 1;
}
ReshapeOrder mode = 2;
}
PermuteLayerParams¶
A layer that rearranges the dimensions and data of an input.
y = PermuteLayer(x)
Requires 1 input and produces 1 output.
- Input
- A sequence of 3-dimensional blobs.
InputShape = [Seq, C, H, W]
. - Output
- A sequence of a different length of 3-dimensional blobs.
Shape:
[InputShape[axis[0]], InputShape[axis[1]], InputShape[axis[2]], InputShape[axis[3]]]
. Hence output is a sequence of lengthInputShape[axis[0]]
.
Examples:
- If
axis
is set to[0, 3, 1, 2]
, then the output has shape[W,C,H]
and has the same sequence length that of the input. - If
axis
is set to[3, 1, 2, 0]
, and the input is a sequence of data with lengthSeq
and shape[C, 1, 1]
, then the output is a unit sequence of data with shape[C, 1, Seq]
. - If
axis
is set to[0, 3, 2, 1]
, the output is a reverse of the input:[C, H, W] -> [W, H, C]
. - If
axis
is not set, or is set to[0, 1, 2, 3]
, the output is the same as the input.
message PermuteLayerParams {
repeated uint64 axis = 1;
}
ReorganizeDataLayerParams¶
A layer that reorganizes data in the input in specific ways.
y = ReorganizeDataLayer(x)
Requires 1 input and produces 1 output.
- Input
- A blob with shape
[C, H, W]
. - Output
- A blob with shape
[C_out, H_out, W_out]
. - mode == SPACE_TO_DEPTH
[C_out, H_out, W_out]
:[C * blockSize * blockSize, H/blockSize, W/blockSize]
. blockSize must divide H and W. Data is moved from the spatial dimensions to the channel dimension. Input is spatially divided into non-overlapping blocks of size blockSize X blockSize and data from each block is moved into the channel dimension.- mode == DEPTH_TO_SPACE
[C_out, H_out, W_out]
:[C/(blockSize * blockSize), H * blockSize, W * blockSize]
. Square of blockSize must divide C. Reverse of SPACE_TO_DEPTH. Data is moved from the channel dimension to the spatial dimensions.
message ReorganizeDataLayerParams {
enum ReorganizationType {
SPACE_TO_DEPTH = 0;
DEPTH_TO_SPACE = 1;
}
ReorganizationType mode = 1;
uint64 blockSize = 2;
}
SliceLayerParams¶
A layer that slices the input data along a given axis.
y = SliceLayer(x)
Requires 1 input and produces 1 output.
- Input
- A blob with shape
[Seq, C, H, W]
. - Output
- A blob with shape
[Seq_out, C_out, H_out, W_out]
.
Sliced section is taken from the interval [startIndex, endIndex)
, i.e.
startIndex is inclusive while endIndex is exclusive.
stride must be positive and represents the step size for slicing.
Negative indexing is supported for startIndex and endIndex.
-1 denotes N-1, -2 denotes N-2 and so on, where N is the length of the dimension to be sliced.
message SliceLayerParams {
int64 startIndex = 1;
int64 endIndex = 2;
uint64 stride = 3;
enum SliceAxis {
CHANNEL_AXIS = 0;
HEIGHT_AXIS = 1;
WIDTH_AXIS = 2;
}
SliceAxis axis = 4;
}
ReduceLayerParams¶
A layer that reduces the input using a specified operation.
y = ReduceLayer(x)
Requires 1 input and produces 1 output.
- Input
- A blob with shape
[C, H, W]
. - Output
- A blob whose shape depends on the value of axis, the dimension(s) along which reduction is performed.
if axis == C :
[1, H, W]
if axis == H :[C, 1, W]
if axis == W :[C, H, 1]
if axis == HW :[C, 1, 1]
if axis == CHW :[1, 1, 1]
[Default]
message ReduceLayerParams {
enum ReduceOperation {
SUM = 0;
AVG = 1;
PROD = 2;
LOGSUM = 3;
SUMSQUARE = 4;
L1 = 5;
L2 = 6;
MAX = 7;
MIN = 8;
ARGMAX = 9;
}
ReduceOperation mode = 1;
float epsilon = 2;
enum ReduceAxis {
CHW = 0;
HW = 1;
C = 2;
H = 3;
W = 4;
}
ReduceAxis axis = 3;
}
CropLayerParams¶
A layer that crops the spatial dimensions of an input. If two inputs are provided, the shape of the second input is used as the reference shape.
y = CropLayer(x1) or y = CropLayer(x1,x2)
Requires 1 or 2 inputs and produces 1 output.
- Input
- 1 input case: A blob with shape
[C, H_in, W_in]
. - 2 input case: 1st blob with shape
[C, H_in, W_in]
, 2nd blob with shape[C, H_out, W_out]
.
- 1 input case: A blob with shape
- Output
- A blob with shape
[C, H_out, W_out]
.
If one input is used, output is computed as follows:
y = x1[:, topCropAmount:H_in - bottomCropAmount, leftCropAmount:W_in - rightCropAmount]
topCropAmount == Height startEdgeSize == borderAmounts[0].startEdgeSize
bottomCropAmount == Height endEdgeSize == borderAmounts[0].endEdgeSize
leftCropAmount == Width startEdgeSize == borderAmounts[1].startEdgeSize
rightCropAmount == Width endEdgeSize == borderAmounts[1].endEdgeSize
H_out = H_in - topCropAmount - bottomCropAmount
W_out = W_in - leftCropAmount - rightCropAmount
If two inputs are used, output is computed as follows:
y = x1[:, offset[0]:offset[0] + H_out, offset[1]:offset[1] + W_out]
message CropLayerParams {
BorderAmounts cropAmounts = 1;
repeated uint64 offset = 5;
}
AverageLayerParams¶
A layer that computes the elementwise average of the inputs.
y = AverageLayer(x1,x2,...)
Requires multiple inputs and produces 1 output.
- Input
- Multiple blobs with broadcastable shapes
[1]
,[C]
,[1, H, W]
, or[C, H, W]
. - Output
- A blob with the same shape as each input.
message AverageLayerParams {
}
MaxLayerParams¶
A layer that computes the elementwise maximum over the inputs.
y = MaxLayer(x1,x2,...)
Requires multiple inputs and produces 1 output.
- Input
- Multiple blobs, each with shape
[C]
or[C, H, W]
. - Output
- A blob with the same shape as each input.
message MaxLayerParams {
}
MinLayerParams¶
A layer that computes the elementwise minimum over the inputs.
y = MinLayer(x1,x2,...)
Requires multiple inputs and produces 1 output.
- Input
- Multiple blobs, each with shape
[C]
or[C, H, W]
. - Output
- A blob with the same shape as each input.
message MinLayerParams {
}
DotProductLayerParams¶
A layer that computes the dot product of two vectors.
y = DotProductLayer(x1,x2)
Requires 2 inputs and produces 1 output.
- Input
- Two blobs with shape
[C]
. - Output
- A scalar.
message DotProductLayerParams {
bool cosineSimilarity = 1;
}
MeanVarianceNormalizeLayerParams¶
A layer that performs mean variance normalization.
y = MeanVarianceNormalizeLayer(x)
Requires 1 input and produces 1 output.
- Input
- A blob with shape
[C]
or[C, H, W]
. - Output
- A blob with the same shape as the input.
If acrossChannels == true
normalization is performed on flattened input.
If acrossChannels == false
normalization is performed within a channel,
across spatial dimensions.
message MeanVarianceNormalizeLayerParams {
bool acrossChannels = 1;
bool normalizeVariance = 2;
float epsilon = 3;
}
SequenceRepeatLayerParams¶
A layer that repeats a sequence.
y = SequenceRepeatLayer(x)
Requires 1 input and produces 1 output.
- Input
- A sequence of blobs, i.e. shape is either
[Seq, C]
or[Seq, C, H, W]
. - Output
- A sequence of length
nRepetitions * Seq
with shape corresponding to the input, i.e. shape is either[nRepetitions * Seq, C]
or[nRepetitions * Seq, C, H, W]
.
message SequenceRepeatLayerParams {
uint64 nRepetitions = 1;
}
SimpleRecurrentLayerParams¶
A simple recurrent layer.
y_t = SimpleRecurrentLayer(x_t, y_{t-1})
- Input
- A sequence of vectors of size
inputVectorSize
with shape[Seq, inputVectorSize]
. - Output
- A vector of size
outputVectorSize
. It is either the final output or a sequence of outputs at all time steps.
- Output Shape:
[1,outputVectorSize]
, ifsequenceOutput == false
- Output Shape:
[Seq,outputVectorSize]
, ifsequenceOutput == true
This layer is described by the following equation:
W
is a 2-dimensional weight matrix ([outputVectorSize, inputVectorSize]
, row-major)R
is a 2-dimensional recursion matrix ([outputVectorSize, outputVectorSize]
, row-major)b
is a 1-dimensional bias vector ([outputVectorSize]
)f()
is an activationclip()
is a function that constrains values between[-50.0, 50.0]
message SimpleRecurrentLayerParams {
uint64 inputVectorSize = 1;
uint64 outputVectorSize = 2;
ActivationParams activation = 10;
If false output is just the result after final state update.
If true, output is a sequence, containing outputs at all time steps.
bool sequenceOutput = 15;
bool hasBiasVector = 20;
WeightParams weightMatrix = 30;
WeightParams recursionMatrix = 31;
WeightParams biasVector = 32;
bool reverseInput = 100;
// If true, then the node processes the input sequence from right to left
}
GRULayerParams¶
Gated-Recurrent Unit (GRU) Layer
y_t = GRULayer(x_t, y_{t-1})
- Input
- A sequence of vectors of size
inputVectorSize
with shape[Seq, inputVectorSize]
. - Output
- A vector of size
outputVectorSize
. It is either the final output or a sequence of outputs at all time steps.
- Output Shape:
[1,outputVectorSize]
, ifsequenceOutput == false
- Output Shape:
[Seq,outputVectorSize]
, ifsequenceOutput == true
This layer is described by the following equations:
- Update Gate
- Reset Gate
- Cell Memory State
- Output Gate
- Output
W_z
,W_r
,W_o
are 2-dimensional input weight matrices ([outputVectorSize, inputVectorSize]
, row-major)R_z
,R_r
,R_o
are 2-dimensional recursion matrices ([outputVectorSize, outputVectorSize]
, row-major)b_z
,b_r
,b_o
are 1-dimensional bias vectors ([outputVectorSize]
)f()
,g()
are activationsclip()
is a function that constrains values between[-50.0, 50.0]
⊙
denotes the elementwise product of matrices
message GRULayerParams {
uint64 inputVectorSize = 1;
uint64 outputVectorSize = 2;
repeated ActivationParams activations = 10;
bool sequenceOutput = 15;
bool hasBiasVectors = 20;
WeightParams updateGateWeightMatrix = 30;
WeightParams resetGateWeightMatrix = 31;
WeightParams outputGateWeightMatrix = 32;
WeightParams updateGateRecursionMatrix = 50;
WeightParams resetGateRecursionMatrix = 51;
WeightParams outputGateRecursionMatrix = 52;
WeightParams updateGateBiasVector = 70;
WeightParams resetGateBiasVector = 71;
WeightParams outputGateBiasVector = 72;
bool reverseInput = 100;
}
LSTMParams¶
Long short-term memory (LSTM) parameters.
This is described by the following equations:
- Input Gate
- Forget Gate
- Block Input
- Cell Memory State
- Output Gate
- Output
W_i
,W_f
,W_z
,W_o
are 2-dimensional input weight matrices ([outputVectorSize, inputVectorSize]
, row-major)R_i
,R_f
,R_z
,R_o
are 2-dimensional recursion matrices ([outputVectorSize, outputVectorSize]
, row-major)b_i
,b_f
,b_z
,b_o
are 1-dimensional bias vectors ([outputVectorSize]
)p_
,p_f
,p_o
are 1-dimensional peephole vectors ([outputVectorSize]
)f()
,g()
,h()
are activationsclip()
is a function that constrains values between[-50.0, 50.0]
⊙
denotes the elementwise product of matrices
message LSTMParams {
bool sequenceOutput = 10;
bool hasBiasVectors = 20;
bool forgetBias = 30;
bool hasPeepholeVectors = 40;
bool coupledInputAndForgetGate = 50;
float cellClipThreshold = 60;
}
LSTMWeightParams¶
Weights for long short-term memory (LSTM) layers
message LSTMWeightParams {
WeightParams inputGateWeightMatrix = 1;
WeightParams forgetGateWeightMatrix = 2;
WeightParams blockInputWeightMatrix = 3;
WeightParams outputGateWeightMatrix = 4;
WeightParams inputGateRecursionMatrix = 20;
WeightParams forgetGateRecursionMatrix = 21;
WeightParams blockInputRecursionMatrix = 22;
WeightParams outputGateRecursionMatrix = 23;
//biases:
WeightParams inputGateBiasVector = 40;
WeightParams forgetGateBiasVector = 41;
WeightParams blockInputBiasVector = 42;
WeightParams outputGateBiasVector = 43;
//peepholes:
WeightParams inputGatePeepholeVector = 60;
WeightParams forgetGatePeepholeVector = 61;
WeightParams outputGatePeepholeVector = 62;
}
UniDirectionalLSTMLayerParams¶
A unidirectional long short-term memory (LSTM) layer.
(y_t, c_t) = UniDirectionalLSTMLayer(x_t, y_{t-1}, c_{t-1})
- Input
- A sequence of vectors of size
inputVectorSize
with shape[Seq, inputVectorSize]
. - Output
- A vector of size
outputVectorSize
. It is either the final output or a sequence of outputs at all time steps.
- Output Shape:
[1,outputVectorSize]
, ifsequenceOutput == false
- Output Shape:
[Seq,outputVectorSize]
, ifsequenceOutput == true
message UniDirectionalLSTMLayerParams {
uint64 inputVectorSize = 1;
uint64 outputVectorSize = 2;
repeated ActivationParams activations = 10;
LSTMParams params = 15;
LSTMWeightParams weightParams = 20;
bool reverseInput = 100;
}
BiDirectionalLSTMLayerParams¶
Bidirectional long short-term memory (LSTM) layer
(y_t, c_t, y_t_reverse, c_t_reverse) = BiDirectionalLSTMLayer(x_t, y_{t-1}, c_{t-1}, y_{t-1}_reverse, c_{t-1}_reverse)
- Input
- A sequence of vectors of size
inputVectorSize
with shape[Seq, inputVectorSize]
. - Output
- A vector of size
2 * outputVectorSize
. It is either the final output or a sequence of outputs at all time steps.
- Output Shape:
[1, 2 * outputVectorSize]
, ifsequenceOutput == false
- Output Shape:
[Seq, 2 * outputVectorSize]
, ifsequenceOutput == true
The first LSTM operates on the input sequence in the forward direction. The second LSTM operates on the input sequence in the reverse direction.
Example: given the input sequence [x_1, x_2, x_3]
,
where x_i
are vectors at time index i
:
The forward LSTM output is [yf_1, yf_2, yf_3]
,
where yf_i
are vectors of size outputVectorSize
:
yf_1
is the output at the end of sequence {x_1
}yf_2
is the output at the end of sequence {x_1
,x_2
}yf_3
is the output at the end of sequence {x_1
,x_2
,x_3
}
The backward LSTM output: [yb_1, yb_2, yb_3]
,
where yb_i
are vectors of size outputVectorSize
:
yb_1
is the output at the end of sequence {x_3
}yb_2
is the output at the end of sequence {x_3
,x_2
}yb_3
is the output at the end of sequence {x_3
,x_2
,x_1
}
Output of the bi-dir layer:
- if
sequenceOutput = True
: {[yf_1, yb_3]
,[yf_2, yb_2]
,[yf_3, yb_1]
} - if
sequenceOutput = False
: {[yf_3, yb_3]
}
message BiDirectionalLSTMLayerParams {
uint64 inputVectorSize = 1;
uint64 outputVectorSize = 2;
repeated ActivationParams activationsForwardLSTM = 10;
repeated ActivationParams activationsBackwardLSTM = 11;
LSTMParams params = 15;
repeated LSTMWeightParams weightParams = 20;
}
CustomLayerParams¶
message CustomLayerParams {
message CustomLayerParamValue {
oneof value {
double doubleValue = 10;
string stringValue = 20;
int32 intValue = 30;
int64 longValue = 40;
bool boolValue = 50;
}
}
string className = 10; // The name of the class (conforming to MLCustomLayer) corresponding to this layer
repeated WeightParams weights = 20; // Any weights -- these are serialized in binary format and memmapped at runtime
map<string, CustomLayerParamValue> parameters = 30; // these may be handled as strings, so this should not be large
string description = 40; // An (optional) description of the layer provided by the model creator. This information is displayed when viewing the model, but does not affect the model's execution on device.
}
CustomLayerParams.CustomLayerParamValue¶
message CustomLayerParamValue {
oneof value {
double doubleValue = 10;
string stringValue = 20;
int32 intValue = 30;
int64 longValue = 40;
bool boolValue = 50;
}
}
CustomLayerParams.ParametersEntry¶
message CustomLayerParamValue {
oneof value {
double doubleValue = 10;
string stringValue = 20;
int32 intValue = 30;
int64 longValue = 40;
bool boolValue = 50;
}
}
NeuralNetworkClassifier¶
A neural network specialized as a classifier.
message NeuralNetworkClassifier {
repeated NeuralNetworkLayer layers = 1;
repeated NeuralNetworkPreprocessing preprocessing = 2;
oneof ClassLabels {
StringVector stringClassLabels = 100;
Int64Vector int64ClassLabels = 101;
}
string labelProbabilityLayerName = 200;
}
NeuralNetworkRegressor¶
A neural network specialized as a regressor.
message NeuralNetworkRegressor {
repeated NeuralNetworkLayer layers = 1;
repeated NeuralNetworkPreprocessing preprocessing = 2;
}
BoxCoordinatesMode.Coordinates¶
enum Coordinates {
CORNERS_HEIGHT_FIRST = 0;
CORNERS_WIDTH_FIRST = 1;
CENTER_SIZE_HEIGHT_FIRST = 2;
CENTER_SIZE_WIDTH_FIRST = 3;
}
FlattenLayerParams.FlattenOrder¶
enum FlattenOrder {
CHANNEL_FIRST = 0;
CHANNEL_LAST = 1;
}
PoolingLayerParams.PoolingType¶
enum PoolingType{
MAX = 0;
AVERAGE = 1;
L2 = 2;
}
ReduceLayerParams.ReduceAxis¶
enum ReduceAxis {
CHW = 0;
HW = 1;
C = 2;
H = 3;
W = 4;
}
ReduceLayerParams.ReduceOperation¶
enum ReduceOperation {
SUM = 0;
AVG = 1;
PROD = 2;
LOGSUM = 3;
SUMSQUARE = 4;
L1 = 5;
L2 = 6;
MAX = 7;
MIN = 8;
ARGMAX = 9;
}
ReorganizeDataLayerParams.ReorganizationType¶
enum ReorganizationType {
SPACE_TO_DEPTH = 0;
DEPTH_TO_SPACE = 1;
}
ReshapeLayerParams.ReshapeOrder¶
enum ReshapeOrder {
CHANNEL_FIRST = 0;
CHANNEL_LAST = 1;
}
SamePadding.SamePaddingMode¶
enum SamePaddingMode {
BOTTOM_RIGHT_HEAVY = 0;
TOP_LEFT_HEAVY = 1;
}
SamplingMode.Method¶
enum Method {
STRICT_ALIGN_ENDPOINTS_MODE = 0;
ALIGN_ENDPOINTS_MODE = 1;
UPSAMPLE_MODE = 2;
ROI_ALIGN_MODE = 3;
}
SliceLayerParams.SliceAxis¶
enum SliceAxis {
CHANNEL_AXIS = 0;
HEIGHT_AXIS = 1;
WIDTH_AXIS = 2;
}
UnaryFunctionLayerParams.Operation¶
A unary operator.
The following functions are supported:
SQRT
RSQRT
INVERSE
POWER
EXP
LOG
ABS
THRESHOLD
enum Operation{
SQRT = 0;
RSQRT = 1;
INVERSE = 2;
POWER = 3;
EXP = 4;
LOG = 5;
ABS = 6;
THRESHOLD = 7;
}
UpsampleLayerParams.InterpolationMode¶
enum InterpolationMode {
NN = 0;
BILINEAR = 1;
}