layer_norm
Normalizes over the last D dimensions (specified via axes); mean and variance are computed across those axes for each remaining slice.
ATen source: aten.native_layer_norm
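The description above corresponds to the conventional layer-normalization formula, written out here for reference. The symbols are assumptions introduced for illustration and are not taken from this page: $m$ is the number of elements spanned by the normalized axes in one slice, $\gamma$ and $\beta$ are the scale and shift, and $\epsilon$ is the numerical-stability epsilon. The variance is the biased (divide by $m$) estimate, which is what PyTorch's `layer_norm` uses.

```latex
\mu = \frac{1}{m}\sum_{i=1}^{m} x_i,
\qquad
\sigma^2 = \frac{1}{m}\sum_{i=1}^{m} \left(x_i - \mu\right)^2,
\qquad
y_i = \gamma_i \, \frac{x_i - \mu}{\sqrt{\sigma^2 + \epsilon}} + \beta_i
```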
Inputs
| Name | Shape | Description |
|---|---|---|
|  |  | Tensor to normalize |
|  | matches normalized dims | Scale applied to the normalized tensor |
|  | matches normalized dims | Shift added after the scale |
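To illustrate how the scale and shift line up with the normalized dimensions, here is a minimal sketch. It uses PyTorch's `torch.nn.functional.layer_norm` and its argument names (`weight`, `bias`), not this op's own input names. The tensor sizes and the constants 2.0 and 0.5 are arbitrary choices for illustration.

```python
import torch
from torch.nn.functional import layer_norm

torch.manual_seed(0)

# Illustrative input: normalize over the last two dims (3, 8).
x = torch.randn(4, 3, 8)
normalized_shape = tuple(x.shape[-2:])

# Scale and shift must have the same shape as the normalized dims.
weight = torch.full(normalized_shape, 2.0)  # scale (assumed constant here)
bias = torch.full(normalized_shape, 0.5)    # shift (assumed constant here)

y = layer_norm(x, normalized_shape, weight=weight, bias=bias, eps=1e-5)

# Per-slice mean over the normalized dims; one value per leading index.
slice_means = y.mean(dim=(-2, -1))
```

Because the scale and shift are constant in this sketch, every normalized slice ends up with a mean close to the shift value, while the output keeps the input's shape.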
Attributes
| Name | Type | Description |
|---|---|---|
|  |  | Dimensions over which mean/variance are computed (the trailing |
|  |  | Numerical-stability epsilon |
|  |  | Composite op version |
Output
| Name | Shape | Description |
|---|---|---|
|  |  | Same shape as |
Data types
fp16, fp32, bf16.
PyTorch example
import torch
from torch.nn.functional import layer_norm
N, C, H, W = 20, 5, 10, 10
input = torch.randn(N, C, H, W)
# Normalize over the last three dims (C, H, W)
output = layer_norm(input, normalized_shape=[C, H, W], weight=None, bias=None, eps=1e-5)
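As a sanity check on what the call above computes, the following sketch reproduces the normalization by hand and compares it with `layer_norm`. It assumes no scale or shift (`weight=None`, `bias=None`) and the biased variance estimate. The variable names and the fixed seed are illustrative choices.

```python
import torch
from torch.nn.functional import layer_norm

torch.manual_seed(0)

N, C, H, W = 20, 5, 10, 10
x = torch.randn(N, C, H, W)
eps = 1e-5

# Statistics are computed per sample over the last three dims (C, H, W).
dims = (-3, -2, -1)
mean = x.mean(dim=dims, keepdim=True)
var = x.var(dim=dims, unbiased=False, keepdim=True)  # biased variance

# Manual normalization: subtract the mean, divide by sqrt(var + eps).
manual = (x - mean) / torch.sqrt(var + eps)

# Reference result from PyTorch's functional API.
reference = layer_norm(x, normalized_shape=[C, H, W], eps=eps)

max_abs_diff = (manual - reference).abs().max().item()
```

The two results should agree up to floating-point rounding, so `max_abs_diff` is expected to be very small for fp32 inputs.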