zuko.nn

Neural networks, layers and modules.

Classes

`MLP`	Creates a multi-layer perceptron (MLP).
`Linear`	Creates a linear layer.
`MaskedMLP`	Creates a masked multi-layer perceptron (MaskedMLP).
`MonotonicMLP`	Creates a monotonic multi-layer perceptron (MonotonicMLP).

Descriptions

class zuko.nn.MLP(in_features, out_features, hidden_features=(64, 64), activation=None, normalize=False, **kwargs)

Creates a multi-layer perceptron (MLP).

Also known as a fully connected feedforward network, an MLP is a sequence of non-linear parametric functions

\[h_{i + 1} = a_{i + 1}(h_i W_{i + 1}^T + b_{i + 1}),\]

over feature vectors \(h_i\), with the input and output feature vectors \(x = h_0\) and \(y = h_L\), respectively. The non-linear functions \(a_i\) are called activation functions. The trainable parameters of an MLP are its weights and biases \(\phi = \{W_i, b_i | i = 1, \dots, L\}\).
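
The recurrence above can be sketched in plain PyTorch. This is an illustrative sketch, not zuko's implementation: the helper `mlp_forward` and the randomly initialized `weights` and `biases` are hypothetical names introduced here. Following the printed example in this section, it assumes that no activation is applied after the last layer, that is, \(a_L\) is the identity.

```python
import torch


def mlp_forward(x, weights, biases, activation=torch.relu):
    """Apply h_{i+1} = a_{i+1}(h_i W_{i+1}^T + b_{i+1}) layer by layer.

    Hypothetical helper: the last layer is left linear (a_L = identity).
    """
    h = x
    last = len(weights) - 1
    for i, (W, b) in enumerate(zip(weights, biases)):
        h = h @ W.T + b  # affine map of layer i + 1
        if i < last:
            h = activation(h)  # non-linearity on hidden layers only
    return h


torch.manual_seed(0)

# Layer sizes: 64 inputs, hidden sizes 32 and 16, 1 output.
sizes = [64, 32, 16, 1]
weights = [torch.randn(n_out, n_in) for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [torch.randn(n_out) for n_out in sizes[1:]]

x = torch.randn(8, 64)  # batch of 8 input vectors h_0
y = mlp_forward(x, weights, biases)  # batch of 8 output vectors h_L
```

Each weight matrix has shape (outputs, inputs), so the product `h @ W.T` matches the \(h_i W_{i + 1}^T\) term of the formula.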

Wikipedia

https://wikipedia.org/wiki/Feedforward_neural_network

Parameters:

in_features (int) – The number of input features.
out_features (int) – The number of output features.
hidden_features (Sequence[int]) – The numbers of hidden features.
activation (Callable[[], Module] | None) – The activation function constructor. If None, use torch.nn.ReLU instead.
normalize (bool) – Whether features are normalized between layers or not.
kwargs – Keyword arguments passed to Linear.

Example

>>> net = MLP(64, 1, [32, 16], activation=nn.ELU)
>>> net
MLP(
  (0): Linear(in_features=64, out_features=32, bias=True)
  (1): ELU(alpha=1.0)
  (2): Linear(in_features=32, out_features=16, bias=True)
  (3): ELU(alpha=1.0)
  (4): Linear(in_features=16, out_features=1, bias=True)
)

class zuko.nn.Linear(in_features, out_features, bias=True, stack=None)

Creates a linear layer.

\[y = x W^T + b\]

If the stack argument is provided, the module creates a stack of independent linear operators that are applied to a stack of input vectors.

Parameters:

in_features (int) – The number of input features \(C\).
out_features (int) – The number of output features \(C'\).
bias (bool) – Whether the layer learns an additive bias \(b\) or not.
stack (int | None) – The number of stacked operators \(S\).

forward(x)

Parameters: x (Tensor) – The input tensor \(x\), with shape \((*, C)\) or \((*, S, C)\).
Returns: The output tensor \(y\), with shape \((*, C')\) or \((*, S, C')\).
Return type: Tensor
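
To make the stacked shapes concrete, here is a minimal sketch in plain PyTorch of what a stack of \(S\) independent linear operators could look like. It does not reproduce zuko's implementation: the helper `stacked_linear` and the weight layout \((S, C', C)\) are assumptions made for illustration, and the input is assumed to have shape \((*, S, C)\).

```python
import torch


def stacked_linear(x, weight, bias=None):
    """Hypothetical stacked linear map.

    x:      (*, S, C)   stack of input vectors
    weight: (S, C', C)  one weight matrix per stack entry
    bias:   (S, C')     one bias vector per stack entry
    returns (*, S, C')
    """
    # Contract the feature axis c separately for every stack index s.
    y = torch.einsum("...sc,sdc->...sd", x, weight)
    if bias is not None:
        y = y + bias
    return y


torch.manual_seed(0)

S, C, C_out = 3, 5, 2
weight = torch.randn(S, C_out, C)
bias = torch.randn(S, C_out)

x = torch.randn(7, S, C)
y = stacked_linear(x, weight, bias)

# Reference: apply an ordinary linear map to each slice of the stack.
y_ref = torch.stack(
    [x[:, s] @ weight[s].T + bias[s] for s in range(S)],
    dim=1,
)
```

The comparison with `y_ref` shows that the operators do not mix information across the stack axis: entry \(s\) of the output depends only on entry \(s\) of the input.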

class zuko.nn.MaskedMLP(adjacency, hidden_features=(64, 64), activation=None, residual=False)

Creates a masked multi-layer perceptron (MaskedMLP).

The resulting MLP is a transformation \(y = f(x)\) whose Jacobian entries \(\frac{\partial y_i}{\partial x_j}\) are zero wherever \(A_{ij} = 0\).

Parameters:

adjacency (BoolTensor) – The adjacency matrix \(A \in \{0, 1\}^{M \times N}\).
hidden_features (Sequence[int]) – The numbers of hidden features.
activation (Callable[[], Module] | None) – The activation function constructor. If None, use torch.nn.ReLU instead.
residual (bool) – Whether to use residual blocks or not.

Example

>>> adjacency = torch.randn(4, 3) < 0
>>> adjacency
tensor([[False,  True,  True],
        [False,  True,  True],
        [False, False,  True],
        [ True,  True, False]])
>>> net = MaskedMLP(adjacency, [16, 32], activation=nn.ELU)
>>> net
MaskedMLP(
  (0): MaskedLinear(in_features=3, out_features=16, bias=True)
  (1): ELU(alpha=1.0)
  (2): MaskedLinear(in_features=16, out_features=32, bias=True)
  (3): ELU(alpha=1.0)
  (4): MaskedLinear(in_features=32, out_features=4, bias=True)
)
>>> x = torch.randn(3)
>>> torch.autograd.functional.jacobian(net, x)
tensor([[ 0.0000, -0.0065,  0.1158],
        [ 0.0000, -0.0089,  0.0072],
        [ 0.0000,  0.0000,  0.0089],
        [-0.0146, -0.0128,  0.0000]])
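
The masking principle is easiest to see for a single layer. The sketch below is plain PyTorch and is not zuko's `MaskedLinear`: the function `masked_linear` and the tensors `W` and `b` are hypothetical. It multiplies the weights elementwise by the adjacency matrix from the example above, so the Jacobian of the layer is exactly the masked weight matrix. For deeper networks the hidden-layer masks must be chosen so that their composition still respects \(A\); the sketch does not cover that construction.

```python
import torch

torch.manual_seed(0)

# Adjacency matrix A with M = 4 outputs and N = 3 inputs,
# copied from the example above.
adjacency = torch.tensor([
    [False, True, True],
    [False, True, True],
    [False, False, True],
    [True, True, False],
])

W = torch.randn(4, 3)  # unconstrained weights (hypothetical)
b = torch.randn(4)


def masked_linear(x):
    # Entries of W where A is False are zeroed before the product.
    return x @ (W * adjacency).T + b


x = torch.randn(3)
jac = torch.autograd.functional.jacobian(masked_linear, x)

# jac[i, j] = dy_i / dx_j, which is zero wherever adjacency[i, j] is False.
```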

class zuko.nn.MonotonicMLP(*args, **kwargs)

Creates a monotonic multi-layer perceptron (MonotonicMLP).

The resulting MLP is a transformation \(y = f(x)\) whose Jacobian entries \(\frac{\partial y_j}{\partial x_i}\) are positive.

Parameters:

args – Positional arguments passed to MLP.
kwargs – Keyword arguments passed to MLP.

Example

>>> net = MonotonicMLP(3, 4, [16, 32])
>>> net
MonotonicMLP(
  (0): MonotonicLinear(in_features=3, out_features=16, bias=True)
  (1): TwoWayELU(alpha=1.0)
  (2): MonotonicLinear(in_features=16, out_features=32, bias=True)
  (3): TwoWayELU(alpha=1.0)
  (4): MonotonicLinear(in_features=32, out_features=4, bias=True)
)
>>> x = torch.randn(3)
>>> torch.autograd.functional.jacobian(net, x)
tensor([[1.0492, 1.3094, 1.1711],
        [1.1201, 1.3825, 1.2711],
        [0.9397, 1.1915, 1.0787],
        [1.1049, 1.3635, 1.2592]])
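
One common way to obtain a positive Jacobian is to combine positive weights with increasing activation functions. The sketch below illustrates that idea in plain PyTorch. It is an assumption made for illustration, not a description of how `MonotonicLinear` or `TwoWayELU` are implemented: the function `monotonic_net`, the use of `abs()` on the weights, and the plain ELU activation are all hypothetical choices.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Unconstrained parameters of a two-layer network (hypothetical).
W1, b1 = torch.randn(16, 3), torch.randn(16)
W2, b2 = torch.randn(4, 16), torch.randn(4)


def monotonic_net(x):
    # Taking the absolute value makes every weight non-negative, and ELU
    # is strictly increasing, so each output increases with each input.
    h = F.elu(F.linear(x, W1.abs(), b1))
    return F.linear(h, W2.abs(), b2)


x = torch.randn(3)
jac = torch.autograd.functional.jacobian(monotonic_net, x)

# By the chain rule, jac = |W2| @ diag(elu'(pre-activations)) @ |W1|,
# a product of factors with positive entries.
```

The biases are left unconstrained because they shift the outputs without affecting the sign of any derivative.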