zuko.nn#

Neural networks, layers and modules.

Classes#

Linear

Creates a linear layer.

MLP

Creates a multi-layer perceptron (MLP).

MaskedMLP

Creates a masked multi-layer perceptron (MaskedMLP).

MonotonicMLP

Creates a monotonic multi-layer perceptron (MonotonicMLP).

Descriptions#

class zuko.nn.Linear(in_features, out_features, bias=True, stack=None)#

Creates a linear layer.

\[y = x W^T + b\]

If the stack argument is provided, the module creates a stack of independent linear operators that are applied to a stack of input vectors.

Parameters:
  • in_features (int) – The number of input features \(C\).

  • out_features (int) – The number of output features \(C'\).

  • bias (bool) – Whether the layer learns an additive bias \(b\) or not.

  • stack (int) – The number of stacked operators \(S\).

forward(x)#

Parameters:
  • x (Tensor) – The input tensor \(x\), with shape \((*, C)\) or \((*, S, C)\).

Returns:
  The output tensor \(y\), with shape \((*, C')\) or \((*, S, C')\).

Return type:
  Tensor
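
If the stack argument is used, each of the \(S\) operators acts on its own slice of the input, independently of the others. A minimal usage sketch (the sizes are chosen for illustration):

>>> layer = Linear(3, 5, stack=7)
>>> x = torch.randn(128, 7, 3)
>>> layer(x).shape  # one independent linear operator per slice along dim -2
torch.Size([128, 7, 5])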

class zuko.nn.MLP(in_features, out_features, hidden_features=(64, 64), activation=None, normalize=False, **kwargs)#

Creates a multi-layer perceptron (MLP).

Also known as a fully connected feedforward network, an MLP is a sequence of non-linear parametric functions

\[h_{i + 1} = a_{i + 1}(h_i W_{i + 1}^T + b_{i + 1}),\]

over feature vectors \(h_i\), with the input and output feature vectors \(x = h_0\) and \(y = h_L\), respectively. The non-linear functions \(a_i\) are called activation functions. The trainable parameters of an MLP are its weights and biases \(\phi = \{W_i, b_i | i = 1, \dots, L\}\).

Wikipedia

https://wikipedia.org/wiki/Feedforward_neural_network
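
In plain PyTorch terms, the recurrence corresponds to alternating affine maps and activations; the following generic sketch (not the library's implementation) mirrors it with torch.nn.Linear layers:

>>> import torch.nn as nn
>>> layers = nn.ModuleList([nn.Linear(64, 32), nn.Linear(32, 16), nn.Linear(16, 1)])
>>> h = torch.randn(64)                # h_0 = x
>>> for i, linear in enumerate(layers):
...     h = linear(h)                  # h_i W_{i+1}^T + b_{i+1}
...     if i < len(layers) - 1:
...         h = torch.relu(h)          # a_{i+1}
...
>>> h.shape                            # y = h_L
torch.Size([1])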

Parameters:
  • in_features (int) – The number of input features.

  • out_features (int) – The number of output features.

  • hidden_features (Sequence[int]) – The numbers of hidden features.

  • activation (Callable[[], Module]) – The activation function constructor. If None, torch.nn.ReLU is used instead.

  • normalize (bool) – Whether features are normalized between layers or not.

  • kwargs – Keyword arguments passed to Linear.

Example

>>> net = MLP(64, 1, [32, 16], activation=nn.ELU)
>>> net
MLP(
  (0): Linear(in_features=64, out_features=32, bias=True)
  (1): ELU(alpha=1.0)
  (2): Linear(in_features=32, out_features=16, bias=True)
  (3): ELU(alpha=1.0)
  (4): Linear(in_features=16, out_features=1, bias=True)
)
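
A minimal usage sketch for the network built above (the batch size is illustrative): the forward pass applies the recurrence \(h_{i + 1} = a_{i + 1}(h_i W_{i + 1}^T + b_{i + 1})\) layer by layer.

>>> x = torch.randn(128, 64)
>>> net(x).shape
torch.Size([128, 1])
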
class zuko.nn.MaskedMLP(adjacency, hidden_features=(64, 64), activation=None, residual=False)#

Creates a masked multi-layer perceptron (MaskedMLP).

The resulting MLP is a transformation \(y = f(x)\) whose Jacobian entries \(\frac{\partial y_i}{\partial x_j}\) are zero whenever \(A_{ij} = 0\).

Parameters:
  • adjacency (BoolTensor) – The adjacency matrix \(A \in \{0, 1\}^{M \times N}\).

  • hidden_features (Sequence[int]) – The numbers of hidden features.

  • activation (Callable[[], Module]) – The activation function constructor. If None, torch.nn.ReLU is used instead.

  • residual (bool) – Whether to use residual blocks or not.

Example

>>> adjacency = torch.randn(4, 3) < 0
>>> adjacency
tensor([[False,  True, False],
        [ True, False,  True],
        [False,  True, False],
        [False,  True,  True]])
>>> net = MaskedMLP(adjacency, [16, 32], activation=nn.ELU)
>>> net
MaskedMLP(
  (0): MaskedLinear(in_features=3, out_features=16, bias=True)
  (1): ELU(alpha=1.0)
  (2): MaskedLinear(in_features=16, out_features=32, bias=True)
  (3): ELU(alpha=1.0)
  (4): MaskedLinear(in_features=32, out_features=4, bias=True)
)
>>> x = torch.randn(3)
>>> torch.autograd.functional.jacobian(net, x)
tensor([[ 0.0000,  0.0031,  0.0000],
        [-0.0323,  0.0000, -0.0547],
        [ 0.0000, -0.0245,  0.0000],
        [ 0.0000,  0.0060, -0.0063]])
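
A common use case is an autoregressive transformation, in which the \(i\)-th output may only depend on the inputs \(1, \dots, i - 1\). A strictly lower-triangular adjacency encodes this constraint; the following is an illustrative sketch using the same interface:

>>> adjacency = torch.ones(3, 3).tril(diagonal=-1).bool()
>>> adjacency
tensor([[False, False, False],
        [ True, False, False],
        [ True,  True, False]])
>>> net = MaskedMLP(adjacency, [16, 32])
>>> J = torch.autograd.functional.jacobian(net, torch.randn(3))
>>> (J[~adjacency] == 0).all()  # entries outside the adjacency are exactly zero
tensor(True)
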
class zuko.nn.MonotonicMLP(*args, **kwargs)#

Creates a monotonic multi-layer perceptron (MonotonicMLP).

The resulting MLP is a transformation \(y = f(x)\) whose Jacobian entries \(\frac{\partial y_j}{\partial x_i}\) are positive.

Parameters:
  • args – Positional arguments passed to MLP.

  • kwargs – Keyword arguments passed to MLP.

Example

>>> net = MonotonicMLP(3, 4, [16, 32])
>>> net
MonotonicMLP(
  (0): MonotonicLinear(in_features=3, out_features=16, bias=True)
  (1): TwoWayELU(alpha=1.0)
  (2): MonotonicLinear(in_features=16, out_features=32, bias=True)
  (3): TwoWayELU(alpha=1.0)
  (4): MonotonicLinear(in_features=32, out_features=4, bias=True)
)
>>> x = torch.randn(3)
>>> torch.autograd.functional.jacobian(net, x)
tensor([[0.8742, 0.9439, 0.9759],
        [0.8969, 0.9716, 0.9866],
        [1.0780, 1.1651, 1.2056],
        [0.8596, 0.9400, 0.9502]])
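
Since every Jacobian entry is positive, increasing any input coordinate increases every output. A quick numerical check, reusing net and x from the example above (illustrative, not part of the library's examples):

>>> (net(x + 1.0) > net(x)).all()
tensor(True)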