Backbone Architectures

Based on the theoretical analysis in the RAN paper, a novel multi-scale backbone structure is designed. By applying optimized dilated convolutions to high-resolution feature maps, this structure lets the network efficiently predict motion patterns with larger upper bounds on separability, while maintaining the capturable range of motion at low computational complexity.

To quantify the network's capacity for capturing large deformations, the accessible motion range is defined as follows:

Definition 1: Accessible Motion Range

The radius of the capture range of the $k^{\text{th}}$-level registration by the registration module $\mathcal{R}_k$ is defined as the smallest upper bound of its accessible Deformation Displacement Field (DDF):

$$ a_k := \min_{\mathbf{x}}({\sup(|\varphi_{k}[\mathbf{x}]|_{\infty})}) $$

where $|\cdot|_{\infty}$ denotes the L-$\infty$ norm of a vector, $\sup(\cdot)$ denotes the supremum (or maximum) of a given function over varying inputs and trainable network weights, and $\mathbf{x}$ denotes one coordinate entry of the images or Deformation Displacement Fields.
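As a rough illustration of Definition 1, the sketch below estimates $a_k$ from a finite set of sampled displacement fields. The true supremum ranges over all inputs and trainable weights, so a finite sample can only give a lower estimate. The function names and the dict-based field representation are assumptions made for this example and do not come from the RAN paper.

```python
def linf_norm(vec):
    """L-infinity norm of a displacement vector."""
    return max(abs(c) for c in vec)


def empirical_capture_radius(fields):
    """Lower estimate of a_k from sampled displacement fields.

    `fields` is a list of dicts mapping a coordinate tuple to a
    displacement tuple; every dict must share the same coordinates.
    """
    coords = fields[0].keys()
    # Inner sup: largest displacement magnitude seen at each location.
    per_location_sup = {
        x: max(linf_norm(f[x]) for f in fields) for x in coords
    }
    # Outer min: the location with the smallest reachable magnitude.
    return min(per_location_sup.values())


# Two hypothetical 2-D fields defined on the same two pixels.
fields = [
    {(0, 0): (1.0, -3.0), (0, 1): (0.5, 0.0)},
    {(0, 0): (2.0, 1.0), (0, 1): (-4.0, 2.0)},
]
radius = empirical_capture_radius(fields)
```

Here the per-location maxima are 3.0 and 4.0, so the estimate is the smaller of the two.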

To quantify the Degree-of-Freedom limitation in the discontinuity of the estimated Deformation Displacement Field, we define the separability of the predicted motion:

Definition 2: Separability Bottleneck of Predicted Motion

The motion separability bottleneck is defined as the minimum value of the upper bound of the Chebyshev difference of a network's predicted DDF $\phi$ between two locations $\mathbf{x}, \mathbf{y} \in \mathbb{Z}^d$ with the specific Chebyshev distance $p \in \mathbb{Z}^d$:

$$ \Delta_\infty(p) := \min_{\mathbf{x}, \mathbf{y}}\left\{\sup(|\phi[\mathbf{x}] - \phi[\mathbf{y}]|_{\infty}) : |\mathbf{x} - \mathbf{y}|_{\infty} = p\right\} $$

where $p$ denotes the L-$\infty$ distance between the two pixels.
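Definition 2 can be explored numerically in the same spirit. The sketch below takes a finite set of sampled DDFs as a stand-in for the supremum over inputs and weights, and returns the smallest per-pair maximum difference among all pixel pairs at Chebyshev distance exactly $p$. The helper names and data layout are hypothetical, and the result is only an empirical proxy for $\Delta_\infty(p)$.

```python
from itertools import combinations


def chebyshev(u, v):
    """Chebyshev (L-infinity) distance between two equal-length tuples."""
    return max(abs(a - b) for a, b in zip(u, v))


def empirical_separability(fields, p):
    """Empirical proxy for Delta_inf(p) over sampled displacement fields.

    Returns None when no pair of coordinates lies at distance p.
    """
    coords = list(fields[0].keys())
    best = None
    for x, y in combinations(coords, 2):
        if chebyshev(x, y) != p:
            continue
        # Inner sup: largest displacement difference seen for this pair.
        pair_sup = max(chebyshev(f[x], f[y]) for f in fields)
        # Outer min: the pair that is hardest to separate.
        best = pair_sup if best is None else min(best, pair_sup)
    return best


# Two hypothetical 1-D fields on three pixels.
fields = [
    {(0,): (0.0,), (1,): (1.0,), (2,): (4.0,)},
    {(0,): (2.0,), (1,): (0.0,), (2,): (0.5,)},
]
sep_1 = empirical_separability(fields, 1)
sep_2 = empirical_separability(fields, 2)
```

For these sample fields, neighbouring pixels can differ by at least 2.0 in the hardest case, while the single pair at distance 2 reaches 4.0.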

Theorem: Regional Dependency

The upper bound of the motion difference is related to $a_k$ and $p_k$:

$$ \begin{aligned} \forall \mathbf{x}, \mathbf{y} \in \mathbb{Z}^d,\ |\mathbf{x} - \mathbf{y}|_\infty \geq p_{k''} + 2\sum_{k'=k''+1}^{k} a_{k'}, &\quad \sup(|\phi_{k}[\mathbf{x}] - \phi_{k}[\mathbf{y}]|_\infty) \geq 2\sum_{k'=k''}^{k} a_{k'}; \\ \exists \mathbf{x}, \mathbf{y} \in \mathbb{Z}^d,\ |\mathbf{x} - \mathbf{y}|_\infty < p_{k''-1} + 2\sum_{k'=k''}^{k} a_{k'}, &\quad \sup(|\phi_{k}[\mathbf{x}] - \phi_{k}[\mathbf{y}]|_\infty) = 2\sum_{k'=k''}^{k} a_{k'}; \end{aligned} $$

where $k''$ and $k$ denote two recursion indices satisfying $0 \leq k'' < k$, and $\mathbf{x}, \mathbf{y}$ denote two coordinate entries of images or DDFs.
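To make the first line of the theorem concrete, the following sketch simply evaluates its two expressions: the distance threshold $p_{k''} + 2\sum_{k'=k''+1}^{k} a_{k'}$ and the difference bound $2\sum_{k'=k''}^{k} a_{k'}$. The per-level values of $a$ and $p$ are made up for illustration, and the function name is hypothetical; nothing here reflects the actual configuration used in the paper.

```python
def regional_dependency_bounds(a, p, k2, k):
    """Evaluate the threshold and bound from the theorem's first line.

    a[i] and p[i] hold a_i and p_i; k2 plays the role of k''.
    """
    if not 0 <= k2 < k < len(a):
        raise ValueError("expected 0 <= k'' < k < len(a)")
    # Distance threshold: p_{k''} + 2 * sum_{k'=k''+1..k} a_{k'}
    threshold = p[k2] + 2 * sum(a[k2 + 1 : k + 1])
    # Difference bound: 2 * sum_{k'=k''..k} a_{k'}
    bound = 2 * sum(a[k2 : k + 1])
    return threshold, bound


# Hypothetical per-level values, chosen only for the arithmetic.
a = [8, 4, 2, 1]
p = [16, 8, 4, 2]
threshold, bound = regional_dependency_bounds(a, p, k2=1, k=3)
```

With these numbers the threshold is $8 + 2(2 + 1) = 14$ and the bound is $2(4 + 2 + 1) = 14$.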

Thus, a Motion-Separable structure is designed in which the upsampled feature maps are processed by the corresponding atrous (dilated) convolution layers.
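One reason atrous convolutions suit this design is that dilation enlarges the receptive field without adding weights. The sketch below applies the standard receptive-field formula for stride-1 convolution stacks, $1 + \sum_i (k_i - 1) d_i$, to compare a plain stack with a dilated one. The kernel sizes, dilation rates, and channel width are illustrative assumptions, not the layers used in RAN.

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field (per axis) of stacked stride-1 convolutions."""
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d
    return rf


def weight_count(kernel_sizes, channels, ndim=2):
    """Number of kernel weights (no biases), with equal input and
    output channel counts. Dilation does not change this number."""
    return sum(k ** ndim * channels * channels for k in kernel_sizes)


kernels = [3, 3, 3]
plain_rf = receptive_field(kernels, [1, 1, 1])
dilated_rf = receptive_field(kernels, [1, 2, 4])
weights = weight_count(kernels, channels=16)
```

Both stacks use the same number of weights, yet the dilated stack covers 15 pixels per axis against 7 for the plain one, which is the kind of trade-off that keeps a large capturable range on high-resolution feature maps affordable.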

Source: Residual Aligner-based Network (RAN): Motion-separable structure for coarse-to-fine discontinuous deformable registration
