no code implementations • 3 Apr 2024 • Yifan Qu, Oliver Krzysik, Hans De Sterck, Omer Ege Kara
Graph Neural Networks (GNNs) have established themselves as the preferred methodology in a multitude of domains, ranging from computer vision to computational biology, especially in contexts where data inherently conform to graph structures.
no code implementations • 18 Oct 2023 • Yanming Kang, Giang Tran, Hans De Sterck
The overall complexity of Fast Multipole Attention is $\mathcal{O}(n)$ or $\mathcal{O}(n \log n)$, depending on whether the queries are down-sampled or not.
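To illustrate why down-sampling changes the cost, here is a minimal NumPy sketch of attention computed against average-pooled keys and values. This is a generic illustration of how reducing $n$ keys/values to $m \ll n$ summary tokens shrinks the score matrix from $(n, n)$ to $(n, m)$; it is not the paper's Fast Multipole Attention, which uses a hierarchical, multi-level structure, and the pooling scheme and sizes below are assumptions for illustration only.

```python
import numpy as np

def pooled_attention(Q, K, V, pool=8):
    """Attention against average-pooled keys/values.

    Q: (n, d) queries; K, V: (n, d) keys/values.
    Pooling K and V into n/pool summary tokens reduces the
    score matrix from (n, n) to (n, n/pool), i.e. O(n*m) work.
    """
    n, d = K.shape
    m = n // pool
    # Average-pool keys and values over non-overlapping windows.
    K_c = K[: m * pool].reshape(m, pool, d).mean(axis=1)   # (m, d)
    V_c = V[: m * pool].reshape(m, pool, d).mean(axis=1)   # (m, d)
    scores = Q @ K_c.T / np.sqrt(d)                         # (n, m)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V_c                                    # (n, d)

rng = np.random.default_rng(0)
n, d = 1024, 64
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
print(pooled_attention(Q, K, V).shape)  # (1024, 64)
```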
no code implementations • 30 Sep 2022 • William Zou, Hans De Sterck, Jun Liu
One of the largest bottlenecks in distributed training is communicating gradients across different nodes.
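One common way to shrink this communication volume is gradient sparsification. The top-$k$ scheme below is a generic Python sketch of that idea, included only to make the bottleneck concrete; it is not necessarily the method studied in this paper.

```python
import numpy as np

def topk_sparsify(grad, ratio=0.01):
    """Keep only the largest-magnitude entries of a gradient tensor.

    Returns (indices, values); only these need to be communicated,
    shrinking the payload by roughly a factor of 1/ratio.
    """
    flat = grad.ravel()
    k = max(1, int(ratio * flat.size))
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    return idx, flat[idx]

def desparsify(idx, vals, shape):
    """Rebuild a dense gradient from the communicated sparse entries."""
    flat = np.zeros(int(np.prod(shape)))
    flat[idx] = vals
    return flat.reshape(shape)

rng = np.random.default_rng(0)
g = rng.standard_normal((512, 512))
idx, vals = topk_sparsify(g, ratio=0.01)
g_hat = desparsify(idx, vals, g.shape)
print(idx.size, "of", g.size, "entries sent")
```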
1 code implementation • 4 Jun 2022 • Ruikun Zhou, Thanin Quartz, Hans De Sterck, Jun Liu
This paper proposes a learning framework to simultaneously stabilize an unknown nonlinear system with a neural controller and learn a neural Lyapunov function to certify a region of attraction (ROA) for the closed-loop system.
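A minimal PyTorch sketch of the general idea is given below: a controller network and a Lyapunov candidate network are trained jointly by penalizing violations of the Lyapunov conditions ($V > 0$ away from the origin, $\dot V < 0$ along closed-loop trajectories) on sampled states. The toy dynamics, network sizes, and penalty form are illustrative assumptions, not the paper's exact formulation or certification procedure.

```python
import torch
import torch.nn as nn

# Toy pendulum-like dynamics, assumed for illustration: dx/dt = f(x, u).
def f(x, u):
    return torch.stack([x[:, 1], -torch.sin(x[:, 0]) + u.squeeze(-1)], dim=1)

controller = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))
lyapunov = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(
    list(controller.parameters()) + list(lyapunov.parameters()), lr=1e-3
)

for step in range(1000):
    x = 4 * torch.rand(256, 2) - 2            # sample states in a candidate ROA
    x.requires_grad_(True)
    u = controller(x)
    V = lyapunov(x) - lyapunov(torch.zeros(1, 2))   # shift so V(0) = 0
    gradV = torch.autograd.grad(V.sum(), x, create_graph=True)[0]
    Vdot = (gradV * f(x, u)).sum(dim=1, keepdim=True)
    # Hinge penalties on violations of V > 0 and Vdot < 0 (with small margins).
    loss = torch.relu(1e-3 - V).mean() + torch.relu(Vdot + 1e-3).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```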
no code implementations • 29 Sep 2021 • Hans De Sterck, Yunhui He, Oliver A. Krzysik
As a step towards a better understanding of convergence acceleration by AA, we study AA($m$), i.e., Anderson acceleration with finite window size $m$, applied to the case of linear fixed-point iterations $x_{k+1}=M x_{k}+b$.
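For reference, a standard AA($m$) loop for such a linear fixed-point iteration can be written as follows. This is a generic textbook formulation (residual-difference least squares with mixing parameter $\beta = 1$ assumed), not the authors' code, and the test problem is an arbitrary contraction chosen for illustration.

```python
import numpy as np

def anderson(g, x0, m=2, iters=50):
    """Anderson acceleration AA(m) for the fixed-point iteration x = g(x).

    Window size m; m = 0 recovers the plain iteration x_{k+1} = g(x_k).
    """
    x = x0.copy()
    X_hist, G_hist = [], []              # recent iterates and their images g(x)
    for _ in range(iters):
        gx = g(x)
        X_hist.append(x)
        G_hist.append(gx)
        if len(X_hist) > m + 1:          # keep only the last m+1 pairs
            X_hist.pop(0)
            G_hist.pop(0)
        if m == 0 or len(X_hist) == 1:
            x = gx
            continue
        # Residuals f_i = g(x_i) - x_i over the window, oldest to newest.
        F = np.array([gi - xi for gi, xi in zip(G_hist, X_hist)]).T
        dF = F[:, 1:] - F[:, :-1]        # residual differences
        G = np.array(G_hist).T
        dG = G[:, 1:] - G[:, :-1]
        gamma, *_ = np.linalg.lstsq(dF, F[:, -1], rcond=None)
        x = gx - dG @ gamma              # AA(m) update (beta = 1 assumed)
    return x

# Example: linear fixed-point iteration x_{k+1} = M x_k + b with rho(M) < 1.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 20))
M = 0.9 * A / np.abs(np.linalg.eigvals(A)).max()
b = rng.standard_normal(20)
g = lambda x: M @ x + b
x_star = np.linalg.solve(np.eye(20) - M, b)
print(np.linalg.norm(anderson(g, np.zeros(20), m=2, iters=40) - x_star))
```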
no code implementations • 29 Sep 2021 • Hans De Sterck, Yunhui He
However, we show that, despite the discontinuity of $\beta(z)$, the iteration function $\Psi(z)$ is Lipschitz continuous and directionally differentiable at $z^*$ for AA(1), and we generalize this to AA($m$) with $m>1$ in most cases.
1 code implementation • 22 Oct 2020 • Aaron Baier-Reinio, Hans De Sterck
We use neural ordinary differential equations to formulate a variant of the Transformer that is depth-adaptive in the sense that an input-dependent number of time steps is taken by the ordinary differential equation solver.
1 code implementation • 6 Jul 2020 • Da-Wei Wang, Yunhui He, Hans De Sterck
In this paper we explain and quantify this improvement in linear asymptotic convergence speed for the special case of a stationary version of AA applied to ADMM.
1 code implementation • 4 Jul 2020 • Hans De Sterck, Yunhui He
Since AA and NGMRES are equivalent to GMRES in the linear case, one may expect the GMRES convergence factors to be relevant for AA and NGMRES as $x_k \rightarrow x^*$.
1 code implementation • 13 Oct 2018 • Drew Mitchell, Nan Ye, Hans De Sterck
While Nesterov acceleration turns gradient descent into an optimal first-order method for convex problems by adding a momentum term with a specific weight sequence, a direct application of this method and weight sequence to ALS results in erratic convergence behaviour.
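For context, the specific weight sequence referred to here is the classical Nesterov momentum schedule, sketched below on a plain gradient-descent loop for a convex quadratic. This is a generic illustration of the weight sequence, assumed here for a simple smooth problem rather than the ALS setting of the paper.

```python
import numpy as np

def nesterov_gd(grad, x0, lr, iters=200):
    """Gradient descent with the classical Nesterov momentum weight sequence.

    The momentum weight (t_k - 1) / t_{k+1}, with
    t_{k+1} = (1 + sqrt(1 + 4 t_k^2)) / 2, is what makes this an
    optimal first-order method on smooth convex problems.
    """
    x = y = x0.copy()
    t = 1.0
    for _ in range(iters):
        x_new = y - lr * grad(y)
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        y = x_new + ((t - 1) / t_new) * (x_new - x)   # momentum extrapolation
        x, t = x_new, t_new
    return x

# Example: a strongly convex quadratic 0.5 * x^T A x - b^T x.
rng = np.random.default_rng(0)
Q = rng.standard_normal((30, 30))
A = Q.T @ Q + np.eye(30)
b = rng.standard_normal(30)
grad = lambda x: A @ x - b
x = nesterov_gd(grad, np.zeros(30), lr=1.0 / np.linalg.eigvalsh(A).max())
print(np.linalg.norm(x - np.linalg.solve(A, b)))
```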
1 code implementation • 22 Jan 2016 • Shawn Brunsting, Hans De Sterck, Remco Dolman, Teun van Sprundel
A knowledge base (OpenStreetMap) is then used to find a list of possible locations for each block.
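As an illustration of this kind of lookup, a free-text query against OpenStreetMap's public Nominatim geocoder might look as follows. This is a generic sketch using the public Nominatim search endpoint; it is not necessarily the lookup mechanism used by the paper's system.

```python
import requests

def candidate_locations(text, limit=5):
    """Query OpenStreetMap's Nominatim geocoder for possible locations
    matching a block of text; returns (display_name, lat, lon) tuples."""
    resp = requests.get(
        "https://nominatim.openstreetmap.org/search",
        params={"q": text, "format": "json", "limit": limit},
        headers={"User-Agent": "location-extraction-demo"},  # required by Nominatim's usage policy
        timeout=10,
    )
    resp.raise_for_status()
    return [(r["display_name"], r["lat"], r["lon"]) for r in resp.json()]

for name, lat, lon in candidate_locations("Waterloo, Ontario"):
    print(name, lat, lon)
```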