no code implementations • 29 Mar 2021 • Michael Tetelman
Bayesian Attention Networks are defined by introducing an attention factor per training-sample loss, computed as a function of two inputs: one from the training sample and one from the prediction sample.
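The weighting idea can be sketched as follows. This is a minimal illustrative reading, not the paper's actual model: the attention function here is assumed to be a simple RBF similarity, and all names (`attention_factor`, `temperature`) are hypothetical.

```python
import numpy as np

def attention_factor(x_train, x_pred, temperature=1.0):
    """Attention weight for one training input given a prediction input.

    An RBF similarity is assumed here as a stand-in for the learned
    attention function of the paper (an illustrative assumption).
    """
    d2 = np.sum((x_train - x_pred) ** 2)
    return np.exp(-d2 / (2.0 * temperature ** 2))

def attended_loss(per_sample_losses, X_train, x_pred):
    """Training loss with one attention factor per training-sample loss."""
    w = np.array([attention_factor(x, x_pred) for x in X_train])
    w = w / w.sum()                      # normalize over the training set
    return float(np.dot(w, per_sample_losses))

# Toy usage: samples near the prediction input dominate the loss.
X_train = np.array([[0.0], [1.0], [2.0]])
losses = np.array([1.0, 4.0, 9.0])
result = attended_loss(losses, X_train, np.array([0.0]))
print(result)
```

Samples close to the prediction input receive larger weights, so the total loss is dominated by training data relevant to the point being predicted.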
no code implementations • 23 Jun 2020 • Michael Tetelman
Finding methods for making generalizable predictions is a fundamental problem of machine learning.
no code implementations • ICLR 2019 • Michael Tetelman
Among the stationary solutions of the update rules there are trivial solutions, with zero variances at local minima of the original loss, and a single non-trivial solution with finite variances, which is a critical point at the end of convexity of the effective loss in the mean-variance space.
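The zero-variance stationary solutions can be illustrated on a toy problem. In this sketch (an assumption, not the paper's formulation) the effective loss of a Gaussian weight posterior N(mu, sigma^2) is taken to be the Monte Carlo expectation of the original loss; at a local minimum of the original loss, the effective loss is minimized by shrinking sigma to zero.

```python
import numpy as np

def loss(w):
    """Toy double-well loss with local minima at w = +/-1."""
    return (w ** 2 - 1.0) ** 2

def effective_loss(mu, sigma, n=200_000, seed=0):
    """Expected original loss under N(mu, sigma^2), by Monte Carlo."""
    rng = np.random.default_rng(seed)
    w = mu + sigma * rng.standard_normal(n)
    return float(loss(w).mean())

# With the mean fixed at a local minimum of the original loss (mu = 1),
# the effective loss decreases monotonically as sigma shrinks, so the
# zero-variance point is a (trivial) stationary solution.
vals = [effective_loss(1.0, s) for s in (0.5, 0.1, 0.0)]
print(vals)
```

At sigma = 0 the posterior collapses to a point mass at the local minimum and the effective loss equals the original loss there, which is exactly the trivial solution described above.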
no code implementations • 19 Dec 2013 • Michael Tetelman
We propose an iterative procedure for deriving a sequence of progressively improving models together with a corresponding sequence of sets of non-linear features on the original input space.
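One very loose reading of such a loop can be sketched as follows. Everything here is a hypothetical stand-in for the paper's actual procedure: each fitted model (ridge regression, assumed for simplicity) contributes a new non-linear feature of the original inputs, and the next model is fit on the enlarged feature set.

```python
import numpy as np

def fit_ridge(F, y, lam=1e-3):
    """Ridge regression on feature matrix F (a stand-in 'model')."""
    A = F.T @ F + lam * np.eye(F.shape[1])
    return np.linalg.solve(A, F.T @ y)

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(3 * X[:, 0])                 # target is non-linear in X

features = [X[:, 0]]                    # start from the raw input
mses = []
for step in range(4):
    F = np.column_stack(features)
    w = fit_ridge(F, y)
    pred = F @ w
    mses.append(float(np.mean((pred - y) ** 2)))
    # The current model's output, passed through a non-linearity, becomes
    # a new feature defined on the original input space (an assumption
    # about how the feature sequence might be generated).
    features.append(np.tanh(pred + X[:, 0]))

print(mses)
```

In this toy version the training error of the model sequence decreases as the non-linear feature set grows, which is the qualitative behavior the abstract describes.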