Search Results for author: Rustem Takhanov

Found 20 papers, 9 papers with code

Multi-layer random features and the approximation power of neural networks

1 code implementation · 26 Apr 2024 · Rustem Takhanov

To achieve a certain approximation error, the required number of neurons in each layer is determined by the RKHS norm of the target function.
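
The random-features theme can be illustrated with the classical single-layer random Fourier features construction (Rahimi and Recht); this is a minimal sketch of that standard building block, not the paper's multi-layer analysis, and all names and constants below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random Fourier features for the Gaussian kernel k(x, y) = exp(-||x - y||^2 / 2):
# a single random-feature "layer" z(x) = sqrt(2/D) * cos(W x + b), whose inner
# products approximate the kernel as the number of neurons D grows.
def random_features(X, W, b):
    return np.sqrt(2.0 / W.shape[0]) * np.cos(X @ W.T + b)

d, D = 5, 5000                          # input dimension, number of random neurons
W = rng.normal(size=(D, d))             # frequencies drawn from the kernel's spectral density
b = rng.uniform(0, 2 * np.pi, size=D)   # random phases

x = rng.normal(size=d)
y = rng.normal(size=d)

approx = (random_features(x[None], W, b) @ random_features(y[None], W, b).T).item()
exact = np.exp(-np.sum((x - y) ** 2) / 2)
print(abs(approx - exact))  # Monte Carlo error, shrinking like 1/sqrt(D)
```

Increasing `D` tightens the approximation, which is the single-layer analogue of the neuron-count question studied in the paper.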

Gradient Descent Fails to Learn High-frequency Functions and Modular Arithmetic

no code implementations · 19 Oct 2023 · Rustem Takhanov, Maxat Tezekbayev, Artur Pak, Arman Bolatov, Zhenisbek Assylbekov

In this framework, the hardness of a class is quantified by the variance of the gradient with respect to a random choice of target function.

Intractability of Learning the Discrete Logarithm with Gradient-Based Methods

1 code implementation · 2 Oct 2023 · Rustem Takhanov, Maxat Tezekbayev, Artur Pak, Arman Bolatov, Zhibek Kadyrsizova, Zhenisbek Assylbekov

The discrete logarithm problem is a fundamental challenge in number theory with significant implications for cryptographic protocols.
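
For context, here is a minimal brute-force sketch of the problem itself (not of the gradient-based learning setup studied in the paper); the helper name and the toy parameters are illustrative.

```python
# The discrete logarithm problem: given a generator g of the multiplicative
# group Z_p^* and h = g^x mod p, recover x. Exhaustive search takes O(p) steps,
# i.e. exponential time in the bit length of p, which is what makes generic
# (including gradient-based) attacks interesting to study.
def discrete_log(g, h, p):
    """Naive exhaustive search; illustrative helper only."""
    value = 1
    for x in range(p - 1):
        if value == h:
            return x
        value = (value * g) % p
    return None

p, g = 101, 2          # 2 generates Z_101^*
x = 57
h = pow(g, x, p)       # easy direction: modular exponentiation
print(discrete_log(g, h, p))  # recovers 57, but only by trying every exponent
```

The asymmetry between `pow(g, x, p)` (cheap) and `discrete_log` (expensive) is the source of the cryptographic hardness.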

Autoencoders for a manifold learning problem with a Jacobian rank constraint

1 code implementation · 25 Jun 2023 · Rustem Takhanov, Y. Sultan Abylkairov, Maxat Tezekbayev

This constraint is included in the objective function as a new term, namely a squared Ky-Fan $k$-antinorm of the Jacobian function.
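
A small numerical sketch, assuming the Ky Fan $k$-antinorm is the sum of the $k$ smallest singular values (the counterpart of the Ky Fan $k$-norm, which sums the $k$ largest); the function name and the toy matrix are illustrative.

```python
import numpy as np

def ky_fan_k_antinorm(J, k):
    """Sum of the k smallest singular values of J (assumed antinorm convention)."""
    s = np.linalg.svd(J, compute_uv=False)  # singular values, descending
    return np.sum(np.sort(s)[:k])

# A rank-2 matrix in R^{4x4}: its two smallest singular values vanish, so the
# squared 2-antinorm penalty is zero exactly when the Jacobian has rank <= 2.
rng = np.random.default_rng(1)
A = rng.normal(size=(4, 2)) @ rng.normal(size=(2, 4))
penalty = ky_fan_k_antinorm(A, k=2) ** 2
print(penalty)  # ~0 up to floating-point error
```

This is why such a term can act as a soft rank constraint on the Jacobian: it penalizes exactly the singular values that a rank-$k$ map should not have.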

Computing a partition function of a generalized pattern-based energy over a semiring

no code implementations · 27 May 2023 · Rustem Takhanov

For a general language $\Gamma$ and non-positive weights, the minimization task can be carried out in ${\mathcal O}(|V|\cdot |\overline{\Gamma^{\cap}}|^2)$ time.

Structured Prediction

On the speed of uniform convergence in Mercer's theorem

no code implementations · 1 May 2022 · Rustem Takhanov

The classical Mercer's theorem claims that a continuous positive definite kernel $K({\mathbf x}, {\mathbf y})$ on a compact set can be represented as $\sum_{i=1}^\infty \lambda_i\phi_i({\mathbf x})\phi_i({\mathbf y})$ where $\{(\lambda_i,\phi_i)\}$ are eigenvalue-eigenfunction pairs of the corresponding integral operator.
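
Mercer's expansion can be checked numerically by discretizing the integral operator on a grid (a Nyström-style approximation); the grid size, kernel, and truncation level below are arbitrary illustrative choices.

```python
import numpy as np

# Discretize the integral operator (T f)(x) = ∫ K(x, y) f(y) dy on [0, 1]
# with a uniform midpoint grid; its eigendecomposition approximates the
# eigenvalue-eigenfunction pairs of Mercer's series.
n = 200
x = (np.arange(n) + 0.5) / n
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / 0.1)  # Gaussian kernel matrix
lam, phi = np.linalg.eigh(K / n)                   # eigenpairs of the discretized operator

# The truncated Mercer sum with the top-r terms already reconstructs K well,
# because eigenvalues decay rapidly for smooth kernels.
r = 30
idx = np.argsort(lam)[::-1][:r]
K_r = (phi[:, idx] * lam[idx]) @ phi[:, idx].T * n
print(np.max(np.abs(K - K_r)))  # small uniform truncation error
```

The uniform (sup-norm) smallness of this truncation error is exactly the mode of convergence whose speed the paper investigates.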

Spectral bounds of the $\varepsilon$-entropy of kernel classes

no code implementations · 9 Apr 2022 · Rustem Takhanov

Further, we develop a series of lower bounds on the $\varepsilon$-entropy that follow from a connection between covering numbers of a ball in RKHS and a quantization of the Gaussian random field corresponding to the kernel $K$ via the Kosambi-Karhunen-Loève transform.

Quantization

How many moments does MMD compare?

no code implementations · 27 Jun 2021 · Rustem Takhanov

If the ordered singular values of the integral operator associated with $p({\mathbf x}, {\mathbf y})$ decay rapidly, the MMD distance defined by the new symbol $p_r$ differs only slightly from the original one.
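
The spectral-truncation effect can be sketched empirically: replace the kernel matrix by its top-$r$ spectral components (a stand-in for the truncated symbol $p_r$) and compare MMD estimates. The biased V-statistic estimator, the Gaussian kernel, and the choice of $r$ are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def mmd2(Kxx, Kyy, Kxy):
    """Biased (V-statistic) squared MMD estimate from kernel blocks."""
    return Kxx.mean() + Kyy.mean() - 2 * Kxy.mean()

n = 100
X = rng.normal(0.0, 1.0, size=(n, 1))   # samples from P
Y = rng.normal(0.5, 1.0, size=(n, 1))   # samples from Q (shifted mean)
Z = np.vstack([X, Y])

K = np.exp(-(Z - Z.T) ** 2 / 2)         # Gaussian kernel on pooled samples
lam, U = np.linalg.eigh(K)

# Keep only the top-r spectral components of the kernel; since the spectrum
# decays rapidly, the MMD estimate barely changes.
r = 20
idx = np.argsort(lam)[::-1][:r]
K_r = (U[:, idx] * lam[idx]) @ U[:, idx].T

full = mmd2(K[:n, :n], K[n:, n:], K[:n, n:])
trunc = mmd2(K_r[:n, :n], K_r[n:, n:], K_r[:n, n:])
print(full, abs(full - trunc))  # nonzero MMD; tiny change under truncation
```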

Semantics- and Syntax-related Subvectors in the Skip-gram Embeddings

1 code implementation · 23 Dec 2019 · Maxat Tezekbayev, Zhenisbek Assylbekov, Rustem Takhanov

We show that the skip-gram embedding of any word can be decomposed into two subvectors which roughly correspond to semantic and syntactic roles of the word.

Reducing the dimensionality of data using tempered distributions

1 code implementation · 12 Mar 2019 · Rustem Takhanov

We reformulate the unsupervised dimension reduction (UDR) problem in the language of tempered distributions, i.e. as a problem of approximating an empirical probability density function by another tempered distribution supported in a $k$-dimensional subspace.

Dimensionality Reduction

Context Vectors are Reflections of Word Vectors in Half the Dimensions

no code implementations · 26 Feb 2019 · Zhenisbek Assylbekov, Rustem Takhanov

This paper takes a step towards theoretical analysis of the relationship between word embeddings and context embeddings in models such as word2vec.

Text Generation · Word Embeddings

Fourier Neural Networks: A Comparative Study

no code implementations · 8 Feb 2019 · Abylay Zhumekenov, Malika Uteuliyeva, Olzhas Kabdolov, Rustem Takhanov, Zhenisbek Assylbekov, Alejandro J. Castro

We review neural network architectures which were motivated by Fourier series and integrals and which are referred to as Fourier neural networks.

Fourier analysis perspective for sufficient dimension reduction problem

no code implementations · 19 Aug 2018 · Rustem Takhanov

It turns out that the latter problem admits a reformulation in the dual space, i.e. instead of searching for ${\mathbf g}(P{\mathbf x})$, we suggest searching for its Fourier transform.

Dimensionality Reduction

Reproducing and Regularizing the SCRN Model

1 code implementation · COLING 2018 · Olzhas Kabdolov, Zhenisbek Assylbekov, Rustem Takhanov

We reproduce the Structurally Constrained Recurrent Network (SCRN) model and then regularize it using existing widespread techniques such as naive dropout, variational dropout, and weight tying.

Language Modelling

Reusing Weights in Subword-aware Neural Language Models

1 code implementation · NAACL 2018 · Zhenisbek Assylbekov, Rustem Takhanov

We propose several ways of reusing subword embeddings and other weights in subword-aware neural language models.

Combining pattern-based CRFs and weighted context-free grammars

no code implementations · 22 Apr 2014 · Rustem Takhanov, Vladimir Kolmogorov

We propose a Grammatical Pattern-Based CRF (GPB) model that combines the two in a natural way.

Proceedings of The 38th Annual Workshop of the Austrian Association for Pattern Recognition (ÖAGM), 2014

no code implementations · 14 Apr 2014 · Vladimir Kolmogorov, Christoph Lampert, Emilie Morvant, Rustem Takhanov

The 38th Annual Workshop of the Austrian Association for Pattern Recognition (ÖAGM) will be held at IST Austria, on May 22-23, 2014.

Inference algorithms for pattern-based CRFs on sequence data

no code implementations1 Oct 2012 Rustem Takhanov, Vladimir Kolmogorov

(Komodakis & Paragios, 2009) gave an $O(n L)$ algorithm for computing the MAP.
