Search Results for author: Maksims Volkovs

Found 23 papers, 13 papers with code

Improving Transformer Optimization Through Better Initialization

1 code implementation ICML 2020 Xiao Shi Huang, Felipe Perez, Jimmy Ba, Maksims Volkovs

As Transformer models are becoming larger and more expensive to train, recent research has focused on understanding and improving optimization in these models.

Language Modelling Machine Translation +1

Self-supervised Representation Learning From Random Data Projectors

1 code implementation 11 Oct 2023 Yi Sui, Tongzi Wu, Jesse C. Cresswell, Ga Wu, George Stein, Xiao Shi Huang, Xiaochen Zhang, Maksims Volkovs

Self-supervised representation learning~(SSRL) has advanced considerably by exploiting the transformation invariance assumption under artificially designed data augmentations.

Data Augmentation Representation Learning
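
As the title suggests, the training targets here come from randomly initialized data projectors rather than handcrafted augmentations. Below is a minimal sketch of that training signal; the dimensions, architectures, and loss are illustrative assumptions, not the paper's actual setup.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions; the paper's actual architectures differ.
x_dim, h_dim, t_dim = 32, 64, 16

# Frozen, randomly initialized projector: supplies training targets
# without any handcrafted augmentation.
projector = nn.Sequential(nn.Linear(x_dim, t_dim), nn.Tanh())
for p in projector.parameters():
    p.requires_grad = False

encoder = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU(), nn.Linear(h_dim, h_dim))
predictor = nn.Linear(h_dim, t_dim)  # maps representations to projector outputs

opt = torch.optim.Adam(list(encoder.parameters()) + list(predictor.parameters()), lr=1e-3)

x = torch.randn(128, x_dim)   # a batch of raw inputs
target = projector(x)         # random-projection targets
loss = nn.functional.mse_loss(predictor(encoder(x)), target)
opt.zero_grad()
loss.backward()
opt.step()
```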

DuETT: Dual Event Time Transformer for Electronic Health Records

1 code implementation 25 Apr 2023 Alex Labach, Aslesha Pokhrel, Xiao Shi Huang, Saba Zuberi, Seung Eun Yi, Maksims Volkovs, Tomi Poutanen, Rahul G. Krishnan

Electronic health records (EHRs) recorded in hospital settings typically contain a wide range of numeric time series data that is characterized by high sparsity and irregular observations.

Time Series
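
As the name suggests, DuETT attends over both the time axis and the event-type axis of an EHR. A minimal sketch of dual-axis attention on a time-by-event grid, with toy shapes and none of the paper's handling of sparsity or irregular observations:

```python
import torch
import torch.nn as nn

# Toy shapes: batch of 8 patients, 24 time steps, 10 event types, model dim 32.
B, T, E, D = 8, 24, 10, 32
x = torch.randn(B, T, E, D)  # embedded (time step, event type) grid

time_attn = nn.TransformerEncoderLayer(d_model=D, nhead=4, batch_first=True)
event_attn = nn.TransformerEncoderLayer(d_model=D, nhead=4, batch_first=True)

# Attend across time for each event type ...
h = time_attn(x.permute(0, 2, 1, 3).reshape(B * E, T, D)).reshape(B, E, T, D)
# ... then across event types at each time step.
h = event_attn(h.permute(0, 2, 1, 3).reshape(B * T, E, D)).reshape(B, T, E, D)
```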

DiMS: Distilling Multiple Steps of Iterative Non-Autoregressive Transformers for Machine Translation

1 code implementation 7 Jun 2022 Sajad Norouzi, Rasa Hosseinzadeh, Felipe Perez, Maksims Volkovs

The student is optimized to predict the output of the teacher after multiple decoding steps while the teacher follows the student via a slow-moving average.

Machine Translation Translation
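
The snippet above names the two mechanisms concretely: a teacher that tracks the student via a slow-moving (exponential) average, and a student trained to match several of the teacher's decoding steps in one shot. A minimal sketch, using a plain linear layer as a stand-in for the non-autoregressive decoder:

```python
import copy
import torch

def ema_update(teacher, student, decay=0.999):
    """Teacher follows the student via a slow-moving average of its weights."""
    with torch.no_grad():
        for t_p, s_p in zip(teacher.parameters(), student.parameters()):
            t_p.mul_(decay).add_(s_p, alpha=1.0 - decay)

# Hypothetical stand-in model; DiMS distills iterative NAR decoders.
student = torch.nn.Linear(16, 16)
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad = False

x = torch.randn(4, 16)
with torch.no_grad():
    target = teacher(teacher(x))  # stand-in for multiple teacher decoding steps
loss = torch.nn.functional.mse_loss(student(x), target)  # one student step matches many
loss.backward()
ema_update(teacher, student)
```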

X-Pool: Cross-Modal Language-Video Attention for Text-Video Retrieval

1 code implementation CVPR 2022 Satya Krishna Gorti, Noel Vouitsis, Junwei Ma, Keyvan Golestan, Maksims Volkovs, Animesh Garg, Guangwei Yu

Instead, texts often capture sub-regions of entire videos and are most semantically similar to certain frames within videos.

Ranked #17 on Video Retrieval on LSMDC (using extra training data)

Retrieval Text to Video Retrieval +1
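
The observation above motivates pooling video frames conditioned on the text rather than averaging the whole video. A minimal sketch of text-conditioned attention pooling; X-Pool's actual model uses learned projections and a transformer-style attention block, which this omits:

```python
import torch
import torch.nn.functional as F

# Toy shapes: one text embedding attends over 12 frame embeddings of dim 64.
D, n_frames = 64, 12
text = torch.randn(1, D)           # text query embedding
frames = torch.randn(n_frames, D)  # per-frame video embeddings

# Weight frames by their similarity to the text, then aggregate.
attn = F.softmax(text @ frames.t() / D ** 0.5, dim=-1)  # (1, n_frames)
video = attn @ frames                                   # text-conditioned video embedding
score = F.cosine_similarity(text, video)                # retrieval score
```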

Decentralized Federated Learning through Proxy Model Sharing

1 code implementation 22 Nov 2021 Shivam Kalra, Junfeng Wen, Jesse C. Cresswell, Maksims Volkovs, Hamid R. Tizhoosh

Institutions in highly regulated domains such as finance and healthcare often have restrictive rules around data sharing.

Federated Learning whole slide images

Improving Non-Autoregressive Translation Models Without Distillation

no code implementations ICLR 2022 Xiao Shi Huang, Felipe Perez, Maksims Volkovs

Empirically, we show that CMLMC achieves state-of-the-art NAR performance when trained on raw data without distillation and approaches AR performance on multiple datasets.

Language Modelling Machine Translation +1

Temporal Dependencies in Feature Importance for Time Series Predictions

1 code implementation 29 Jul 2021 Kin Kwan Leung, Clayton Rooke, Jonathan Smith, Saba Zuberi, Maksims Volkovs

Time series data introduces two key challenges for explainability methods: firstly, observations of the same feature over subsequent time steps are not independent, and secondly, the same feature can have varying importance to model predictions over time.

Feature Importance Time Series +1

Diabetes Mellitus Forecasting Using Population Health Data in Ontario, Canada

no code implementations 8 Apr 2019 Mathieu Ravaut, Hamed Sadeghi, Kin Kwan Leung, Maksims Volkovs, Laura C. Rosella

We perform one of the first large-scale machine learning studies with this data to study the task of predicting diabetes in a range of 1-10 years ahead, which requires no additional screening of individuals. In the best setup, we reach a test AUC of 80.3 with a single model trained on an observation window of 5 years with a one-year buffer using all datasets.

BIG-bench Machine Learning

DropoutNet: Addressing Cold Start in Recommender Systems

2 code implementations NeurIPS 2017 Maksims Volkovs, Guangwei Yu, Tomi Poutanen

Latent models have become the default choice for recommender systems due to their performance and scalability.

Recommendation Systems
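
The idea behind the title, sketched loosely: apply dropout to the collaborative (latent) input during training so the model learns to fall back on content features when preference data is missing, which is exactly the cold-start condition. The dimensions and network below are illustrative assumptions:

```python
import torch
import torch.nn as nn

# Hypothetical dims: 32-d latent (preference) input, 48-d content features.
latent_dim, content_dim = 32, 48
net = nn.Sequential(nn.Linear(latent_dim + content_dim, 64), nn.ReLU(), nn.Linear(64, 32))

latent = torch.randn(256, latent_dim)    # collaborative/latent representation
content = torch.randn(256, content_dim)  # content features (always available)

# Randomly zero the latent input for some examples so the model learns to
# rely on content features alone, mimicking cold-start users or items.
mask = (torch.rand(256, 1) > 0.5).float()
out = net(torch.cat([latent * mask, content], dim=-1))
```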

Efficient Sampling for Bipartite Matching Problems

no code implementations NeurIPS 2012 Maksims Volkovs, Richard S. Zemel

Bipartite matching problems characterize many situations, ranging from ranking in information retrieval to correspondence in vision.

Information Retrieval Retrieval
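
For readers unfamiliar with the setup: a bipartite matching assigns items on one side (e.g., queries) to items on the other (e.g., documents, or points in two images) one-to-one. The paper is about sampling over matchings; the deterministic special case, a maximum-weight matching, can be computed with SciPy's Hungarian-algorithm routine:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Toy bipartite matching: assign 3 queries to 3 documents so that
# total relevance is maximized (equivalently, negated cost is minimized).
relevance = np.array([[0.9, 0.1, 0.4],
                      [0.2, 0.8, 0.3],
                      [0.5, 0.6, 0.7]])
rows, cols = linear_sum_assignment(-relevance)  # negate to maximize
print(list(zip(rows, cols)), relevance[rows, cols].sum())
```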

Collaborative Ranking With 17 Parameters

no code implementations NeurIPS 2012 Maksims Volkovs, Richard S. Zemel

The primary application of collaborative filtering (CF) is to recommend a small set of items to a user, which entails ranking.

Collaborative Ranking
