Search Results for author: Jiyan Yang

Found 24 papers, 4 papers with code

Rankitect: Ranking Architecture Search Battling World-class Engineers at Meta Scale

no code implementations14 Nov 2023 Wei Wen, Kuang-Hung Liu, Igor Fedorov, Xin Zhang, Hang Yin, Weiwei Chu, Kaveh Hassani, Mengying Sun, Jiang Liu, Xu Wang, Lin Jiang, Yuxin Chen, Buyun Zhang, Xi Liu, Dehua Cheng, Zhengxing Chen, Guang Zhao, Fangqiu Han, Jiyan Yang, Yuchen Hao, Liang Xiong, Wen-Yen Chen

In industry systems, such as the ranking system at Meta, it is unclear whether NAS algorithms from the literature can outperform production baselines because of: (1) scale - Meta's ranking systems serve billions of users; (2) strong baselines - the baselines are production models optimized by hundreds to thousands of world-class engineers over years since the rise of deep learning; (3) dynamic baselines - engineers may establish new, stronger baselines during the NAS search; and (4) efficiency - the search pipeline must yield results quickly, in alignment with the productionization life cycle.

Neural Architecture Search

Towards the Better Ranking Consistency: A Multi-task Learning Framework for Early Stage Ads Ranking

no code implementations12 Jul 2023 Xuewei Wang, Qiang Jin, Shengyu Huang, Min Zhang, Xi Liu, Zhengli Zhao, Yukun Chen, Zhengyu Zhang, Jiyan Yang, Ellie Wen, Sagar Chordia, Wenlin Chen, Qin Huang

To pass better ads from the early stage to the final stage of ranking, we propose a multi-task learning framework for early stage ranking that captures multiple final stage ranking components (i.e., ads clicks and ads quality events) and their task relations.

Multi-Task Learning

AdaTT: Adaptive Task-to-Task Fusion Network for Multitask Learning in Recommendations

1 code implementation11 Apr 2023 Danwei Li, Zhengyu Zhang, Siyang Yuan, Mingze Gao, Weilin Zhang, Chaofei Yang, Xi Liu, Jiyan Yang

However, MTL research faces two challenges: 1) effectively modeling the relationships between tasks to enable knowledge sharing, and 2) jointly learning task-specific and shared knowledge.

Multi-Task Learning

DHEN: A Deep and Hierarchical Ensemble Network for Large-Scale Click-Through Rate Prediction

no code implementations11 Mar 2022 Buyun Zhang, Liang Luo, Xi Liu, Jay Li, Zeliang Chen, Weilin Zhang, Xiaohan Wei, Yuchen Hao, Michael Tsang, Wenjun Wang, Yang Liu, Huayu Li, Yasmine Badr, Jongsoo Park, Jiyan Yang, Dheevatsa Mudigere, Ellie Wen

To overcome the challenge brought by DHEN's deeper and multi-layer structure in training, we propose a novel co-designed training system that can further improve the training efficiency of DHEN.

Click-Through Rate Prediction

CPR: Understanding and Improving Failure Tolerant Training for Deep Learning Recommendation with Partial Recovery

no code implementations5 Nov 2020 Kiwan Maeng, Shivam Bharuka, Isabel Gao, Mark C. Jeffrey, Vikram Saraph, Bor-Yiing Su, Caroline Trippel, Jiyan Yang, Mike Rabbat, Brandon Lucia, Carole-Jean Wu

To the best of our knowledge, this paper is the first to perform a data-driven, in-depth analysis of applying partial recovery to recommendation models, and it identifies a trade-off between accuracy and performance.

Adaptive Dense-to-Sparse Paradigm for Pruning Online Recommendation System with Non-Stationary Data

no code implementations16 Oct 2020 Mao Ye, Dhruv Choudhary, Jiecao Yu, Ellie Wen, Zeliang Chen, Jiyan Yang, Jongsoo Park, Qiang Liu, Arun Kejariwal

To the best of our knowledge, this is the first work to provide in-depth analysis and discussion of applying pruning to online recommendation systems with non-stationary data distribution.

Recommendation Systems

Towards Automated Neural Interaction Discovery for Click-Through Rate Prediction

no code implementations29 Jun 2020 Qingquan Song, Dehua Cheng, Hanning Zhou, Jiyan Yang, Yuandong Tian, Xia Hu

Click-Through Rate (CTR) prediction is one of the most important machine learning tasks in recommender systems, driving personalized experience for billions of consumers.

Click-Through Rate Prediction Learning-To-Rank +2

Deep Learning Training in Facebook Data Centers: Design of Scale-up and Scale-out Systems

no code implementations20 Mar 2020 Maxim Naumov, John Kim, Dheevatsa Mudigere, Srinivas Sridharan, Xiaodong Wang, Whitney Zhao, Serhat Yilmaz, Changkyu Kim, Hector Yuen, Mustafa Ozdal, Krishnakumar Nair, Isabel Gao, Bor-Yiing Su, Jiyan Yang, Mikhail Smelyanskiy

Large-scale training is important to ensure high performance and accuracy of machine-learning models.

Distributed, Parallel, and Cluster Computing (MSC: 68T05, 68M10; ACM: H.3.3, I.2.6, C.2.1)

Post-Training 4-bit Quantization on Embedding Tables

no code implementations5 Nov 2019 Hui Guan, Andrey Malevich, Jiyan Yang, Jongsoo Park, Hector Yuen

Continuous representations have been widely adopted in recommender systems where a large number of entities are represented using embedding vectors.
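The idea can be sketched with row-wise asymmetric quantization: each embedding row keeps a float scale and bias, and its values are mapped to 4-bit integers in [0, 15]. This is a minimal illustration under those assumptions, not necessarily the paper's exact scheme; all names below are hypothetical.

```python
import numpy as np

def quantize_rows_4bit(table):
    """Row-wise asymmetric 4-bit quantization: per-row float scale and
    bias, values mapped to integers in [0, 15]."""
    mins = table.min(axis=1, keepdims=True)
    maxs = table.max(axis=1, keepdims=True)
    scales = (maxs - mins) / 15.0
    scales[scales == 0] = 1.0  # guard constant rows against divide-by-zero
    q = np.clip(np.round((table - mins) / scales), 0, 15).astype(np.uint8)
    return q, scales, mins

def dequantize_rows_4bit(q, scales, mins):
    return q.astype(np.float32) * scales + mins

rng = np.random.default_rng(0)
emb = rng.normal(size=(1000, 64)).astype(np.float32)
q, s, m = quantize_rows_4bit(emb)
recon = dequantize_rows_4bit(q, s, m)
max_err = np.abs(recon - emb).max()
# rounding error is at most half the per-row quantization step
assert max_err <= s.max() / 2 + 1e-6
```

Against float32, this stores 4 bits per value plus two floats per row, roughly an 8x reduction for wide rows.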

Quantization Recommendation Systems

Mixed Dimension Embeddings with Application to Memory-Efficient Recommendation Systems

6 code implementations25 Sep 2019 Antonio Ginart, Maxim Naumov, Dheevatsa Mudigere, Jiyan Yang, James Zou

Embedding representations power machine intelligence in many applications, including recommendation systems, but they are space intensive -- potentially occupying hundreds of gigabytes in large-scale settings.
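A minimal sketch of the mixed-dimension idea, assuming a frequency-based blocking of categories: popular blocks get wide embeddings, rare blocks get narrow ones, and per-block projection matrices lift everything to a common base dimension. Block sizes, dimensions, and names are illustrative, not the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)

base_dim = 16
block_sizes = [100, 1000, 10000]   # categories per block, head to tail
block_dims = [16, 8, 4]            # wider embeddings for popular blocks

tables = [rng.normal(size=(n, d)).astype(np.float32)
          for n, d in zip(block_sizes, block_dims)]
# per-block projections lift narrow embeddings to the common base_dim
projections = [rng.normal(size=(d, base_dim)).astype(np.float32)
               for d in block_dims]

offsets = np.cumsum([0] + block_sizes[:-1])  # [0, 100, 1100]

def lookup(idx):
    """Return the base_dim embedding for a global category index."""
    block = int(np.searchsorted(offsets, idx, side="right")) - 1
    local = idx - offsets[block]
    return tables[block][local] @ projections[block]

mixed_params = sum(n * d for n, d in zip(block_sizes, block_dims))
full_params = sum(block_sizes) * base_dim
assert lookup(150).shape == (base_dim,)
assert mixed_params < full_params  # 49,600 vs 177,600 floats
```

The savings come from the long tail: rare categories, which dominate the row count, pay only for a narrow vector plus a shared projection.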

Click-Through Rate Prediction Collaborative Filtering +1

Compositional Embeddings Using Complementary Partitions for Memory-Efficient Recommendation Systems

6 code implementations4 Sep 2019 Hao-Jun Michael Shi, Dheevatsa Mudigere, Maxim Naumov, Jiyan Yang

We propose a novel approach for reducing the embedding size in an end-to-end fashion by exploiting complementary partitions of the category set to produce a unique embedding vector for each category without explicit definition.
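The quotient-remainder construction can be sketched as follows: the category index is split into a quotient and a remainder with respect to a modulus m, each part indexes a small table, and the two rows are composed (element-wise product here; other composition operations are possible) into a unique vector per category. Table sizes and the dimension below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

num_categories = 1_000_000
m = 1000                     # remainder table size
dim = 32

quotient_table = rng.normal(size=((num_categories + m - 1) // m, dim))
remainder_table = rng.normal(size=(m, dim))

def qr_embedding(idx):
    # each category gets a unique (quotient, remainder) pair, so the
    # composed vector is unique without a full num_categories x dim table
    return quotient_table[idx // m] * remainder_table[idx % m]

a, b = qr_embedding(123_456), qr_embedding(123_457)
assert a.shape == (dim,) and not np.allclose(a, b)
# memory: 2,000 x 32 stored rows stand in for 1,000,000 x 32
```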

Recommendation Systems

Matrix Factorization at Scale: a Comparison of Scientific Data Analytics in Spark and C+MPI Using Three Case Studies

1 code implementation5 Jul 2016 Alex Gittens, Aditya Devarakonda, Evan Racah, Michael Ringenburg, Lisa Gerhardt, Jey Kottalam, Jialin Liu, Kristyn Maschhoff, Shane Canon, Jatin Chhugani, Pramod Sharma, Jiyan Yang, James Demmel, Jim Harrell, Venkat Krishnamurthy, Michael W. Mahoney, Prabhat

We explore the trade-offs of performing linear algebra using Apache Spark, compared to traditional C and MPI implementations on HPC platforms.

Distributed, Parallel, and Cluster Computing (ACM: G.1.3, C.2.4)

Sub-sampled Newton Methods with Non-uniform Sampling

no code implementations NeurIPS 2016 Peng Xu, Jiyan Yang, Farbod Roosta-Khorasani, Christopher Ré, Michael W. Mahoney

As second-order methods prove effective in finding a minimizer to high precision, in this work we propose randomized Newton-type algorithms that exploit non-uniform sub-sampling of $\{\nabla^2 f_i(w)\}_{i=1}^{n}$, as well as inexact updates, to reduce the computational complexity.
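For intuition, here is a toy sketch on least squares, where $\nabla^2 f_i(w) = a_i a_i^T$: Hessian terms are sub-sampled with probabilities proportional to $\|a_i\|^2$ and importance-weighted so the estimate stays unbiased. The sampling scheme, sample size, and iteration count are illustrative, not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5000, 20
A = rng.normal(size=(n, d))
x_true = rng.normal(size=d)
b = A @ x_true               # consistent least-squares problem

# non-uniform sampling: probability proportional to the Hessian "mass"
# ||a_i||^2 that row i contributes to (1/n) sum_i a_i a_i^T
p = np.linalg.norm(A, axis=1) ** 2
p = p / p.sum()

w = np.zeros(d)
for _ in range(15):
    grad = A.T @ (A @ w - b) / n
    S = rng.choice(n, size=500, p=p)
    # importance-weighted sub-sampled Hessian, unbiased for (1/n) A^T A
    H = (A[S].T * (1.0 / (n * p[S]))) @ A[S] / len(S)
    w = w - np.linalg.solve(H + 1e-8 * np.eye(d), grad)

assert np.linalg.norm(w - x_true) < 1e-2
```

Each iteration touches only 500 of the 5,000 Hessian terms, yet the Newton steps still contract quickly because the sub-sampled Hessian concentrates around the true one.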

Second-order methods

Tensor machines for learning target-specific polynomial features

no code implementations7 Apr 2015 Jiyan Yang, Alex Gittens

Recent years have demonstrated that using random feature maps can significantly decrease the training and testing times of kernel-based algorithms without significantly lowering their accuracy.
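The random feature maps referenced here can be illustrated with the standard random Fourier features for the Gaussian kernel; the construction below is that common sketch, not this paper's tensor-machine method, and all parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, D, gamma = 5, 4000, 0.5

# frequencies drawn from the Fourier transform of the Gaussian kernel
W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, D))
b = rng.uniform(0, 2 * np.pi, size=D)

def rff(X):
    """Map X so that rff(X) @ rff(Y).T approximates exp(-gamma ||x-y||^2)."""
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

X = rng.normal(size=(100, d))
K_approx = rff(X) @ rff(X).T
K_exact = np.exp(-gamma * np.sum((X[:, None] - X[None]) ** 2, axis=-1))
assert np.abs(K_approx - K_exact).max() < 0.15
```

Note that `W` and `b` are drawn once and reused, so training and test points share the same feature map; a linear model on `rff(X)` then approximates kernel ridge regression or a kernel SVM at a fraction of the cost.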

Weighted SGD for $\ell_p$ Regression with Randomized Preconditioning

no code implementations12 Feb 2015 Jiyan Yang, Yin-Lam Chow, Christopher Ré, Michael W. Mahoney

We aim to bridge the gap between these two methods in solving constrained overdetermined linear regression problems, e.g., $\ell_2$ and $\ell_1$ regression problems.

regression

Implementing Randomized Matrix Algorithms in Parallel and Distributed Environments

no code implementations10 Feb 2015 Jiyan Yang, Xiangrui Meng, Michael W. Mahoney

We demonstrate that $\ell_1$ and $\ell_2$ regression problems can be solved to low, medium, or high precision in existing distributed systems on up to terabyte-sized data.

regression

Quasi-Monte Carlo Feature Maps for Shift-Invariant Kernels

no code implementations29 Dec 2014 Haim Avron, Vikas Sindhwani, Jiyan Yang, Michael Mahoney

These approximate feature maps arise as Monte Carlo approximations to integral representations of shift-invariant kernel functions (e.g., the Gaussian kernel).

Random Laplace Feature Maps for Semigroup Kernels on Histograms

no code implementations CVPR 2014 Jiyan Yang, Vikas Sindhwani, Quanfu Fan, Haim Avron, Michael W. Mahoney

With the goal of accelerating the training and testing complexity of nonlinear kernel methods, several recent papers have proposed explicit embeddings of the input data into low-dimensional feature spaces, where fast linear methods can instead be used to generate approximate solutions.

Event Detection Image Classification

Quantile Regression for Large-scale Applications

no code implementations1 May 2013 Jiyan Yang, Xiangrui Meng, Michael W. Mahoney

Our empirical evaluation illustrates that our algorithm is competitive with the best previous work on small to medium-sized problems, and that in addition it can be implemented in MapReduce-like environments and applied to terabyte-sized problems.
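As background for the problem being scaled up, quantile regression on a small instance can be solved exactly as a linear program; the paper's contribution is making this tractable at terabyte scale, so the LP below is only the small-scale baseline, with hypothetical data.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, tau = 500, 0.9
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + 1 feature
y = X @ np.array([0.0, 1.5]) + rng.normal(size=n)      # N(0,1) noise

# LP formulation of the tau-quantile pinball loss:
#   min  sum_i tau*u_i + (1-tau)*v_i
#   s.t. X w + u - v = y,  u >= 0, v >= 0
d = X.shape[1]
c = np.concatenate([np.zeros(d), tau * np.ones(n), (1 - tau) * np.ones(n)])
A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
res = linprog(c, A_eq=A_eq, b_eq=y,
              bounds=[(None, None)] * d + [(0, None)] * (2 * n),
              method="highs")
w_hat = res.x[:d]

# the intercept absorbs the 0.9-quantile of the noise (about 1.28),
# and roughly 90% of the points lie below the fitted line
assert abs(w_hat[0] - 1.2816) < 0.3
assert abs(np.mean(y <= X @ w_hat) - tau) < 0.05
```

The LP has 2n + d variables, which is why direct solvers stop scaling; sampling-based methods reduce the data to a small weighted subproblem of this form.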

regression
