We describe efforts to adapt the Tesseract open source OCR engine for multiple scripts and languages.
Using wide residual networks as our main baseline, our approach simplifies existing methods that binarize weights by applying the sign function in training; we apply scaling factors for each layer with constant unlearned values equal to the layer-specific standard deviations used for initialization.
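As a rough illustration of the weight binarization described in the entry above, the following minimal sketch maps each full-precision weight to its sign and multiplies by a fixed per-layer constant. The entry does not say which initialization scheme is used, so the He-normal standard deviation sqrt(2 / fan_in) is an assumption here, as are the helper names `he_std` and `binarize_layer` and the choice to map a weight of exactly zero to +1. Only the forward mapping is shown, not a training loop.

```python
import math
import random


def he_std(fan_in):
    """Standard deviation of He-normal initialization (assumed scheme)."""
    return math.sqrt(2.0 / fan_in)


def binarize_layer(weights, fan_in):
    """Replace each weight by sign(w) times a constant per-layer scale.

    The scale is computed once from the layer shape and is never learned.
    Zero weights are mapped to +scale (an assumption of this sketch).
    """
    scale = he_std(fan_in)
    return [[scale if w >= 0 else -scale for w in row] for row in weights]


random.seed(0)
fan_in, fan_out = 8, 4

# Full-precision weights drawn with the same standard deviation as the scale.
full_precision = [
    [random.gauss(0.0, he_std(fan_in)) for _ in range(fan_in)]
    for _ in range(fan_out)
]
binary = binarize_layer(full_precision, fan_in)
```

Because the scale equals the initialization standard deviation, every binarized weight in a layer has that magnitude, and only the signs need to be stored.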
We propose Parallel WaveGAN, a distillation-free, fast, and small-footprint waveform generation method using a generative adversarial network.
This paper introduces Random Multimodel Deep Learning (RMDL): a new ensemble, deep learning approach for classification.
Ranked #1 on Document Classification on WOS-11967
In this paper, we propose deep learning-based assessment models to predict human ratings of converted speech.
We introduce a new function-preserving transformation for efficient neural architecture search.
Because of its high computational efficiency, our detector can process a 4K Ultra HD video stream in real time (up to 27 fps) on mobile platforms (Intel Ivy Bridge CPUs and Nvidia Kepler GPUs) when searching for objects of 60x60 pixels or larger.
The intuitions are that ads shown together may influence each other, clicked ads reflect a user's preferences, and unclicked ads may indicate what a user dislikes to a certain extent.
Ranked #1 on Click-Through Rate Prediction on Avito
Two semimetrics on probability distributions are proposed, given as the sum of differences of expectations of analytic functions evaluated at spatial or frequency locations (i.e., features).
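To make the spatial-location variant of the entry above more concrete, here is a minimal one-dimensional sketch that compares two samples by evaluating an analytic feature function at a few fixed locations and accumulating the differences of the empirical means. The Gaussian feature, the squaring of each difference, the averaging over locations, and the names `gaussian_feature` and `mean_embedding_distance` are all assumptions made for illustration. The entry itself does not specify them.

```python
import math
import random


def gaussian_feature(x, t, bandwidth=1.0):
    """Analytic feature of x evaluated at location t (assumed Gaussian)."""
    return math.exp(-((x - t) ** 2) / (2.0 * bandwidth ** 2))


def mean_embedding_distance(xs, ys, locations, bandwidth=1.0):
    """Average squared difference of empirical feature means over locations."""
    total = 0.0
    for t in locations:
        mean_x = sum(gaussian_feature(x, t, bandwidth) for x in xs) / len(xs)
        mean_y = sum(gaussian_feature(y, t, bandwidth) for y in ys) / len(ys)
        total += (mean_x - mean_y) ** 2
    return total / len(locations)


random.seed(0)
sample_p = [random.gauss(0.0, 1.0) for _ in range(200)]
sample_q = [random.gauss(3.0, 1.0) for _ in range(200)]
locations = [-1.0, 0.0, 1.0, 2.0, 3.0]

same = mean_embedding_distance(sample_p, sample_p, locations)
different = mean_embedding_distance(sample_p, sample_q, locations)
```

The cost is linear in the sample size for a fixed number of locations. Comparing a sample with itself gives exactly zero, while samples from clearly different distributions give a positive value.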