Search Results for author: Joel Shor

Found 18 papers, 5 papers with code

Predicting Generalization of AI Colonoscopy Models to Unseen Data

no code implementations • 14 Mar 2024 • Joel Shor, Carson McNeil, Yotam Intrator, Joseph R Ledsam, Hiro-o Yamano, Daisuke Tsurumaru, Hiroki Kayama, Atsushi Hamabe, Koji Ando, Mitsuhiko Ota, Haruei Ogino, Hiroshi Nakase, Kaho Kobayashi, Masaaki Miyo, Eiji Oki, Ichiro Takemasa, Ehud Rivlin, Roman Goldenberg

We test the ability of MSN (Masked Siamese Networks) to be trained on data only from Israel and to detect unseen techniques, narrow-band imaging (NBI) and chromoendoscopy (CE), on colonoscopes from Japan (354 videos, 128 hours).

The unreasonable effectiveness of AI CADe polyp detectors to generalize to new countries

no code implementations • 11 Dec 2023 • Joel Shor, Hiro-o Yamano, Daisuke Tsurumaru, Yotam Intrator, Hiroki Kayama, Joe Ledsam, Atsushi Hamabe, Koji Ando, Mitsuhiko Ota, Haruei Ogino, Hiroshi Nakase, Kaho Kobayashi, Eiji Oki, Roman Goldenberg, Ehud Rivlin, Ichiro Takemasa

Conclusion: Differences that prevent CADe detectors from performing well in non-medical settings do not degrade the performance of our AI CADe polyp detector when applied to data from a new country.

Clinical BERTScore: An Improved Measure of Automatic Speech Recognition Performance in Clinical Settings

no code implementations • 10 Mar 2023 • Joel Shor, Ruyue Agnes Bi, Subhashini Venugopalan, Steven Ibara, Roman Goldenberg, Ehud Rivlin

We demonstrate that this metric more closely aligns with clinician preferences on medical sentences as compared to other metrics (WER, BLEU, METEOR, etc.), sometimes by wide margins.

Automatic Speech Recognition (ASR) +1
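
The clinical weighting is the paper's contribution; the backbone is standard BERTScore, which the open-source `bert-score` package computes. A minimal sketch of the vanilla metric on a pair of illustrative sentences (no clinical weighting applied):

```python
# Standard BERTScore between an ASR hypothesis and a reference transcript,
# using the open-source `bert-score` package (pip install bert-score).
# The paper's Clinical BERTScore additionally weights clinically important
# words by clinician preference; that weighting is not shown here.
from bert_score import score

hypotheses = ["the patient was given ten milligrams of morphine"]
references = ["the patient was given two milligrams of morphine"]

# P, R, F1 are tensors with one entry per hypothesis/reference pair.
P, R, F1 = score(hypotheses, references, lang="en", verbose=False)
print(f"BERTScore F1: {F1.mean().item():.4f}")
```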

The Need for Medically Aware Video Compression in Gastroenterology

no code implementations • 2 Nov 2022 • Joel Shor, Nick Johnston

Compression is essential to storing and transmitting medical videos, but the effect of compression on downstream medical tasks is often ignored.

Video Compression

TRILLsson: Distilled Universal Paralinguistic Speech Representations

no code implementations • 1 Mar 2022 • Joel Shor, Subhashini Venugopalan

Our largest distilled model is less than 15% of the size of the original model (314MB vs. 2.2GB), achieves over 96% of the accuracy on 6 of 7 tasks, and is trained on 6.5% of the data.

Emotion Recognition · Knowledge Distillation
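
At a high level, this kind of distillation trains a small student to regress the frozen teacher's embeddings directly from audio. A minimal PyTorch sketch with stand-in models and illustrative dimensions, not the paper's architectures:

```python
import torch
import torch.nn as nn

EMB_DIM = 1024  # illustrative embedding size

teacher = nn.Linear(16000, EMB_DIM).eval()   # stand-in for the large frozen teacher
for p in teacher.parameters():
    p.requires_grad_(False)

student = nn.Sequential(                     # stand-in for the much smaller student
    nn.Linear(16000, 256), nn.ReLU(), nn.Linear(256, EMB_DIM))

opt = torch.optim.Adam(student.parameters(), lr=1e-4)
audio = torch.randn(8, 16000)                # a batch of 1 s, 16 kHz clips
with torch.no_grad():
    target = teacher(audio)                  # teacher embeddings are the regression targets
loss = nn.functional.mse_loss(student(audio), target)
opt.zero_grad(); loss.backward(); opt.step()
print(f"distillation loss: {loss.item():.4f}")
```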

Universal Paralinguistic Speech Representations Using Self-Supervised Conformers

no code implementations • 9 Oct 2021 • Joel Shor, Aren Jansen, Wei Han, Daniel Park, Yu Zhang

Many speech applications require understanding aspects beyond the words being spoken, such as recognizing emotion, detecting whether the speaker is wearing a mask, or distinguishing real from synthetic speech.

Towards Learning a Universal Non-Semantic Representation of Speech

1 code implementation • 25 Feb 2020 • Joel Shor, Aren Jansen, Ronnie Maor, Oran Lang, Omry Tuval, Felix de Chaumont Quitry, Marco Tagliasacchi, Ira Shavitt, Dotan Emanuel, Yinnon Haviv

The ultimate goal of transfer learning is to reduce labeled data requirements by exploiting a pre-existing embedding model trained for different datasets or tasks.

Transfer Learning
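
The transfer-learning recipe the benchmark evaluates freezes the pretrained embedding and trains only a small head on labeled downstream data. A minimal PyTorch sketch; the embedding module, dimensions, and the 4-class task are stand-ins rather than the released TRILL model:

```python
import torch
import torch.nn as nn

embed = nn.Linear(16000, 512).eval()         # stand-in for a frozen pretrained embedding
for p in embed.parameters():
    p.requires_grad_(False)

head = nn.Linear(512, 4)                     # small trainable head, e.g. 4 emotion classes
opt = torch.optim.Adam(head.parameters(), lr=1e-3)

audio = torch.randn(32, 16000)               # batch of 1 s, 16 kHz clips
labels = torch.randint(0, 4, (32,))
with torch.no_grad():
    feats = embed(audio)                     # labeled data trains only the head
loss = nn.functional.cross_entropy(head(feats), labels)
opt.zero_grad(); loss.backward(); opt.step()
print(f"probe loss: {loss.item():.4f}")
```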

Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron

2 code implementations • ICML 2018 • RJ Skerry-Ryan, Eric Battenberg, Ying Xiao, Yuxuan Wang, Daisy Stanton, Joel Shor, Ron J. Weiss, Rob Clark, Rif A. Saurous

We present an extension to the Tacotron speech synthesis architecture that learns a latent embedding space of prosody, derived from a reference acoustic representation containing the desired prosody.

Expressive Speech Synthesis

Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis

11 code implementations • ICML 2018 • Yuxuan Wang, Daisy Stanton, Yu Zhang, RJ Skerry-Ryan, Eric Battenberg, Joel Shor, Ying Xiao, Fei Ren, Ye Jia, Rif A. Saurous

In this work, we propose "global style tokens" (GSTs), a bank of embeddings that are jointly trained within Tacotron, a state-of-the-art end-to-end speech synthesis system.

Speech Synthesis · Style Transfer +1
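
Concretely, a GST layer lets the reference-encoder output attend over a learned bank of token embeddings; the attention output is the style embedding that conditions the synthesizer. A minimal PyTorch sketch, with token count, dimensions, and head count chosen for illustration rather than taken from the paper:

```python
import torch
import torch.nn as nn

NUM_TOKENS, TOKEN_DIM, REF_DIM = 10, 256, 128

class GST(nn.Module):
    def __init__(self):
        super().__init__()
        # Learned bank of style tokens, trained jointly with the TTS model.
        self.tokens = nn.Parameter(torch.randn(NUM_TOKENS, TOKEN_DIM))
        self.attn = nn.MultiheadAttention(TOKEN_DIM, num_heads=4, batch_first=True)
        self.query_proj = nn.Linear(REF_DIM, TOKEN_DIM)

    def forward(self, ref_embedding):                     # (batch, REF_DIM)
        q = self.query_proj(ref_embedding).unsqueeze(1)   # (batch, 1, TOKEN_DIM)
        keys = torch.tanh(self.tokens).unsqueeze(0).expand(q.size(0), -1, -1)
        style, _ = self.attn(q, keys, keys)               # attend over the token bank
        return style.squeeze(1)                           # style embedding for the decoder

gst = GST()
ref = torch.randn(8, REF_DIM)                             # reference-encoder outputs
print(gst(ref).shape)                                     # torch.Size([8, 256])
```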

Spatially adaptive image compression using a tiled deep network

no code implementations • 7 Feb 2018 • David Minnen, George Toderici, Michele Covell, Troy Chinen, Nick Johnston, Joel Shor, Sung Jin Hwang, Damien Vincent, Saurabh Singh

Deep neural networks represent a powerful class of function approximators that can learn to compress and reconstruct images.

Image Compression

Target-Quality Image Compression with Recurrent, Convolutional Neural Networks

no code implementations • 18 May 2017 • Michele Covell, Nick Johnston, David Minnen, Sung Jin Hwang, Joel Shor, Saurabh Singh, Damien Vincent, George Toderici

Our methods introduce a multi-pass training method to combine the training goals of high-quality reconstructions in areas around stop-code masking as well as in highly-detailed areas.

Image Compression

Improved Lossy Image Compression with Priming and Spatially Adaptive Bit Rates for Recurrent Networks

no code implementations • CVPR 2018 • Nick Johnston, Damien Vincent, David Minnen, Michele Covell, Saurabh Singh, Troy Chinen, Sung Jin Hwang, Joel Shor, George Toderici

We propose a method for lossy image compression based on recurrent, convolutional neural networks that outperforms BPG (4:2:0), WebP, JPEG2000, and JPEG as measured by MS-SSIM.

Image Compression · MS-SSIM +1
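
MS-SSIM, the quality metric in the comparison above, can be computed with TensorFlow's built-in `tf.image.ssim_multiscale`; in this sketch the random tensors stand in for a reference image and its decoded reconstruction:

```python
import tensorflow as tf

ref = tf.random.uniform((1, 256, 256, 3))   # original image, values in [0, 1]
dec = tf.random.uniform((1, 256, 256, 3))   # reconstructed image
# Returns one MS-SSIM value per image pair; 1.0 means identical images.
msssim = tf.image.ssim_multiscale(ref, dec, max_val=1.0)
print(float(msssim[0]))
```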

Full Resolution Image Compression with Recurrent Neural Networks

7 code implementations • CVPR 2017 • George Toderici, Damien Vincent, Nick Johnston, Sung Jin Hwang, David Minnen, Joel Shor, Michele Covell

As far as we know, this is the first neural network architecture that is able to outperform JPEG at image compression across most bitrates on the rate-distortion curve on the Kodak dataset images, with and without the aid of entropy coding.

Image Compression
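
These recurrent compression models share an iterative residual scheme: each pass encodes what the previous reconstruction missed into a small set of bits and decodes a correction, so more iterations spend more bits for higher quality. A minimal PyTorch sketch with stand-in encoder/decoder layers and a straight-through binarizer, not the paper's architecture:

```python
import torch
import torch.nn as nn

class Binarizer(nn.Module):
    # Straight-through sign: binary codes on the forward pass, identity gradient.
    def forward(self, x):
        return x + (torch.sign(x) - x).detach()

enc = nn.Sequential(nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.Tanh())
dec = nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1)
binarize = Binarizer()

image = torch.rand(1, 3, 32, 32)
residual, recon = image.clone(), torch.zeros_like(image)
for step in range(4):                  # each iteration adds bits and refines quality
    bits = binarize(enc(residual))     # compact +/-1 code for this pass
    recon = recon + dec(bits)          # decoded correction accumulates
    residual = image - recon           # next pass encodes what is still missing
    print(f"step {step}: residual MSE = {residual.pow(2).mean().item():.4f}")
```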
