Search Results for author: Claire Cui

Found 13 papers, 2 papers with code

Capabilities of Gemini Models in Medicine

no code implementations • 29 Apr 2024 • Khaled Saab, Tao Tu, Wei-Hung Weng, Ryutaro Tanno, David Stutz, Ellery Wulczyn, Fan Zhang, Tim Strother, Chunjong Park, Elahe Vedadi, Juanma Zambrano Chaves, Szu-Yeu Hu, Mike Schaekermann, Aishwarya Kamath, Yong Cheng, David G. T. Barrett, Cathy Cheung, Basil Mustafa, Anil Palepu, Daniel McDuff, Le Hou, Tomer Golany, Luyang Liu, Jean-Baptiste Alayrac, Neil Houlsby, Nenad Tomasev, Jan Freyberg, Charles Lau, Jonas Kemp, Jeremy Lai, Shekoofeh Azizi, Kimberly Kanada, SiWai Man, Kavita Kulkarni, Ruoxi Sun, Siamak Shakeri, Luheng He, Ben Caine, Albert Webson, Natasha Latysheva, Melvin Johnson, Philip Mansfield, Jian Lu, Ehud Rivlin, Jesper Anderson, Bradley Green, Renee Wong, Jonathan Krause, Jonathon Shlens, Ewa Dominowska, S. M. Ali Eslami, Katherine Chou, Claire Cui, Oriol Vinyals, Koray Kavukcuoglu, James Manyika, Jeff Dean, Demis Hassabis, Yossi Matias, Dale Webster, Joelle Barral, Greg Corrado, Christopher Semturs, S. Sara Mahdavi, Juraj Gottweis, Alan Karthikesalingam, Vivek Natarajan

We evaluate Med-Gemini on 14 medical benchmarks, establishing new state-of-the-art (SoTA) performance on 10 of them, and surpass the GPT-4 model family on every benchmark where a direct comparison is viable, often by a wide margin.

In-Context Learning Question Answering +2

Paper
Add Code

Learning to Skip for Language Modeling

no code implementations • 26 Nov 2023 • Dewen Zeng, Nan Du, Tao Wang, Yuanzhong Xu, Tao Lei, Zhifeng Chen, Claire Cui

Overparameterized large-scale language models have impressive generalization performance of in-context few-shot learning.

Few-Shot Learning Language Modelling

Paper
Add Code

Brainformers: Trading Simplicity for Efficiency

no code implementations • 29 May 2023 • Yanqi Zhou, Nan Du, Yanping Huang, Daiyi Peng, Chang Lan, Da Huang, Siamak Shakeri, David So, Andrew Dai, Yifeng Lu, Zhifeng Chen, Quoc Le, Claire Cui, James Laudon, Jeff Dean

Using this insight, we develop a complex block, named Brainformer, that consists of a diverse sets of layers such as sparsely gated feed-forward layers, dense feed-forward layers, attention layers, and various forms of layer normalization and activation functions.

Paper
Add Code

MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks

1 code implementation • 29 Mar 2023 • Weicheng Kuo, AJ Piergiovanni, Dahun Kim, Xiyang Luo, Ben Caine, Wei Li, Abhijit Ogale, Luowei Zhou, Andrew Dai, Zhifeng Chen, Claire Cui, Anelia Angelova

We propose a novel paradigm of training with a decoder-only model for multimodal tasks, which is surprisingly effective in jointly learning of these disparate vision-language tasks.

Ranked #1 on Video Captioning on MSVD

Cross-Modal Retrieval Decoder +8

Paper
Code

Mind's Eye: Grounded Language Model Reasoning through Simulation

no code implementations • 11 Oct 2022 • Ruibo Liu, Jason Wei, Shixiang Shane Gu, Te-Yen Wu, Soroush Vosoughi, Claire Cui, Denny Zhou, Andrew M. Dai

By training solely on written text, current language models (LMs) miss the grounded experience of humans in the real-world -- their failure to relate language to the physical world causes knowledge to be misrepresented and obvious mistakes in their reasoning.

Language Modelling

Paper
Add Code

Least-to-Most Prompting Enables Complex Reasoning in Large Language Models

no code implementations • 21 May 2022 • Denny Zhou, Nathanael Schärli, Le Hou, Jason Wei, Nathan Scales, Xuezhi Wang, Dale Schuurmans, Claire Cui, Olivier Bousquet, Quoc Le, Ed Chi

Chain-of-thought prompting has demonstrated remarkable performance on various natural language reasoning tasks.

Ranked #97 on Arithmetic Reasoning on GSM8K

Arithmetic Reasoning Math

Paper
Add Code

LaMDA: Language Models for Dialog Applications

2 code implementations • 20 Jan 2022 • Romal Thoppilan, Daniel De Freitas, Jamie Hall, Noam Shazeer, Apoorv Kulshreshtha, Heng-Tze Cheng, Alicia Jin, Taylor Bos, Leslie Baker, Yu Du, Yaguang Li, Hongrae Lee, Huaixiu Steven Zheng, Amin Ghafouri, Marcelo Menegali, Yanping Huang, Maxim Krikun, Dmitry Lepikhin, James Qin, Dehao Chen, Yuanzhong Xu, Zhifeng Chen, Adam Roberts, Maarten Bosma, Vincent Zhao, Yanqi Zhou, Chung-Ching Chang, Igor Krivokon, Will Rusch, Marc Pickett, Pranesh Srinivasan, Laichee Man, Kathleen Meier-Hellstern, Meredith Ringel Morris, Tulsee Doshi, Renelito Delos Santos, Toju Duke, Johnny Soraker, Ben Zevenbergen, Vinodkumar Prabhakaran, Mark Diaz, Ben Hutchinson, Kristen Olson, Alejandra Molina, Erin Hoffman-John, Josh Lee, Lora Aroyo, Ravi Rajakumar, Alena Butryna, Matthew Lamm, Viktoriya Kuzmina, Joe Fenton, Aaron Cohen, Rachel Bernstein, Ray Kurzweil, Blaise Aguera-Arcas, Claire Cui, Marian Croak, Ed Chi, Quoc Le

We demonstrate that fine-tuning with annotated data and enabling the model to consult external knowledge sources can lead to significant improvements towards the two key challenges of safety and factual grounding.

Ranked #114 on Code Generation on HumanEval

Code Generation Information Retrieval +2

458

Paper
Code

GLaM: Efficient Scaling of Language Models with Mixture-of-Experts

no code implementations • 13 Dec 2021 • Nan Du, Yanping Huang, Andrew M. Dai, Simon Tong, Dmitry Lepikhin, Yuanzhong Xu, Maxim Krikun, Yanqi Zhou, Adams Wei Yu, Orhan Firat, Barret Zoph, Liam Fedus, Maarten Bosma, Zongwei Zhou, Tao Wang, Yu Emma Wang, Kellie Webster, Marie Pellat, Kevin Robinson, Kathleen Meier-Hellstern, Toju Duke, Lucas Dixon, Kun Zhang, Quoc V Le, Yonghui Wu, Zhifeng Chen, Claire Cui

Scaling language models with more data, compute and parameters has driven significant progress in natural language processing.

Ranked #10 on Language Modelling on LAMBADA

Common Sense Reasoning In-Context Learning +2

Paper
Add Code

Beyond Point Estimate: Inferring Ensemble Prediction Variation from Neuron Activation Strength in Recommender Systems

no code implementations • 17 Aug 2020 • Zhe Chen, Yuyan Wang, Dong Lin, Derek Zhiyuan Cheng, Lichan Hong, Ed H. Chi, Claire Cui

Despite deep neural network (DNN)'s impressive prediction performance in various domains, it is well known now that a set of DNN models trained with the same model specification and the same data can produce very different prediction results.

Model-based Reinforcement Learning Recommendation Systems

Paper
Add Code

Deep Physiological State Space Model for Clinical Forecasting

no code implementations • 4 Dec 2019 • Yuan Xue, Denny Zhou, Nan Du, Andrew Dai, Zhen Xu, Kun Zhang, Claire Cui

Clinical forecasting based on electronic medical records (EMR) can uncover the temporal correlations between patients' conditions and outcomes from sequences of longitudinal clinical measurements.

Paper
Add Code

Modelling EHR timeseries by restricting feature interaction

no code implementations • 14 Nov 2019 • Kun Zhang, Yuan Xue, Gerardo Flores, Alvin Rajkomar, Claire Cui, Andrew M. Dai

Time series data are prevalent in electronic health records, mostly in the form of physiological parameters such as vital signs and lab tests.

Mortality Prediction Time Series +1

Paper
Add Code

Explaining an increase in predicted risk for clinical alerts

no code implementations • 10 Jul 2019 • Michaela Hardt, Alvin Rajkomar, Gerardo Flores, Andrew Dai, Michael Howell, Greg Corrado, Claire Cui, Moritz Hardt

We consider explanations in a temporal setting where a stateful dynamical model produces a sequence of risk estimates given an input at each time step.

Attribute

Paper
Add Code

Scalable and accurate deep learning for electronic health records

no code implementations • 24 Jan 2018 • Alvin Rajkomar, Eyal Oren, Kai Chen, Andrew M. Dai, Nissan Hajaj, Peter J. Liu, Xiaobing Liu, Mimi Sun, Patrik Sundberg, Hector Yee, Kun Zhang, Gavin E. Duggan, Gerardo Flores, Michaela Hardt, Jamie Irvine, Quoc Le, Kurt Litsch, Jake Marcus, Alexander Mossin, Justin Tansuwan, De Wang, James Wexler, Jimbo Wilson, Dana Ludwig, Samuel L. Volchenboum, Katherine Chou, Michael Pearson, Srinivasan Madabushi, Nigam H. Shah, Atul J. Butte, Michael Howell, Claire Cui, Greg Corrado, Jeff Dean

Predictive modeling with electronic health record (EHR) data is anticipated to drive personalized medicine and improve healthcare quality.

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.