no code implementations • 22 Apr 2024 • Marah Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Parul Chopra, Allie Del Giorno, Gustavo de Rosa, Matthew Dixon, Ronen Eldan, Dan Iter, Amit Garg, Abhishek Goswami, Suriya Gunasekar, Emman Haider, Junheng Hao, Russell J. Hewett, Jamie Huynh, Mojan Javaheripi, Xin Jin, Piero Kauffmann, Nikos Karampatziakis, Dongwoo Kim, Mahoud Khademi, Lev Kurilenko, James R. Lee, Yin Tat Lee, Yuanzhi Li, Chen Liang, Weishung Liu, Eric Lin, Zeqi Lin, Piyush Madan, Arindam Mitra, Hardik Modi, Anh Nguyen, Brandon Norick, Barun Patra, Daniel Perez-Becker, Thomas Portet, Reid Pryzant, Heyang Qin, Marko Radmilac, Corby Rosset, Sambudha Roy, Olatunji Ruwase, Olli Saarikivi, Amin Saied, Adil Salim, Michael Santacroce, Shital Shah, Ning Shang, Hiteshi Sharma, Xia Song, Masahiro Tanaka, Xin Wang, Rachel Ward, Guanhua Wang, Philipp Witte, Michael Wyatt, Can Xu, Jiahang Xu, Sonali Yadav, Fan Yang, ZiYi Yang, Donghan Yu, Chengruidong Zhang, Cyril Zhang, Jianwen Zhang, Li Lyna Zhang, Yi Zhang, Yue Zhang, Yunan Zhang, Xiren Zhou
We introduce phi-3-mini, a 3. 8 billion parameter language model trained on 3. 3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3. 5 (e. g., phi-3-mini achieves 69% on MMLU and 8. 38 on MT-bench), despite being small enough to be deployed on a phone.
no code implementations • 20 Jun 2023 • Suriya Gunasekar, Yi Zhang, Jyoti Aneja, Caio César Teodoro Mendes, Allie Del Giorno, Sivakanth Gopi, Mojan Javaheripi, Piero Kauffmann, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah, Harkirat Singh Behl, Xin Wang, Sébastien Bubeck, Ronen Eldan, Adam Tauman Kalai, Yin Tat Lee, Yuanzhi Li
Despite this small scale, phi-1 attains pass@1 accuracy 50. 6% on HumanEval and 55. 5% on MBPP.
Ranked #42 on Code Generation on HumanEval
2 code implementations • 4 Oct 2022 • Chandan Singh, John X. Morris, Jyoti Aneja, Alexander M. Rush, Jianfeng Gao
Large language models (LLMs) have displayed an impressive ability to harness natural language to perform complex tasks.
8 code implementations • 19 Apr 2022 • Chunyuan Li, Haotian Liu, Liunian Harold Li, Pengchuan Zhang, Jyoti Aneja, Jianwei Yang, Ping Jin, Houdong Hu, Zicheng Liu, Yong Jae Lee, Jianfeng Gao
In general, these language-augmented visual models demonstrate strong transferability to a variety of datasets and tasks.
Ranked #1 on Object Detection on ELEVATER
no code implementations • NeurIPS 2021 • Jyoti Aneja, Alexander Schwing, Jan Kautz, Arash Vahdat
To tackle this issue, we propose an energy-based prior defined by the product of a base prior distribution and a reweighting factor, designed to bring the base closer to the aggregate posterior.
Ranked #6 on Image Generation on CelebA 256x256 (FID metric)
no code implementations • 28 Sep 2020 • Jyoti Aneja, Alex Schwing, Jan Kautz, Arash Vahdat
To tackle this issue, we propose an energy-based prior defined by the product of a base prior distribution and a reweighting factor, designed to bring the base closer to the aggregate posterior.
no code implementations • ICCV 2019 • Jyoti Aneja, Harsh Agrawal, Dhruv Batra, Alexander Schwing
We encourage this temporal latent space to capture the 'intention' about how to complete the sentence by mimicking a representation which summarizes the future.
no code implementations • CVPR 2019 • Aditya Deshpande, Jyoti Aneja, Li-Wei Wang, Alexander Schwing, D. A. Forsyth
We achieve the trifecta: (1) High accuracy for the diverse captions as evaluated by standard captioning metrics and user studies; (2) Faster computation of diverse captions compared to beam search and diverse beam search; and (3) High diversity as evaluated by counting novel sentences, distinct n-grams and mutual overlap (i. e., mBleu-4) scores.
4 code implementations • CVPR 2018 • Jyoti Aneja, Aditya Deshpande, Alexander Schwing
In recent years significant progress has been made in image captioning, using Recurrent Neural Networks powered by long-short-term-memory (LSTM) units.