1 code implementation • 1 Dec 2023 • Satya Sai Srinath Namburi, Makesh Sreedhar, Srinath Srinivasan, Frederic Sala
Two standard compression techniques are pruning and quantization, with the former eliminating redundant connections in model layers and the latter representing model parameters with fewer bits.
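As a rough illustration of the two techniques named above, here is a minimal sketch in plain Python. It assumes unstructured magnitude pruning and symmetric uniform quantization with a single shared scale; these are common textbook variants chosen for illustration, not necessarily the methods evaluated in the paper. The function names `magnitude_prune`, `quantize_uniform`, and `dequantize` are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)  # number of weights to remove
    if k == 0:
        return list(weights)
    # Indices of the k smallest-magnitude weights.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_uniform(weights, bits):
    """Map weights to signed integer codes of `bits` bits using one shared scale."""
    qmax = 2 ** (bits - 1) - 1  # e.g. 7 for 4 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from integer codes."""
    return [c * scale for c in codes]


# Toy weight vector (illustrative values only).
weights = [0.9, -0.05, 0.3, 0.01, -0.6, 0.2]
pruned = magnitude_prune(weights, sparsity=0.5)
codes, scale = quantize_uniform(pruned, bits=4)
restored = dequantize(codes, scale)
```

Pruning removes connections outright, while quantization keeps every remaining weight but stores it as a low-bit integer code plus one floating-point scale, so the two can be combined as shown.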
1 code implementation • 16 Oct 2023 • Traian Rebedea, Razvan Dinu, Makesh Sreedhar, Christopher Parisien, Jonathan Cohen
NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.
no code implementations • 1 Jul 2023 • Gowtham Ramesh, Makesh Sreedhar, Junjie Hu
Recent generative approaches for multi-hop question answering (QA) utilize the fusion-in-decoder method (Izacard and Grave, 2021) to generate a single sequence output which includes both a final answer and a reasoning path taken to arrive at that answer, such as passage titles and key facts from those passages.