Search Results for author: Sudarshan Srinivasan

Found 14 papers, 4 papers with code

Leveraging Large Language Models to Extract Information on Substance Use Disorder Severity from Clinical Notes: A Zero-shot Learning Approach

no code implementations • 18 Mar 2024 • Maria Mahbub, Gregory M. Dams, Sudarshan Srinivasan, Caitlin Rizy, Ioana Danciu, Jodie Trafton, Kathryn Knight

SUD identification and treatment depend on a variety of factors such as severity, co-determinants (e. g., withdrawal symptoms), and social determinants of health.

Zero-Shot Learning

Paper
Add Code

Dynamic Q&A of Clinical Documents with Large Language Models

no code implementations • 19 Jan 2024 • Ran Elgedawy, Sudarshan Srinivasan, Ioana Danciu

Electronic health records (EHRs) house crucial patient data in clinical notes.

Chatbot Decision Making +3

Paper
Add Code

Question-Answering System Extracts Information on Injection Drug Use from Clinical Notes

1 code implementation • 15 May 2023 • Maria Mahbub, Ian Goethert, Ioana Danciu, Kathryn Knight, Sudarshan Srinivasan, Suzanne Tamang, Karine Rozenberg-Ben-Dror, Hugo Solares, Susana Martins, Jodie Trafton, Edmon Begoli, Gregory Peterson

We also demonstrate the QA model's ability to extract IDU-related information on temporally out-of-distribution data.

Question Answering

Paper
Code

TACOS: Topology-Aware Collective Algorithm Synthesizer for Distributed Machine Learning

no code implementations • 11 Apr 2023 • William Won, Midhilesh Elavazhagan, Sudarshan Srinivasan, Ajaya Durg, Samvit Kaul, Swati Gupta, Tushar Krishna

To this end, this paper introduces TACOS, an automated synthesizer that generates topology-aware collective algorithms for common distributed machine learning collectives across arbitrary input network topologies.

Paper
Add Code

ASTRA-sim2.0: Modeling Hierarchical Networks and Disaggregated Systems for Large-model Training at Scale

3 code implementations • 24 Mar 2023 • William Won, Taekyung Heo, Saeed Rashidi, Srinivas Sridharan, Sudarshan Srinivasan, Tushar Krishna

In this paper, we extend the open-source ASTRA-sim infrastructure and endow it with the capabilities to model state-of-the-art and emerging distributed training models and platforms.

193

Paper
Code

BioADAPT-MRC: Adversarial Learning-based Domain Adaptation Improves Biomedical Machine Reading Comprehension Task

1 code implementation • 26 Feb 2022 • Maria Mahbub, Sudarshan Srinivasan, Edmon Begoli, Gregory D Peterson

We present an adversarial learning-based domain adaptation framework for the biomedical machine reading comprehension task (BioADAPT-MRC), a neural network-based method to address the discrepancies in the marginal distributions between the general and biomedical domain datasets.

Domain Adaptation Machine Reading Comprehension +1

Paper
Code

Themis: A Network Bandwidth-Aware Collective Scheduling Policy for Distributed Training of DL Models

no code implementations • 9 Oct 2021 • Saeed Rashidi, William Won, Sudarshan Srinivasan, Srinivas Sridharan, Tushar Krishna

Distributed training is a solution to reduce DNN training time by splitting the task across multiple NPUs (e. g., GPU/TPU).

Scheduling

Paper
Add Code

Exploring Multi-dimensional Hierarchical Network Topologies for Efficient Distributed Training of Trillion Parameter DL Models

no code implementations • 24 Sep 2021 • William Won, Saeed Rashidi, Sudarshan Srinivasan, Tushar Krishna

High-performance distributed training platforms should leverage multi-dimensional hierarchical networks, which interconnect accelerators through different levels of the network, to dramatically reduce expensive NICs required for the scale-out network.

Paper
Add Code

The Sensitivity of Word Embeddings-based Author Detection Models to Semantic-preserving Adversarial Perturbations

no code implementations • 23 Feb 2021 • Jeremiah Duncan, Fabian Fallas, Chris Gropp, Emily Herron, Maria Mahbub, Paula Olaya, Eduardo Ponce, Tabitha K. Samuel, Daniel Schultz, Sudarshan Srinivasan, Maofeng Tang, Viktor Zenkov, Quan Zhou, Edmon Begoli

To this end, and using those established techniques, we first developed an experimental frame-work for author detection and input perturbations.

Authorship Attribution Misinformation +1

Paper
Add Code

Optimizing Deep Learning Recommender Systems' Training On CPU Cluster Architectures

2 code implementations • 10 May 2020 • Dhiraj Kalamkar, Evangelos Georganas, Sudarshan Srinivasan, Jianping Chen, Mikhail Shiryaev, Alexander Heinecke

During the last two years, the goal of many researchers has been to squeeze the last bit of performance out of HPC system for AI tasks.

Cloud Computing Recommendation Systems

802

Paper
Code

K-TanH: Efficient TanH For Deep Learning

no code implementations • 17 Sep 2019 • Abhisek Kundu, Alex Heinecke, Dhiraj Kalamkar, Sudarshan Srinivasan, Eric C. Qin, Naveen K. Mellempudi, Dipankar Das, Kunal Banerjee, Bharat Kaul, Pradeep Dubey

We propose K-TanH, a novel, highly accurate, hardware efficient approximation of popular activation function TanH for Deep Learning.

Translation

Paper
Add Code

High Performance Scalable FPGA Accelerator for Deep Neural Networks

no code implementations • 29 Aug 2019 • Sudarshan Srinivasan, Pradeep Janedula, Saurabh Dhoble, Sasikanth Avancha, Dipankar Das, Naveen Mellempudi, Bharat Daga, Martin Langhammer, Gregg Baeckler, Bharat Kaul

Low-precision is the first order knob for achieving higher Artificial Intelligence Operations (AI-TOPS).

Vocal Bursts Intensity Prediction

Paper
Add Code

Mixed Precision Training With 8-bit Floating Point

no code implementations • 29 May 2019 • Naveen Mellempudi, Sudarshan Srinivasan, Dipankar Das, Bharat Kaul

Reduced precision computation for deep neural networks is one of the key areas addressing the widening compute gap driven by an exponential growth in model size.

Quantization

Paper
Add Code

A Study of BFLOAT16 for Deep Learning Training

no code implementations • 29 May 2019 • Dhiraj Kalamkar, Dheevatsa Mudigere, Naveen Mellempudi, Dipankar Das, Kunal Banerjee, Sasikanth Avancha, Dharma Teja Vooturi, Nataraj Jammalamadaka, Jianyu Huang, Hector Yuen, Jiyan Yang, Jongsoo Park, Alexander Heinecke, Evangelos Georganas, Sudarshan Srinivasan, Abhisek Kundu, Misha Smelyanskiy, Bharat Kaul, Pradeep Dubey

In this paper, we discuss the flow of tensors and various key operations in mixed precision training, and delve into details of operations, such as the rounding modes for converting FP32 tensors to BFLOAT16.

Image Classification Language Modelling +3

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.