Search Results for author: Diana Marculescu

Found 43 papers, 17 papers with code

T-VSL: Text-Guided Visual Sound Source Localization in Mixtures

1 code implementation • 2 Apr 2024 • Tanvir Mahmud, Yapeng Tian, Diana Marculescu

Visual sound source localization poses a significant challenge in identifying the semantic region of each sounding source within a video.

Weakly-supervised Audio Separation via Bi-modal Semantic Similarity

1 code implementation • 2 Apr 2024 • Tanvir Mahmud, Saeed Amizadeh, Kazuhito Koishida, Diana Marculescu

Conditional sound separation in multi-source audio mixtures without having access to single source sound data during training is a long standing challenge.

Semantic Similarity · Semantic Textual Similarity

PaPr: Training-Free One-Step Patch Pruning with Lightweight ConvNets for Faster Inference

no code implementations • 24 Mar 2024 • Tanvir Mahmud, Burhaneddin Yaman, Chun-Hao Liu, Diana Marculescu

Using this insight, we introduce PaPr, a method for substantially pruning redundant patches with minimal accuracy loss using lightweight ConvNets across a variety of deep learning architectures, including ViTs, ConvNets, and hybrid transformers, without any re-training.
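
A minimal sketch of the patch-pruning idea, assuming a tiny ConvNet yields a coarse saliency map and the least-salient patch tokens are dropped before the expensive backbone; the scorer architecture, keep ratio, and names below are illustrative, not the authors' implementation:

```python
import torch
import torch.nn as nn

class PatchScorer(nn.Module):
    """Lightweight ConvNet that scores spatial regions (illustrative stand-in)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, stride=2, padding=1),
        )

    def forward(self, images):                       # (B, 3, H, W)
        return self.net(images)                      # (B, 1, H/4, W/4) coarse saliency

def keep_top_patches(patch_tokens, saliency, keep_ratio=0.5):
    """Drop the least-salient patch tokens; assumes a square patch grid."""
    B, N, D = patch_tokens.shape
    side = int(N ** 0.5)
    scores = nn.functional.adaptive_avg_pool2d(saliency, side).flatten(1)  # (B, N)
    k = max(1, int(N * keep_ratio))
    idx = scores.topk(k, dim=1).indices              # indices of patches to keep
    return patch_tokens.gather(1, idx.unsqueeze(-1).expand(B, k, D))

# Usage sketch: tokens = vit.patch_embed(images)
#               tokens = keep_top_patches(tokens, PatchScorer()(images))
```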

FLORA: Fine-grained Low-Rank Architecture Search for Vision Transformer

1 code implementation • 7 Nov 2023 • Chi-Chih Chang, Yuan-Yao Sung, Shixing Yu, Ning-Chi Huang, Diana Marculescu, Kai-Chiang Wu

With our method, a more fine-grained rank configuration can be generated automatically and yield up to 33% extra FLOPs reduction compared to a simple uniform configuration.
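
For intuition, a hedged sketch of the per-layer low-rank structure that such a fine-grained rank search configures; the layer names and rank values below are made up for illustration:

```python
import torch.nn as nn

def low_rank_linear(d_in, d_out, rank):
    """Replace a dense d_in x d_out projection with two thin ones of the given rank."""
    return nn.Sequential(nn.Linear(d_in, rank, bias=False), nn.Linear(rank, d_out))

# MACs per token: dense = d_in * d_out; low-rank = rank * (d_in + d_out).
# A fine-grained search assigns a different rank to each layer rather than one
# uniform rank for the whole model (values below are illustrative):
ranks = {"attn.proj": 256, "mlp.fc1": 384, "mlp.fc2": 320}
layer = low_rank_linear(768, 768, ranks["attn.proj"])
```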

Jumping through Local Minima: Quantization in the Loss Landscape of Vision Transformers

1 code implementation • ICCV 2023 • Natalia Frumkin, Dibakar Gope, Diana Marculescu

Evol-Q improves the top-1 accuracy of a fully quantized ViT-Base by $10.30\%$, $0.78\%$, and $0.15\%$ for $3$-bit, $4$-bit, and $8$-bit weight quantization levels.

Quantization
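
For context, a generic sketch of uniform symmetric weight quantization at the cited bit-widths; this is background for the numbers above, not Evol-Q's evolutionary search procedure:

```python
import torch

def quantize_weights(w, bits):
    """Uniform symmetric quantization to a given bit-width (generic sketch)."""
    qmax = 2 ** (bits - 1) - 1                   # 3, 7, 127 for 3-, 4-, 8-bit
    scale = w.abs().max() / qmax
    return torch.clamp((w / scale).round(), -qmax - 1, qmax) * scale

w = torch.randn(768, 768)
for bits in (3, 4, 8):
    err = (w - quantize_weights(w, bits)).abs().mean()
    print(f"{bits}-bit mean abs error: {err:.4f}")   # error shrinks as bits grow
```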

MobileTL: On-device Transfer Learning with Inverted Residual Blocks

no code implementations • 5 Dec 2022 • Hung-Yueh Chiang, Natalia Frumkin, Feng Liang, Diana Marculescu

MobileTL trains the shifts for internal normalization layers to avoid storing activation maps for the backward pass.

Transfer Learning
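
A minimal sketch of the described trick, assuming a frozen backbone where only the normalization-layer shifts (plus a fresh task head) are updated; illustrative only, not the authors' training recipe:

```python
import torch.nn as nn
import torchvision

model = torchvision.models.mobilenet_v2(weights="IMAGENET1K_V1")
for p in model.parameters():
    p.requires_grad = False                      # freeze the whole backbone

for m in model.modules():
    if isinstance(m, nn.BatchNorm2d):
        m.bias.requires_grad = True              # train only the additive shift;
                                                 # its gradient needs no stored
                                                 # activation maps, unlike the scale

model.classifier[1] = nn.Linear(1280, 10)        # new head for the target task
```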

CPT-V: A Contrastive Approach to Post-Training Quantization of Vision Transformers

no code implementations • 17 Nov 2022 • Natalia Frumkin, Dibakar Gope, Diana Marculescu

Borrowing the idea of contrastive loss from self-supervised learning, we find a robust way to jointly minimize a loss function using just 1,000 calibration images.

Quantization · Self-Supervised Learning
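
A hedged sketch of what such a contrastive calibration objective could look like, pairing each quantized output with its full-precision counterpart as the positive; the paper's exact loss may differ:

```python
import torch
import torch.nn.functional as F

def contrastive_calibration_loss(q_out, fp_out, temperature=0.1):
    """InfoNCE-style loss: pull each quantized output toward its full-precision
    twin, push it away from other calibration samples (illustrative sketch)."""
    q = F.normalize(q_out, dim=1)
    fp = F.normalize(fp_out, dim=1)
    logits = q @ fp.t() / temperature            # (B, B) similarity matrix
    targets = torch.arange(q.size(0), device=q.device)
    return F.cross_entropy(logits, targets)      # diagonal entries are positives
```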

AVE-CLIP: AudioCLIP-based Multi-window Temporal Transformer for Audio Visual Event Localization

no code implementations • 11 Oct 2022 • Tanvir Mahmud, Diana Marculescu

An audio-visual event (AVE) is denoted by the correspondence of the visual and auditory signals in a video segment.

audio-visual event localization

QUIDAM: A Framework for Quantization-Aware DNN Accelerator and Model Co-Exploration

no code implementations • 30 Jun 2022 • Ahmet Inci, Siri Garudanagiri Virupaksha, Aman Jain, Ting-Wu Chin, Venkata Vivek Thallam, Ruizhou Ding, Diana Marculescu

As the machine learning and systems communities strive to achieve higher energy-efficiency through custom deep neural network (DNN) accelerators, varied precision or quantization levels, and model compression techniques, there is a need for design space exploration frameworks that incorporate quantization-aware processing elements into the accelerator design space while having accurate and fast power, performance, and area models.

Model Compression · Quantization

Efficient Deep Learning Using Non-Volatile Memory Technology

no code implementations • 27 Jun 2022 • Ahmet Inci, Mehmet Meric Isgenc, Diana Marculescu

DeepNVM++ is demonstrated on STT-/SOT-MRAM technologies and can be used for the characterization, modeling, and analysis of any NVM technology for last-level caches in GPUs for DL applications.

Play It Cool: Dynamic Shifting Prevents Thermal Throttling

no code implementations • 22 Jun 2022 • Yang Zhou, Feng Liang, Ting-Wu Chin, Diana Marculescu

Machine learning (ML) has entered the mobile era where an enormous number of ML models are deployed on edge devices.

SupMAE: Supervised Masked Autoencoders Are Efficient Vision Learners

2 code implementations • 28 May 2022 • Feng Liang, Yangguang Li, Diana Marculescu

The proposed Supervised MAE (SupMAE) only exploits a visible subset of image patches for classification, unlike the standard supervised pre-training where all image patches are used.

Representation Learning · Transfer Learning
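
A minimal sketch of classifying from a visible subset of patch tokens, assuming MAE-style random masking; the visible ratio and the loss combination are illustrative:

```python
import torch

def random_visible_subset(patch_tokens, visible_ratio=0.25):
    """Keep a random subset of patch tokens, MAE-style (sketch)."""
    B, N, D = patch_tokens.shape
    k = max(1, int(N * visible_ratio))
    idx = torch.rand(B, N, device=patch_tokens.device).argsort(dim=1)[:, :k]
    return patch_tokens.gather(1, idx.unsqueeze(-1).expand(B, k, D))

# Supervised branch: encode only the visible tokens, pool, then classify.
# Per the SupMAE idea, training combines both objectives, e.g.:
#   total_loss = reconstruction_loss + classification_loss
```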

QADAM: Quantization-Aware DNN Accelerator Modeling for Pareto-Optimality

no code implementations • 20 May 2022 • Ahmet Inci, Siri Garudanagiri Virupaksha, Aman Jain, Venkata Vivek Thallam, Ruizhou Ding, Diana Marculescu

We also show that the proposed lightweight processing elements (LightPEs) consistently achieve Pareto-optimal results in terms of accuracy and hardware-efficiency.

Quantization

QAPPA: Quantization-Aware Power, Performance, and Area Modeling of DNN Accelerators

no code implementations • 17 May 2022 • Ahmet Inci, Siri Garudanagiri Virupaksha, Aman Jain, Venkata Vivek Thallam, Ruizhou Ding, Diana Marculescu

As the machine learning and systems community strives to achieve higher energy-efficiency through custom DNN accelerators and model compression techniques, there is a need for a design space exploration framework that incorporates quantization-aware processing elements into the accelerator design space while having accurate and fast power, performance, and area models.

Model Compression · Quantization

Width Transfer: On the (In)variance of Width Optimization

no code implementations • 24 Apr 2021 • Ting-Wu Chin, Diana Marculescu, Ari S. Morcos

In this work, we propose width transfer, a technique that harnesses the assumption that the optimized widths (or channel counts) are regular across sizes and depths.
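
A toy illustration of the transfer step under that regularity assumption: relative widths optimized on a cheap proxy are rescaled onto a larger target network (all numbers below are made up):

```python
proxy_widths = [24, 40, 96, 160]        # widths found on a small proxy setup
base_widths  = [32, 64, 128, 256]       # the proxy's original widths

ratios = [p / b for p, b in zip(proxy_widths, base_widths)]

full_base = [64, 128, 256, 512]         # larger target network's original widths
transferred = [max(8, int(r * w)) for r, w in zip(ratios, full_base)]
print(transferred)                      # [48, 80, 192, 320]
```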

DeepNVM++: Cross-Layer Modeling and Optimization Framework of Non-Volatile Memories for Deep Learning

no code implementations • 8 Dec 2020 • Ahmet Inci, Mehmet Meric Isgenc, Diana Marculescu

Under iso-area assumptions, STT-MRAM and SOT-MRAM provide up to 2x and 2.3x EDP reduction and accommodate 2.3x and 3.3x cache capacity when compared to SRAM, respectively.

The Architectural Implications of Distributed Reinforcement Learning on CPU-GPU Systems

no code implementations • 8 Dec 2020 • Ahmet Inci, Evgeny Bolotin, Yaosheng Fu, Gal Dalal, Shie Mannor, David Nellans, Diana Marculescu

With deep reinforcement learning (RL) methods achieving results that exceed human capabilities in games, robotics, and simulated environments, continued scaling of RL training is crucial to its deployment in solving complex real-world problems.

reinforcement-learning · Reinforcement Learning (RL)

PareCO: Pareto-aware Channel Optimization for Slimmable Neural Networks

no code implementations • 28 Sep 2020 • Rudy Chin, Ari S. Morcos, Diana Marculescu

Slimmable neural networks provide a flexible trade-off front between prediction error and computational cost (such as the number of floating-point operations or FLOPs) with the same storage cost as a single model.

One Weight Bitwidth to Rule Them All

no code implementations • 22 Aug 2020 • Ting-Wu Chin, Pierce I-Jen Chuang, Vikas Chandra, Diana Marculescu

Weight quantization for deep ConvNets has shown promising results for applications such as image classification and semantic segmentation and is especially important for applications where memory storage is limited.

Image Classification · Model Compression +2

Joslim: Joint Widths and Weights Optimization for Slimmable Neural Networks

2 code implementations • 23 Jul 2020 • Ting-Wu Chin, Ari S. Morcos, Diana Marculescu

In this work, we propose a general framework to enable joint optimization for both width configurations and weights of slimmable networks.

Renofeation: A Simple Transfer Learning Method for Improved Adversarial Robustness

1 code implementation • 7 Feb 2020 • Ting-Wu Chin, Cha Zhang, Diana Marculescu

Fine-tuning through knowledge transfer from a model pre-trained on a large-scale dataset is a widespread approach to effectively build models on small-scale datasets.

Adversarial Attack · Adversarial Robustness +1

On the Pareto Efficiency of Quantized CNN

no code implementations • 25 Sep 2019 • Ting-Wu Chin, Pierce I-Jen Chuang, Vikas Chandra, Diana Marculescu

Weight Quantization for deep convolutional neural networks (CNNs) has shown promising results in compressing and accelerating CNN-powered applications such as semantic segmentation, gesture recognition, and scene understanding.

Gesture Recognition · Quantization +2

Single-Path Mobile AutoML: Efficient ConvNet Design and NAS Hyperparameter Optimization

1 code implementation • 1 Jul 2019 • Dimitrios Stamoulis, Ruizhou Ding, Di Wang, Dimitrios Lymberopoulos, Bodhi Priyantha, Jie Liu, Diana Marculescu

In this work, we reduce the NAS search cost to less than 3 hours, while achieving state-of-the-art image classification results under mobile latency constraints.

Hyperparameter Optimization · Image Classification +1

ViP: Virtual Pooling for Accelerating CNN-based Image Classification and Object Detection

1 code implementation • 19 Jun 2019 • Zhuo Chen, Jiyuan Zhang, Ruizhou Ding, Diana Marculescu

In this paper, we propose Virtual Pooling (ViP), a model-level approach to improve speed and energy consumption of CNN-based image classification and object detection tasks, with a provable error bound.

General Classification · Image Classification +3

Single-Path NAS: Device-Aware Efficient ConvNet Design

no code implementations • 10 May 2019 • Dimitrios Stamoulis, Ruizhou Ding, Di Wang, Dimitrios Lymberopoulos, Bodhi Priyantha, Jie Liu, Diana Marculescu

Can we automatically design a Convolutional Network (ConvNet) with the highest image classification accuracy under the latency constraint of a mobile device?

General Classification · Image Classification +1

FLightNNs: Lightweight Quantized Deep Neural Networks for Fast and Accurate Inference

no code implementations • 5 Apr 2019 • Ruizhou Ding, Zeye Liu, Ting-Wu Chin, Diana Marculescu, R. D. Blanton

Over 46 FPGA-design experiments involving eight configurations and four data sets reveal that lightweight neural networks with a flexible $k$ value (dubbed FLightNNs) fully utilize the hardware resources on Field-Programmable Gate Arrays (FPGAs). Our experimental results show that FLightNNs achieve a 2$\times$ speedup compared to lightweight NNs with $k=2$, with only 0.1\% accuracy degradation.

Quantization

Single-Path NAS: Designing Hardware-Efficient ConvNets in less than 4 Hours

9 code implementations • 5 Apr 2019 • Dimitrios Stamoulis, Ruizhou Ding, Di Wang, Dimitrios Lymberopoulos, Bodhi Priyantha, Jie Liu, Diana Marculescu

Can we automatically design a Convolutional Network (ConvNet) with the highest image classification accuracy under the runtime constraint of a mobile device?

General Classification · Image Classification +1

Regularizing Activation Distribution for Training Binarized Deep Networks

1 code implementation • CVPR 2019 • Ruizhou Ding, Ting-Wu Chin, Zeye Liu, Diana Marculescu

Binarized Neural Networks (BNNs) can significantly reduce the inference latency and energy consumption in resource-constrained devices due to their pure-logical computation and fewer memory accesses.
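
For background, a standard sign-binarization building block with a straight-through gradient estimator; the paper's actual contribution, a regularizer shaping the pre-activation distribution, is not shown:

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """Binarize to {-1, +1} in the forward pass; pass gradients straight
    through on [-1, 1] in the backward pass (standard BNN sketch)."""
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return grad_out * (x.abs() <= 1).float()  # clip gradient outside [-1, 1]

# Usage: w_bin = BinarizeSTE.apply(w)
```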

AdaScale: Towards Real-time Video Object Detection Using Adaptive Scaling

no code implementations • 8 Feb 2019 • Ting-Wu Chin, Ruizhou Ding, Diana Marculescu

In vision-enabled autonomous systems such as robots and autonomous cars, video object detection plays a crucial role, and both its speed and accuracy are important factors to provide reliable operation.

object-detection · Video Object Detection

Understanding the Impact of Label Granularity on CNN-based Image Classification

1 code implementation • 21 Jan 2019 • Zhuo Chen, Ruizhou Ding, Ting-Wu Chin, Diana Marculescu

In this paper, we conduct extensive experiments using various datasets to demonstrate and analyze how and why training based on fine-grain labeling, such as "Persian cat", can improve CNN accuracy on classifying coarse-grain classes, in this case "cat."

General Classification · Image Classification
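
A toy sketch of the protocol this implies: train on fine-grain labels, then score coarse-grain accuracy by mapping each fine prediction to its parent class (the label hierarchy below is illustrative):

```python
fine_to_coarse = {"persian_cat": "cat", "siamese_cat": "cat",
                  "beagle": "dog", "husky": "dog"}

def coarse_accuracy(fine_preds, coarse_targets):
    """Map fine-grain predictions to coarse classes before scoring."""
    hits = sum(fine_to_coarse[p] == t for p, t in zip(fine_preds, coarse_targets))
    return hits / len(coarse_targets)

print(coarse_accuracy(["persian_cat", "beagle"], ["cat", "cat"]))  # 0.5
```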

Learning-based Application-Agnostic 3D NoC Design for Heterogeneous Manycore Systems

1 code implementation • 20 Oct 2018 • Biresh Kumar Joardar, Ryan Gary Kim, Janardhan Rao Doppa, Partha Pratim Pande, Diana Marculescu, Radu Marculescu

Our results show that these generalized 3D NoCs only incur a 1.8% (36-tile system) and 1.1% (64-tile system) average performance loss compared to application-specific NoCs.

Differentiable Training for Hardware Efficient LightNNs

no code implementations • NIPS Workshop CDNNRIA 2018 • Ruizhou Ding, Zeye Liu, Ting-Wu Chin, Diana Marculescu, R.D. (Shawn) Blanton

To reduce runtime and resource utilization of Deep Neural Networks (DNNs) on customized hardware, LightNN has been proposed, which constrains the weights of DNNs to be a sum of a limited number (denoted as $k\in\{1, 2\}$) of powers of 2.

Quantization
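
A hedged sketch of the weight constraint itself: greedily projecting each weight onto a sum of $k$ signed powers of two (the exponent range is illustrative, and the paper's differentiable training procedure is not shown):

```python
import torch

def nearest_sum_of_pow2(w, k=2, exp_range=(-4, 1)):
    """Greedily approximate each weight by a sum of k signed powers of two."""
    exps = torch.arange(exp_range[0], exp_range[1] + 1, dtype=torch.float32)
    levels = torch.cat([2.0 ** exps, -(2.0 ** exps), torch.zeros(1)])
    approx = torch.zeros_like(w)
    for _ in range(k):
        residual = (w - approx).unsqueeze(-1)            # (..., 1)
        idx = (residual - levels).abs().argmin(dim=-1)   # nearest level per weight
        approx = approx + levels[idx]
    return approx

print(nearest_sum_of_pow2(torch.tensor([0.7, -0.3])))    # tensor([ 0.7500, -0.3125])
```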

Layer-compensated Pruning for Resource-constrained Convolutional Neural Networks

1 code implementation • 1 Oct 2018 • Ting-Wu Chin, Cha Zhang, Diana Marculescu

Resource-efficient convolutional neural networks enable not only intelligence on edge devices but also opportunities in system-level optimization such as scheduling.

Meta-Learning · Scheduling

Hardware-Aware Machine Learning: Modeling and Optimization

no code implementations • 14 Sep 2018 • Diana Marculescu, Dimitrios Stamoulis, Ermao Cai

What is the latency or energy cost for an inference made by a Deep Neural Network (DNN)?

BIG-bench Machine Learning

Designing Adaptive Neural Networks for Energy-Constrained Image Classification

no code implementations • 5 Aug 2018 • Dimitrios Stamoulis, Ting-Wu Chin, Anand Krishnan Prakash, Haocheng Fang, Sribhuvan Sajja, Mitchell Bognar, Diana Marculescu

We cast the design of adaptive CNNs as a hyper-parameter optimization problem with respect to energy, accuracy, and communication constraints imposed by the mobile device.

Bayesian Optimization · Classification +2

Tractable Learning and Inference for Large-Scale Probabilistic Boolean Networks

no code implementations • 23 Jan 2018 • Ifigeneia Apostolopoulou, Diana Marculescu

Probabilistic Boolean Networks (PBNs) have been proposed to gain insights into complex dynamical systems.
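
A minimal sketch of PBN dynamics: at each step, every node samples one of its candidate Boolean functions according to its selection probability (the two-node network below is illustrative):

```python
import random

def pbn_step(state, node_rules):
    """One synchronous update of a Probabilistic Boolean Network (sketch)."""
    nxt = []
    for funcs in node_rules:                     # one weighted rule set per node
        f, _ = random.choices(funcs, weights=[p for _, p in funcs])[0]
        nxt.append(bool(f(state)))
    return tuple(nxt)

rules = [
    [(lambda s: s[1], 0.7), (lambda s: not s[1], 0.3)],  # node 0: copy or negate node 1
    [(lambda s: s[0] and s[1], 1.0)],                    # node 1: AND of both nodes
]
state = (True, False)
for _ in range(5):
    state = pbn_step(state, rules)
print(state)
```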

HyperPower: Power- and Memory-Constrained Hyper-Parameter Optimization for Neural Networks

no code implementations • 6 Dec 2017 • Dimitrios Stamoulis, Ermao Cai, Da-Cheng Juan, Diana Marculescu

While selecting the hyper-parameters of Neural Networks (NNs) has been so far treated as an art, the emergence of more complex, deeper architectures poses increasingly more challenges to designers and Machine Learning (ML) practitioners, especially when power and memory constraints need to be considered.

Bayesian Optimization

LightNN: Filling the Gap between Conventional Deep Neural Networks and Binarized Networks

no code implementations • 2 Dec 2017 • Ruizhou Ding, Zeye Liu, Rongye Shi, Diana Marculescu, R. D. Blanton

For a fixed DNN configuration, LightNNs have better accuracy at a slight energy increase than BNNs, yet are more energy efficient with only slightly less accuracy than conventional DNNs.

Machine Learning and Manycore Systems Design: A Serendipitous Symbiosis

no code implementations • 30 Nov 2017 • Ryan Gary Kim, Janardhan Rao Doppa, Partha Pratim Pande, Diana Marculescu, Radu Marculescu

Tight collaboration between experts of machine learning and manycore system design is necessary to create a data-driven manycore design framework that integrates both learning and expert knowledge.

BIG-bench Machine Learning

NeuralPower: Predict and Deploy Energy-Efficient Convolutional Neural Networks

2 code implementations • 15 Oct 2017 • Ermao Cai, Da-Cheng Juan, Dimitrios Stamoulis, Diana Marculescu

We also propose the "energy-precision ratio" (EPR) metric to guide machine learners in selecting an energy-efficient CNN architecture that better trades off the energy consumption and prediction accuracy.
