no code implementations • 2 Jan 2024 • Naveen Raman, Mateo Espinosa Zarlenga, Juyeon Heo, Mateja Jamnik
Deep learning models trained under this paradigm heavily rely on the assumption that neural networks can learn to predict the presence or absence of a given concept independently of other concepts.
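A common concept-based setup is a concept bottleneck model, where inputs are first mapped to predicted concepts and the label is predicted only from those concepts. The sketch below illustrates that structure under the assumption that this is the paradigm the excerpt refers to; the architecture, sizes, and names are illustrative, not the paper's model.

```python
# Rough sketch of a concept bottleneck model (CBM): inputs -> predicted
# concepts -> label predicted only from the concepts. All sizes and names
# are illustrative assumptions.
import torch
import torch.nn as nn

class ConceptBottleneckModel(nn.Module):
    def __init__(self, in_dim=64, n_concepts=8, n_classes=3):
        super().__init__()
        self.concept_net = nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU(),
                                         nn.Linear(32, n_concepts))
        self.label_net = nn.Linear(n_concepts, n_classes)

    def forward(self, x):
        concept_logits = self.concept_net(x)      # one logit per concept
        concepts = torch.sigmoid(concept_logits)  # predicted presence/absence
        return concept_logits, self.label_net(concepts)

model = ConceptBottleneckModel()
concept_logits, class_logits = model(torch.randn(4, 64))
```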
1 code implementation • 13 Dec 2023 • Vihari Piratla, Juyeon Heo, Katherine M. Collins, Sukriti Singh, Adrian Weller
We believe the improved quality of uncertainty-aware concept explanations makes them a strong candidate for more reliable model interpretation.
1 code implementation • 10 Nov 2023 • Weiyang Liu, Zeju Qiu, Yao Feng, Yuliang Xiu, Yuxuan Xue, Longhui Yu, Haiwen Feng, Zhen Liu, Juyeon Heo, Songyou Peng, Yandong Wen, Michael J. Black, Adrian Weller, Bernhard Schölkopf
We apply this parameterization to OFT, creating a novel parameter-efficient finetuning method, called Orthogonal Butterfly (BOFT).
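For intuition, here is a minimal sketch of the underlying orthogonal finetuning (OFT) idea: a frozen weight is rotated by a learned orthogonal matrix obtained via a Cayley transform. BOFT's contribution is to factorize that rotation into a product of sparse butterfly orthogonal factors, which is not implemented here; the class and variable names are illustrative, not the authors' code.

```python
# Minimal OFT-style sketch: rotate a frozen weight by a learned orthogonal
# matrix parameterized through the Cayley transform of a skew-symmetric
# generator. BOFT would replace the dense rotation with a product of sparse
# butterfly factors (not shown).
import torch
import torch.nn as nn

class OrthogonalFinetuneLinear(nn.Module):
    def __init__(self, frozen_linear: nn.Linear):
        super().__init__()
        self.frozen = frozen_linear
        for p in self.frozen.parameters():
            p.requires_grad_(False)
        d = frozen_linear.out_features
        # Skew-symmetric generator; R = (I - S)^{-1}(I + S) is orthogonal.
        self.skew = nn.Parameter(torch.zeros(d, d))

    def forward(self, x):
        S = self.skew - self.skew.t()          # enforce skew-symmetry
        I = torch.eye(S.size(0), device=S.device)
        R = torch.linalg.solve(I - S, I + S)   # Cayley transform -> orthogonal
        W = R @ self.frozen.weight             # rotate the frozen weight
        return nn.functional.linear(x, W, self.frozen.bias)

layer = OrthogonalFinetuneLinear(nn.Linear(16, 16))
out = layer(torch.randn(4, 16))                # initial rotation is identity
```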
no code implementations • 26 Jun 2023 • Wenlin Chen, Julien Horwood, Juyeon Heo, José Miguel Hernández-Lobato
This work extends the theory of identifiability in supervised learning by considering the consequences of having access to a distribution of tasks.
1 code implementation • NeurIPS 2023 • Juyeon Heo, Vihari Piratla, Matthew Wicker, Adrian Weller
Machine learning from explanations (MLX) is an approach to learning that uses human-provided explanations of relevant or irrelevant features for each input to ensure that model predictions are right for the right reasons.
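One common instantiation of this idea is a "right for the right reasons"-style regularizer that penalizes input gradients on features a human has marked irrelevant. The sketch below assumes that formulation and illustrative hyperparameters; it is not necessarily the exact objective used in the paper.

```python
# Hedged MLX-style objective: task loss plus a penalty on input gradients
# over features flagged as irrelevant (mask = 1). The mask, model, and
# lambda weight are illustrative assumptions.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
x = torch.randn(8, 10, requires_grad=True)
y = torch.randint(0, 2, (8,))
irrelevant_mask = torch.zeros(8, 10)
irrelevant_mask[:, 5:] = 1.0           # e.g., last 5 features marked irrelevant

logits = model(x)
task_loss = nn.functional.cross_entropy(logits, y)
grads = torch.autograd.grad(task_loss, x, create_graph=True)[0]
explanation_loss = (irrelevant_mask * grads).pow(2).sum()
loss = task_loss + 1.0 * explanation_loss   # lambda = 1.0 is a hyperparameter
loss.backward()
```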
1 code implementation • 16 Dec 2022 • Matthew Wicker, Juyeon Heo, Luca Costabello, Adrian Weller

Post-hoc explanation methods are used with the intent of providing insights about neural networks and are sometimes said to help engender trust in their outputs.
1 code implementation • 29 Nov 2022 • Sunghwan Joo, Seokhyeon Jeong, Juyeon Heo, Adrian Weller, Taesup Moon
However, the failure to account for the normalization of attributions, which is essential for their visualization, has been an obstacle to understanding and improving the robustness of feature attribution methods.
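As a hedged illustration of why normalization matters when comparing attributions, the snippet below rescales each attribution map (to unit L1 mass, an assumed choice) before measuring how much it changes under a small perturbation.

```python
# Normalize attribution maps before comparing them; raw magnitudes can make
# robustness comparisons misleading. The unit-L1 normalization is one
# illustrative choice, not the paper's specific scheme.
import torch

def normalize_attribution(attr: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    # Scale each map so its absolute values sum to one.
    totals = attr.abs().flatten(start_dim=1).sum(dim=1)
    return attr / (totals.view(-1, *([1] * (attr.dim() - 1))) + eps)

a = torch.randn(2, 3, 8, 8)               # e.g., saliency maps for two inputs
a_perturbed = a + 0.01 * torch.randn_like(a)
diff = (normalize_attribution(a) - normalize_attribution(a_perturbed)).abs().mean()
print(float(diff))
```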
3 code implementations • NeurIPS 2019 • Juyeon Heo, Sunghwan Joo, Taesup Moon
We ask whether neural network interpretation methods can be fooled via adversarial model manipulation, which is defined as a model fine-tuning step that aims to radically alter the explanations without hurting the accuracy of the original models, e.g., VGG19, ResNet50, and DenseNet121.
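A minimal sketch of the kind of fine-tuning objective this describes: the usual task loss plus a term that rewards shifting saliency mass into a chosen region, so explanations change while accuracy is preserved. The saliency measure, target region, model, and loss weight are illustrative assumptions, not the paper's exact setup.

```python
# Hedged sketch of adversarial model manipulation: fine-tune with task loss
# plus a "fooling" term that pushes gradient saliency toward a chosen region.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
x = torch.randn(16, 1, 28, 28, requires_grad=True)
y = torch.randint(0, 10, (16,))
target_region = torch.zeros_like(x)
target_region[..., :4, :] = 1.0        # e.g., steer saliency to the top rows

logits = model(x)
task_loss = nn.functional.cross_entropy(logits, y)
sal = torch.autograd.grad(logits.max(dim=1).values.sum(), x,
                          create_graph=True)[0].abs()
# Reward the fraction of saliency mass inside the target region.
fool_loss = -(sal * target_region).sum() / (sal.sum() + 1e-8)
(task_loss + 0.1 * fool_loss).backward()
opt.step()
```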