Search Results for author: Hoang Anh Just

Found 6 papers, 5 papers with code

Get more for less: Principled Data Selection for Warming Up Fine-Tuning in LLMs

no code implementations • 5 May 2024 • Feiyang Kang, Hoang Anh Just, Yifan Sun, Himanshu Jahagirdar, Yuanzhi Zhang, Rongxing Du, Anit Kumar Sahu, Ruoxi Jia

The goal is to minimize the need for costly domain-specific data for subsequent fine-tuning while achieving desired performance levels.

Language Modelling

2D-Shapley: A Framework for Fragmented Data Valuation

1 code implementation • 18 Jun 2023 • Zhihong Liu, Hoang Anh Just, Xiangyu Chang, Xi Chen, Ruoxi Jia

Data valuation -- quantifying the contribution of individual data sources to certain predictive behaviors of a model -- is of great importance to enhancing the transparency of machine learning and designing incentive systems for data sharing.

counterfactual, Data Valuation, +1

LAVA: Data Valuation without Pre-Specified Learning Algorithms

1 code implementation • 28 Apr 2023 • Hoang Anh Just, Feiyang Kang, Jiachen T. Wang, Yi Zeng, Myeongseob Ko, Ming Jin, Ruoxi Jia

(1) We develop a proxy for the validation performance associated with a training set based on a non-conventional class-wise Wasserstein distance between training and validation sets.

Data Valuation

Narcissus: A Practical Clean-Label Backdoor Attack with Limited Information

2 code implementations • 11 Apr 2022 • Yi Zeng, Minzhou Pan, Hoang Anh Just, Lingjuan Lyu, Meikang Qiu, Ruoxi Jia

With poisoning equal to or less than 0.5% of the target-class data and 0.05% of the training set, we can train a model to classify test examples from arbitrary classes into the target class when the examples are patched with a backdoor trigger.

Backdoor Attack, Clean-label Backdoor Attack (0.024%), +1

Label-Only Model Inversion Attacks via Boundary Repulsion

1 code implementation • CVPR 2022 • Mostafa Kahla, Si Chen, Hoang Anh Just, Ruoxi Jia

In this paper, we introduce an algorithm, Boundary-Repelling Model Inversion (BREP-MI), to invert private training data using only the target model's predicted labels.

Face Recognition

ModelPred: A Framework for Predicting Trained Model from Training Data

1 code implementation • 24 Nov 2021 • Yingyan Zeng, Jiachen T. Wang, Si Chen, Hoang Anh Just, Ran Jin, Ruoxi Jia

In this work, we propose ModelPred, a framework that helps to understand the impact of changes in training data on a trained model.

Data Valuation, Memorization
