Mining limited data for more robust and generalized ML models

Over the last decade, the success of deep learning has largely come from big data and large models. As neural network architectures have matured, it is often more effective to improve the data while keeping the architecture fixed. Similarly, in robust machine learning, many defense methods built on deep models have been proposed to mitigate the threat of adversarial examples, but most of them pursue high-performance models under a fixed architecture and dataset. How to construct a universal and effective dataset for training robust models therefore remains an open problem. In this paper, we aim to obtain a more robust and efficient machine learning model by mining limited data. Specifically, we propose Robust TrivialAugment (RTA) and Iterative Search to construct a better training set, so that a model trained on the same amount of data achieves better performance. We applied the proposed methods in the AAAI 2022 Data-Centric Robust Learning on ML Models competition, organized by Alibaba on the Tianchi platform, and ranked in the top 10 out of 3,691 teams. Code is available at https://github.com/wujiekd/RTAIterative-Search-AAAI2022.
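The abstract does not detail how Robust TrivialAugment works; since the name points to TrivialAugment (one randomly chosen augmentation with a randomly sampled strength per image), the sketch below assumes RTA follows that one-op-per-image recipe. The function name `rta_augment` and the particular op list are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of a TrivialAugment-style augmentation step.
# Assumption: RTA applies one uniformly sampled op at a uniformly
# sampled strength; the op list and names here are hypothetical.
import random
from PIL import Image, ImageEnhance, ImageOps

# Candidate ops: each maps (image, strength in [0, 1]) -> augmented image.
AUG_OPS = [
    lambda img, s: img.rotate(30 * (2 * s - 1)),                   # rotate in [-30, 30] degrees
    lambda img, s: ImageEnhance.Brightness(img).enhance(0.5 + s),  # brightness factor in [0.5, 1.5]
    lambda img, s: ImageEnhance.Contrast(img).enhance(0.5 + s),    # contrast factor in [0.5, 1.5]
    lambda img, s: ImageOps.solarize(img, int(255 * (1 - s))),     # solarize with variable threshold
]

def rta_augment(img: Image.Image) -> Image.Image:
    """Apply one randomly chosen augmentation with a random strength."""
    op = random.choice(AUG_OPS)
    strength = random.random()
    return op(img, strength)

# Usage: augmented = rta_augment(Image.open("sample.png"))
```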
