1 code implementation • ICCV 2023 • Yihua Zhang, Ruisi Cai, Tianlong Chen, Guanhua Zhang, huan zhang, Pin-Yu Chen, Shiyu Chang, Zhangyang Wang, Sijia Liu
Since the lack of robustness has become one of the main hurdles for CNNs, in this paper we ask: How to adversarially robustify a CNN-based MoE model?
1 code implementation • 24 Jun 2023 • Zhenyu Zhang, Ying Sheng, Tianyi Zhou, Tianlong Chen, Lianmin Zheng, Ruisi Cai, Zhao Song, Yuandong Tian, Christopher Ré, Clark Barrett, Zhangyang Wang, Beidi Chen
Based on these insights, we propose Heavy Hitter Oracle (H$_2$O), a KV cache eviction policy that dynamically retains a balance of recent and H$_2$ tokens.
1 code implementation • 24 Feb 2023 • Ruisi Cai, Zhenyu Zhang, Zhangyang Wang
Given a robust model trained to be resilient to one or multiple types of distribution shifts (e. g., natural image corruptions), how is that "robustness" encoded in the model weights, and how easily can it be disentangled and/or "zero-shot" transferred to some other models?