no code implementations • 4 Apr 2024 • Hanxian Huang, Jishen Zhao
WasmRev is pre-trained using self-supervised learning on a large-scale multi-modal corpus encompassing source code, code documentation, and compiled WebAssembly, without requiring labeled data.
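The paper's exact training objectives aren't given here; as a minimal sketch of label-free multi-modal pre-training in this spirit, the snippet below masks random tokens across a shared vocabulary of source, documentation, and Wasm tokens and trains a tiny transformer to recover them. All names, sizes, and hyperparameters are illustrative stand-ins, not WasmRev's.

```python
import torch
import torch.nn as nn

# Hypothetical toy setup: one shared vocabulary over source code,
# documentation, and Wasm tokens; the last vocab id is the [MASK] token.
VOCAB, MASK_ID, DIM = 1000, 999, 64

embed = nn.Embedding(VOCAB, DIM)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=DIM, nhead=4, batch_first=True),
    num_layers=2,
)
head = nn.Linear(DIM, VOCAB)  # predicts the original token at masked slots

opt = torch.optim.Adam(
    list(embed.parameters()) + list(encoder.parameters()) + list(head.parameters()),
    lr=1e-4,
)

tokens = torch.randint(0, VOCAB - 1, (8, 128))   # stand-in multi-modal batch
mask = torch.rand(tokens.shape) < 0.15           # mask ~15% of positions
inp = tokens.masked_fill(mask, MASK_ID)

logits = head(encoder(embed(inp)))               # (batch, seq, VOCAB)
# Self-supervised loss: recover the original tokens, masked slots only.
loss = nn.functional.cross_entropy(logits[mask], tokens[mask])
loss.backward()
opt.step()
```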
1 code implementation • 3 Apr 2024 • Zhongming Yu, Genghan Zhang, Hanxian Huang, Xin Chen, Jishen Zhao
Yet, efficient tensor-centric frameworks for GNNs remain scarce due to unique challenges and limitations encountered when implementing segment reduction in GNN contexts.
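Segment reduction here refers to aggregating per-edge values grouped by node, the core of GNN message passing. A minimal sketch of the operation itself (not the paper's framework or kernels) using a plain scatter-style sum:

```python
import torch

# Segment reduction as used in GNN message passing: sum each node's
# incoming edge messages, grouped by destination node id.
num_nodes, feat_dim = 4, 8
dst = torch.tensor([0, 0, 1, 3, 3, 3])          # destination node per edge
messages = torch.randn(dst.numel(), feat_dim)   # one message per edge

out = torch.zeros(num_nodes, feat_dim)
# index_add_ performs the segment sum: rows of `messages` sharing a
# dst index accumulate into the same output row.
out.index_add_(0, dst, messages)
```

The irregularity that makes this hard to optimize is visible even in the toy data: segment sizes vary per node (node 2 receives nothing, node 3 receives three messages), which defeats the uniform tiling that dense tensor frameworks assume.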
no code implementations • 5 Mar 2024 • Xin Chen, Hanxian Huang, Yanjun Gao, Yi Wang, Jishen Zhao, Ke Ding
Knowledge distillation, the technique of transferring knowledge from large, complex models to smaller ones, marks a pivotal step towards efficient AI deployment.
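As context for the sentence above, here is a minimal sketch of the general logit-based distillation recipe (soft targets from the teacher plus ordinary supervised loss); the paper's specific method is not reproduced, and the toy models are hypothetical:

```python
import torch
import torch.nn.functional as F

# Hypothetical toy models: a larger frozen teacher, a smaller student.
teacher = torch.nn.Sequential(torch.nn.Linear(16, 64), torch.nn.ReLU(),
                              torch.nn.Linear(64, 10)).eval()
student = torch.nn.Sequential(torch.nn.Linear(16, 10))
opt = torch.optim.SGD(student.parameters(), lr=0.1)

x, labels = torch.randn(32, 16), torch.randint(0, 10, (32,))
T = 2.0  # temperature softens the teacher's output distribution

with torch.no_grad():
    t_logits = teacher(x)
s_logits = student(x)

# Soft loss: match the teacher's tempered distribution (scaled by T^2,
# the standard gradient-magnitude correction); hard loss: cross-entropy
# on the true labels.
soft = F.kl_div(F.log_softmax(s_logits / T, dim=-1),
                F.softmax(t_logits / T, dim=-1),
                reduction="batchmean") * T * T
hard = F.cross_entropy(s_logits, labels)
(0.5 * soft + 0.5 * hard).backward()
opt.step()
```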
no code implementations • 8 Jan 2024 • Hanxian Huang, Tarique Siddiqui, Rana Alotaibi, Carlo Curino, Jyoti Leeka, Alekh Jindal, Jishen Zhao, Jesus Camacho-Rodriguez, Yuanyuan Tian
Drawing insights from real workloads, we propose template-based featurization techniques and develop a stacked-LSTM with an encoder-decoder architecture for accurate forecasting of query workloads.
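The paper's featurization and model configuration aren't shown here; as a minimal sketch of the stacked-LSTM encoder-decoder shape described above, the following encodes a workload history and decodes a forecast autoregressively. Feature sizes, layer counts, and the `Seq2SeqForecaster` class are illustrative assumptions.

```python
import torch
import torch.nn as nn

class Seq2SeqForecaster(nn.Module):
    def __init__(self, n_features=8, hidden=32, layers=2):
        super().__init__()
        # "Stacked" = multiple LSTM layers in both encoder and decoder.
        self.encoder = nn.LSTM(n_features, hidden, layers, batch_first=True)
        self.decoder = nn.LSTM(n_features, hidden, layers, batch_first=True)
        self.out = nn.Linear(hidden, n_features)

    def forward(self, history, horizon):
        # Encode the observed workload history into the LSTM state.
        _, state = self.encoder(history)
        step = history[:, -1:, :]        # seed decoding with last observation
        preds = []
        for _ in range(horizon):         # autoregressive decoding
            dec_out, state = self.decoder(step, state)
            step = self.out(dec_out)
            preds.append(step)
        return torch.cat(preds, dim=1)

model = Seq2SeqForecaster()
history = torch.randn(4, 24, 8)          # 4 workloads, 24 past intervals
forecast = model(history, horizon=6)     # predict the next 6 intervals
```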
no code implementations • ICCV 2023 • Cheng Fu, Hanxian Huang, Zixuan Jiang, Yun Ni, Lifeng Nai, Gang Wu, Liqun Cheng, Yanqi Zhou, Sheng Li, Andrew Li, Jishen Zhao
One promising way to accelerate transformer training is to reuse small pretrained models to initialize the transformer, as their existing representation power facilitates faster model convergence.
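One simple form of such reuse, shown below purely as an assumed illustration (the paper's actual initialization scheme is not given here), is copying a small pretrained layer's weights into the corresponding block of a wider layer and leaving the rest at their fresh random initialization:

```python
import torch
import torch.nn as nn

small = nn.Linear(64, 64)    # stands in for a pretrained small model's layer
large = nn.Linear(128, 128)  # the larger transformer's corresponding layer

with torch.no_grad():
    # Reuse the pretrained block; remaining entries keep their
    # default random init, so training starts from partial knowledge.
    large.weight[:64, :64].copy_(small.weight)
    large.bias[:64].copy_(small.bias)
```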