Search Results for author: Zheming Yang

Found 1 paper, 1 paper with code

DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers

1 code implementation · 15 Mar 2024 · Xuanlei Zhao, Shenggan Cheng, Zangwei Zheng, Zheming Yang, Ziming Liu, Yang You

Scaling large models with long sequences across applications like language generation, video generation and multimodal tasks requires efficient sequence parallelism.

Text Generation · Video Generation
