no code implementations • 3 Jun 2024 • Cheng Han, Yawen Lu, Guohao Sun, James C. Liang, Zhiwen Cao, Qifan Wang, Qiang Guan, Sohail A. Dianat, Raghuveer M. Rao, Tong Geng, Zhiqiang Tao, Dongfang Liu
In this work, we introduce the Prototypical Transformer (ProtoFormer), a general and unified framework that approaches various motion tasks from a prototype perspective.
no code implementations • 15 Mar 2024 • Yanfei Li, Juejing Liu, Xiaodong Zhao, Wenjun Liu, Tong Geng, Ang Li, Xin Zhang
Traditional analysis of highly distorted micro-X-ray diffraction (μ-XRD) patterns from hydrothermal fluid environments is a time-consuming process, often requiring substantial data preprocessing and labeled experimental data.
no code implementations • 8 Nov 2023 • Hongwu Peng, Caiwen Ding, Tong Geng, Sutanay Choudhury, Kevin Barker, Ang Li
The relentless advancement of artificial intelligence (AI) and machine learning (ML) applications necessitates the development of specialized hardware accelerators capable of handling the increasing complexity and computational demands.
1 code implementation • 22 Sep 2023 • James C. Liang, Yiming Cui, Qifan Wang, Tong Geng, Wenguan Wang, Dongfang Liu
This paper presents CLUSTERFORMER, a universal vision model that is based on the CLUSTERing paradigm with TransFORMER.
1 code implementation • 22 Aug 2023 • Xi Xie, Hongwu Peng, Amit Hasan, Shaoyi Huang, Jiahui Zhao, Haowen Fang, Wei Zhang, Tong Geng, Omer Khan, Caiwen Ding
Utilizing these principles, we formulated a kernel for sparse matrix multiplication (SpMM) in GCNs that employs block-level partitioning and a combined warp strategy.
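The core operation this kernel accelerates can be illustrated with a minimal sketch: a sparse-dense matrix multiply over a CSR-format adjacency matrix, processed in fixed-size row blocks. This is a hedged, CPU-side illustration of the partitioning idea only; the paper's actual kernel maps such blocks onto GPU warps, and the `block_rows` parameter and function name here are assumptions for demonstration.

```python
import numpy as np

def spmm_blocked(indptr, indices, data, X, block_rows=2):
    # SpMM over a CSR sparse matrix, visiting rows in fixed-size
    # blocks (a stand-in for block-level work partitioning).
    n_rows = len(indptr) - 1
    Y = np.zeros((n_rows, X.shape[1]))
    for start in range(0, n_rows, block_rows):          # one row block
        for r in range(start, min(start + block_rows, n_rows)):
            for k in range(indptr[r], indptr[r + 1]):   # nonzeros of row r
                Y[r] += data[k] * X[indices[k]]
    return Y

# 3x3 adjacency in CSR form: row 0 -> {1}, row 1 -> {0, 2}, row 2 -> {1}
indptr  = np.array([0, 1, 3, 4])
indices = np.array([1, 0, 2, 1])
data    = np.array([1.0, 1.0, 1.0, 1.0])
X = np.arange(6.0).reshape(3, 2)   # dense node features
Y = spmm_blocked(indptr, indices, data, X)
```

In a GCN layer this product aggregates neighbor features before the dense weight multiplication; the blocking determines how work is divided among parallel workers.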
no code implementations • CVPR 2023 • Yawen Lu, Qifan Wang, Siqi Ma, Tong Geng, Yingjie Victor Chen, Huaijin Chen, Dongfang Liu
Optical flow is an indispensable building block for various important computer vision tasks, including motion estimation, object tracking, and disparity measurement.
no code implementations • 20 Mar 2023 • Xiaodong Zhao, YiXuan Luo, Juejing Liu, Wenjun Liu, Kevin M. Rosso, Xiaofeng Guo, Tong Geng, Ang Li, Xin Zhang
This study highlighted the importance of labeled experimental patterns in training DNN models to solve μ-XRD mapping data from in-situ experiments involving a liquid phase.
no code implementations • 5 Feb 2023 • Hongwu Peng, Shanglin Zhou, Yukui Luo, Nuo Xu, Shijin Duan, Ran Ran, Jiahui Zhao, Shaoyi Huang, Xi Xie, Chenghong Wang, Tong Geng, Wujie Wen, Xiaolin Xu, Caiwen Ding
The proliferation of deep learning (DL) has led to the emergence of privacy and security concerns.
1 code implementation • 8 Oct 2022 • Deniz Gurevin, Mohsin Shan, Tong Geng, Weiwen Jiang, Caiwen Ding, Omer Khan
Prior work operates on pre-collected temporal graph data and is not designed to handle updates on a graph in real-time.
no code implementations • 20 Sep 2022 • Hongwu Peng, Shanglin Zhou, Yukui Luo, Shijin Duan, Nuo Xu, Ran Ran, Shaoyi Huang, Chenghong Wang, Tong Geng, Ang Li, Wujie Wen, Xiaolin Xu, Caiwen Ding
The rapid growth and deployment of deep learning (DL) has been accompanied by emerging privacy and security concerns.
1 code implementation • 14 Sep 2022 • Yuke Wang, Boyuan Feng, Zheng Wang, Tong Geng, Kevin Barker, Ang Li, Yufei Ding
For irregularly sparse and fine-grained GNN workloads, such solutions miss the opportunity to jointly schedule/optimize the computation and communication operations for high-performance delivery.
1 code implementation • 11 Sep 2022 • Hongwu Peng, Deniz Gurevin, Shaoyi Huang, Tong Geng, Weiwen Jiang, Omer Khan, Caiwen Ding
In this paper, we utilize two state-of-the-art model compression methods, (1) train-and-prune and (2) sparse training, for the sparsification of weight layers in GNNs.
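The train-and-prune family of methods typically interleaves training with magnitude-based weight pruning. The sketch below shows one generic pruning step under that assumption; the threshold rule, the `sparsity` parameter, and the function name are illustrative choices, not the paper's exact schedule.

```python
import numpy as np

def magnitude_prune(W, sparsity):
    # Zero out the smallest-magnitude fraction of weights and return
    # both the pruned matrix and the binary mask that a train-and-prune
    # loop would reapply after each subsequent gradient update.
    k = int(sparsity * W.size)
    if k == 0:
        return W.copy(), np.ones(W.shape, dtype=bool)
    thresh = np.sort(np.abs(W).ravel())[k - 1]
    mask = np.abs(W) > thresh
    return W * mask, mask

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))
W_pruned, mask = magnitude_prune(W, 0.5)   # target 50% sparsity
```

Sparse training differs in that the mask is maintained (and periodically regrown) from the start of training rather than derived from a dense pretrained model.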
no code implementations • 7 Aug 2022 • Hongwu Peng, Shaoyi Huang, Shiyang Chen, Bingbing Li, Tong Geng, Ang Li, Weiwen Jiang, Wujie Wen, Jinbo Bi, Hang Liu, Caiwen Ding
Particularly, we develop a hardware-friendly sparse attention operator and a length-aware hardware resource scheduling algorithm.
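One common hardware-friendly sparse attention structure is a block-diagonal pattern, where each query attends only within its own block. The sketch below uses that pattern purely as an illustration; the paper's actual operator and sparsity pattern may differ, and the `block` parameter is an assumption.

```python
import numpy as np

def block_sparse_attention(Q, K, V, block=2):
    # Block-diagonal attention: softmax(Q K^T / sqrt(d)) V computed
    # independently per block, so compute and memory scale with the
    # block size instead of the full sequence length.
    n, d = Q.shape
    out = np.zeros_like(V)
    for s in range(0, n, block):
        e = min(s + block, n)
        scores = Q[s:e] @ K[s:e].T / np.sqrt(d)
        scores -= scores.max(axis=1, keepdims=True)   # stable softmax
        w = np.exp(scores)
        w /= w.sum(axis=1, keepdims=True)
        out[s:e] = w @ V[s:e]
    return out

rng = np.random.default_rng(2)
Q = rng.normal(size=(4, 3))
K = rng.normal(size=(4, 3))
V = rng.normal(size=(4, 3))
O = block_sparse_attention(Q, K, V, block=2)
```

Because each block is a fixed-size dense tile, such patterns map well onto accelerator hardware, which is the kind of friendliness the excerpt refers to.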
no code implementations • 28 Jun 2022 • Chengming Zhang, Tong Geng, Anqi Guo, Jiannan Tian, Martin Herbordt, Ang Li, Dingwen Tao
Graph Neural Networks (GNNs) have drawn tremendous attention due to their unique capability to extend Machine Learning (ML) approaches to applications broadly defined as having unstructured data, especially graphs.
1 code implementation • 5 Jun 2022 • Yanfei Li, Tong Geng, Samuel Stein, Ang Li, Huimin Yu
To close the accuracy gap, in this paper we propose to add a complementary activation function (AF) ahead of the sign-based binarization, and rely on the genetic algorithm (GA) to automatically search for the ideal AFs.
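The forward path of this idea can be sketched as an elementwise activation applied before the sign function. The choice of `tanh` with a shape parameter `alpha` is purely illustrative here; the paper searches for the AF with a genetic algorithm rather than fixing one, and training would additionally use a straight-through estimator for gradients, which is omitted in this forward-only sketch.

```python
import numpy as np

def binarize_with_af(x, alpha=1.0):
    # Apply a complementary activation (illustrative choice: tanh)
    # to reshape pre-activations, then binarize with sign.
    shaped = np.tanh(alpha * x)                 # complementary AF
    return np.where(shaped >= 0, 1.0, -1.0)    # sign binarization

x = np.array([-0.3, 0.0, 0.7])
b = binarize_with_af(x)
```

The activation does not change the sign of its input in this particular choice; in general the searched AF reshapes the distribution seen by the binarizer so that less information is destroyed by the sign step.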
no code implementations • 7 Mar 2022 • Tong Geng, Chunshu Wu, Yongan Zhang, Cheng Tan, Chenhao Xie, Haoran You, Martin C. Herbordt, Yingyan Lin, Ang Li
In this paper we propose a novel hardware accelerator for GCN inference, called I-GCN, that significantly improves data locality and reduces unnecessary computation.
1 code implementation • 22 Dec 2021 • Haoran You, Tong Geng, Yongan Zhang, Ang Li, Yingyan Lin
This is because real-world graphs can be extremely large and sparse.
no code implementations • 18 Sep 2021 • Yongan Zhang, Haoran You, Yonggan Fu, Tong Geng, Ang Li, Yingyan Lin
While end-to-end jointly optimizing GNNs and their accelerators is promising in boosting GNNs' inference efficiency and expediting the design process, it is still underexplored due to the vast and distinct design spaces of GNNs and their accelerators.
no code implementations • 10 Aug 2021 • Hongwu Peng, Shanglin Zhou, Scott Weitze, Jiaxin Li, Sahidul Islam, Tong Geng, Ang Li, Wei Zhang, Minghu Song, Mimi Xie, Hang Liu, Caiwen Ding
Deep complex networks (DCN), in contrast, can learn from complex data, but have high computational costs; therefore, they cannot satisfy the instant decision-making requirements of many deployable systems dealing with short observations or short signal bursts.
1 code implementation • 23 Jun 2021 • Boyuan Feng, Yuke Wang, Tong Geng, Ang Li, Yufei Ding
Over the years, accelerating neural networks with quantization has been widely studied.
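The basic mechanism behind such quantization work is mapping floating-point tensors onto a low-bit integer grid. Below is a minimal symmetric uniform quantizer as a generic illustration; the paper's scheme for GNNs is more involved, and the scale rule and function name here are assumptions.

```python
import numpy as np

def quantize_uniform(x, bits=8):
    # Symmetric uniform quantization: scale by max magnitude,
    # round to signed integers, then dequantize back to floats.
    qmax = 2 ** (bits - 1) - 1
    max_abs = np.abs(x).max()
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale, scale

x = np.array([-1.0, 0.25, 0.5, 1.0])
xq, scale = quantize_uniform(x, bits=8)
```

The appeal for accelerators is that the integer representation `q` can feed low-precision arithmetic units, with the per-tensor `scale` folded back in once per output.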
no code implementations • 28 Mar 2021 • Yanfei Li, Tong Geng, Ang Li, Huimin Yu
Motivated by complex neural networks, in this paper we introduce complex representations into BNNs and propose the binary complex neural network, a novel network design that processes binary complex inputs and weights through complex convolution while still harvesting the extraordinary computational efficiency of BNNs.
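The reason binary complex arithmetic keeps BNN efficiency is that a complex product expands into real products, each of which can reuse fast binary (XNOR/popcount) kernels when operands are in {-1, +1}. The sketch below shows this expansion for a linear layer; it is an illustration of the algebra under that assumption, not the paper's kernel.

```python
import numpy as np

def complex_binary_matmul(Xr, Xi, Wr, Wi):
    # (Xr + i*Xi)(Wr + i*Wi) expands into four real matmuls; with
    # all operands binarized to {-1, +1}, each matmul is eligible
    # for XNOR/popcount acceleration on real hardware.
    Yr = Xr @ Wr - Xi @ Wi
    Yi = Xr @ Wi + Xi @ Wr
    return Yr, Yi

sign = lambda a: np.where(a >= 0, 1.0, -1.0)
rng = np.random.default_rng(1)
Xr, Xi = sign(rng.normal(size=(2, 3))), sign(rng.normal(size=(2, 3)))
Wr, Wi = sign(rng.normal(size=(3, 4))), sign(rng.normal(size=(3, 4)))
Yr, Yi = complex_binary_matmul(Xr, Xi, Wr, Wi)
```

The same expansion applies to convolution, since convolution is linear in both operands.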
no code implementations • 16 Sep 2020 • Tong Geng, Xiliang Lin, Harikesh S. Nair, Jun Hao, Bin Xiang, Shurui Fan
Second, by adapting experimental design to information acquired during the test, it reduces substantially the cost of experimentation to the advertiser.
1 code implementation • 23 Aug 2019 • Tong Geng, Ang Li, Runbin Shi, Chunshu Wu, Tianqi Wang, Yanfei Li, Pouya Haghi, Antonino Tumeo, Shuai Che, Steve Reinhardt, Martin Herbordt
Deep learning systems have been successfully applied to Euclidean data such as images, video, and audio.
no code implementations • 4 Jul 2019 • Tong Geng, Xiliang Lin, Harikesh S. Nair
The product is currently deployed on the advertising platform of JD.com, an eCommerce company and a publisher of digital ads in China.
no code implementations • 4 Jan 2019 • Tong Geng, Tianqi Wang, Ang Li, Xi Jin, Martin Herbordt
Among the issues with this approach is that to make the distributed cluster work with high utilization, the workload distributed to each node must be large, which implies nontrivial growth in the SGD mini-batch size.
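The arithmetic behind this pressure is simple: in data-parallel SGD the effective (global) mini-batch grows linearly with the node count. The sketch below makes that explicit, together with the common linear learning-rate scaling heuristic (in the style of Goyal et al.) that large batches typically require; the heuristic is a general assumption here, not something the paper prescribes.

```python
def global_batch(per_node_batch, num_nodes):
    # In synchronous data-parallel SGD, each node contributes its
    # local mini-batch to one global gradient step.
    return per_node_batch * num_nodes

def scaled_lr(base_lr, base_batch, batch):
    # Linear-scaling heuristic: grow the learning rate in proportion
    # to the batch size (illustrative, with known accuracy limits).
    return base_lr * batch / base_batch

# 16 nodes at 32 samples each already yields a 512-sample global batch.
gb = global_batch(32, 16)
lr = scaled_lr(0.1, 256, gb)
```

Keeping each node's workload large enough for high utilization therefore pushes the global batch into the regime where convergence quality degrades, which is the issue the excerpt describes.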