Search Results for author: Yuhao Chen

Found 37 papers, 4 papers with code

AnimateZoo: Zero-shot Video Generation of Cross-Species Animation via Subject Alignment

no code implementations • 7 Apr 2024 • Yuanfeng Xu, Yuhao Chen, Zhongzhan Huang, Zijian He, Guangrun Wang, Philip Torr, Liang Lin

In this paper, we present AnimateZoo, a zero-shot diffusion-based video generator to address this challenging cross-species animation issue, aiming to accurately produce animal animations while preserving the background.

Video Editing Video Generation

Domain-Guided Masked Autoencoders for Unique Player Identification

no code implementations • 17 Mar 2024 • Bavesh Balaji, Jerrin Bright, Sirisha Rambhatla, Yuhao Chen, Alexander Wong, John Zelek, David A Clausi

We further introduce a new spatio-temporal network leveraging our novel d-MAE for unique player identification.

Sports Analytics

Distribution and Depth-Aware Transformers for 3D Human Mesh Recovery

no code implementations • 14 Mar 2024 • Jerrin Bright, Bavesh Balaji, Harish Prakash, Yuhao Chen, David A Clausi, John Zelek

Precise Human Mesh Recovery (HMR) with in-the-wild data is a formidable challenge and is often hindered by depth ambiguities and reduced precision.

Human Mesh Recovery

Robust Analysis of Multi-Task Learning Efficiency: New Benchmarks on Light-Weighed Backbones and Effective Measurement of Multi-Task Learning Challenges by Feature Disentanglement

no code implementations • 5 Feb 2024 • Dayou Mao, Yuhao Chen, Yifan Wu, Maximilian Gilles, Alexander Wong

One of the main motivations of MTL is to develop neural networks capable of inferring multiple tasks simultaneously.

Disentanglement Multi-Task Learning +1

Assessing the Impact of Prompting Methods on ChatGPT's Mathematical Capabilities

no code implementations • 22 Dec 2023 • Yuhao Chen, Chloe Wong, Hanwen Yang, Juan Aguenza, Sai Bhujangari, Benthan Vu, Xun Lei, Amisha Prasad, Manny Fluss, Eric Phuong, Minghao Liu, Raja Kumar, Vanshika Vats, James Davis

This study critically evaluates the efficacy of prompting methods in enhancing the mathematical reasoning capability of large language models (LLMs).

Chatbot GSM8K +4

NutritionVerse-Synth: An Open Access Synthetically Generated 2D Food Scene Dataset for Dietary Intake Estimation

no code implementations • 11 Dec 2023 • Saeejith Nair, Chi-en Amy Tai, Yuhao Chen, Alexander Wong

As the largest open-source synthetic food dataset, NV-Synth highlights the value of physics-based simulations for enabling scalable and controllable generation of diverse photorealistic meal images to overcome data limitations and drive advancements in automated dietary assessment using computer vision.

FoodFusion: A Latent Diffusion Model for Realistic Food Image Generation

no code implementations • 6 Dec 2023 • Olivia Markham, Yuhao Chen, Chi-en Amy Tai, Alexander Wong

To address these limitations, we introduce FoodFusion, a Latent Diffusion model engineered specifically for the faithful synthesis of realistic food images from textual descriptions.

Image Generation

Cancer-Net PCa-Gen: Synthesis of Realistic Prostate Diffusion Weighted Imaging Data via Anatomic-Conditional Controlled Latent Diffusion

no code implementations • 30 Nov 2023 • Aditya Sridhar, Chi-en Amy Tai, Hayden Gunraj, Yuhao Chen, Alexander Wong

In Canada, prostate cancer is the most common form of cancer in men and accounted for 20% of new cancer cases for this demographic in 2022.

HiDiffusion: Unlocking High-Resolution Creativity and Efficiency in Low-Resolution Trained Diffusion Models

no code implementations • 29 Nov 2023 • Shen Zhang, Zhaowei Chen, Zhenyu Zhao, Zhenyuan Chen, Yao Tang, Yuhao Chen, Wengang Cao, Jiajun Liang

We introduce HiDiffusion, a tuning-free framework comprised of Resolution-Aware U-Net (RAU-Net) and Modified Shifted Window Multi-head Self-Attention (MSW-MSA) to enable pretrained large text-to-image diffusion models to efficiently generate high-resolution images (e.g., 1024$\times$1024) that surpass the training image resolution.

Attribute Image Generation

Confidant: Customizing Transformer-based LLMs via Collaborative Edge Training

no code implementations • 22 Nov 2023 • Yuhao Chen, Yuxuan Yan, Qianqian Yang, Yuanchao Shu, Shibo He, Jiming Chen

Transformer-based large language models (LLMs) have demonstrated impressive capabilities in a variety of natural language processing (NLP) tasks.

NutritionVerse-Real: An Open Access Manually Collected 2D Food Scene Dataset for Dietary Intake Estimation

no code implementations • 20 Nov 2023 • Chi-en Amy Tai, Saeejith Nair, Olivia Markham, Matthew Keller, Yifan Wu, Yuhao Chen, Alexander Wong

Dietary intake estimation plays a crucial role in understanding the nutritional habits of individuals and populations, aiding in the prevention and management of diet-related health issues.

Management

AccEPT: An Acceleration Scheme for Speeding Up Edge Pipeline-parallel Training

no code implementations • 10 Nov 2023 • Yuhao Chen, Yuxuan Yan, Qianqian Yang, Yuanchao Shu, Shibo He, Zhiguo Shi, Jiming Chen

Moreover, we propose a bit-level computation-efficient data compression scheme to compress the data to be transmitted between devices during training.

Data Compression

NAS-NeRF: Generative Neural Architecture Search for Neural Radiance Fields

no code implementations • 25 Sep 2023 • Saeejith Nair, Yuhao Chen, Mohammad Javad Shafiee, Alexander Wong

Thus, there is a need to dynamically optimize the neural network component of NeRFs to achieve a balance between computational complexity and specific targets for synthesis quality.

Neural Architecture Search Novel View Synthesis +1

NutritionVerse: Empirical Study of Various Dietary Intake Estimation Approaches

no code implementations • 14 Sep 2023 • Chi-en Amy Tai, Matthew Keller, Saeejith Nair, Yuhao Chen, Yifan Wu, Olivia Markham, Krish Parmar, Pengcheng Xi, Heather Keller, Sharon Kirkpatrick, Alexander Wong

Recent work has focused on using computer vision and machine learning to automatically estimate dietary intake from food images, but the lack of comprehensive datasets with diverse viewpoints, modalities and food annotations hinders the accuracy and realism of such methods.

Jersey Number Recognition using Keyframe Identification from Low-Resolution Broadcast Videos

no code implementations • 12 Sep 2023 • Bavesh Balaji, Jerrin Bright, Harish Prakash, Yuhao Chen, David A Clausi, John Zelek

To address these issues, we propose a robust keyframe identification module that extracts frames containing essential high-level information about the jersey number.

Mitigating Motion Blur for Robust 3D Baseball Player Pose Modeling for Pitch Analysis

no code implementations • 2 Sep 2023 • Jerrin Bright, Yuhao Chen, John Zelek

The findings highlight the effectiveness of our method in mitigating the challenges posed by motion blur, thereby enhancing the overall quality of pose estimation.

3D Pose Estimation Data Augmentation +1

The Model Inversion Eavesdropping Attack in Semantic Communication Systems

no code implementations • 8 Aug 2023 • Yuhao Chen, Qianqian Yang, Zhiguo Shi, Jiming Chen

In recent years, semantic communication has been a popular research topic for its superiority in communication efficiency.

Transferring Knowledge for Food Image Segmentation using Transformers and Convolutions

no code implementations • 15 Jun 2023 • Grant Sinha, Krish Parmar, Hilda Azimi, Amy Tai, Yuhao Chen, Alexander Wong, Pengcheng Xi

To address these issues, two models are trained and compared, one based on convolutional neural networks and the other on Bidirectional Encoder representation for Image Transformers (BEiT).

Image Segmentation Segmentation +1

Deep Joint Source-Channel Coding for Wireless Image Transmission with Entropy-Aware Adaptive Rate Control

no code implementations • 5 Jun 2023 • Weixuan Chen, Yuhao Chen, Qianqian Yang, Chongwen Huang, Qian Wang, Zhaoyang Zhang

Adaptive rate control for deep joint source and channel coding (JSCC) is considered an effective approach to transmit sufficient information in scenarios with limited communication resources.

Fast GraspNeXt: A Fast Self-Attention Neural Network Architecture for Multi-task Learning in Computer Vision Tasks for Robotic Grasping on the Edge

no code implementations • 21 Apr 2023 • Alexander Wong, Yifan Wu, Saad Abbasi, Saeejith Nair, Yuhao Chen, Mohammad Javad Shafiee

As such, the design of highly efficient multi-task deep neural network architectures tailored for computer vision tasks for robotic grasping on the edge is highly desired for widespread adoption in manufacturing environments.

Multi-Task Learning Robotic Grasping

NutritionVerse-3D: A 3D Food Model Dataset for Nutritional Intake Estimation

no code implementations • 12 Apr 2023 • Chi-en Amy Tai, Matthew Keller, Mattie Kerrigan, Yuhao Chen, Saeejith Nair, Pengcheng Xi, Alexander Wong

Unlike existing datasets, a collection of 3D models with nutritional information allows for view synthesis to create an infinite number of 2D images for any given viewpoint/camera angle along with the associated nutritional information.

Nutrition

NutritionVerse-Thin: An Optimized Strategy for Enabling Improved Rendering of 3D Thin Food Models

no code implementations • 12 Apr 2023 • Chi-en Amy Tai, Jason Li, Sriram Kumar, Saeejith Nair, Yuhao Chen, Pengcheng Xi, Alexander Wong

With the growth in capabilities of generative models, there has been growing interest in using photo-realistic renders of common 3D food items to improve downstream tasks such as food printing, nutrition prediction, or management of food wastage.

Management Nutrition

ShapeShift: Superquadric-based Object Pose Estimation for Robotic Grasping

no code implementations • 10 Apr 2023 • E. Zhixuan Zeng, Yuhao Chen, Alexander Wong

To address these challenges, this paper proposes ShapeShift, a superquadric-based framework for object pose estimation that predicts the object's pose relative to a primitive shape which is fitted to the object.

Object Pose Estimation +1

Boosting Semi-Supervised Learning by Exploiting All Unlabeled Data

1 code implementation • CVPR 2023 • Yuhao Chen, Xin Tan, Borui Zhao, Zhaowei Chen, RenJie Song, Jiajun Liang, Xuequan Lu

ANL introduces the additional negative pseudo-label for all unlabeled data to leverage low-confidence examples.

Pseudo Label

MMRNet: Improving Reliability for Multimodal Object Detection and Segmentation for Bin Picking via Multimodal Redundancy

no code implementations • 19 Oct 2022 • Yuhao Chen, Hayden Gunraj, E. Zhixuan Zeng, Robbie Meyer, Maximilian Gilles, Alexander Wong

We also demonstrate that our MC score is a more reliable indicator for outputs during inference time compared to the model-generated confidence scores that are often over-confident.

Ensemble Learning object-detection +1

MetaGraspNet: A Large-Scale Benchmark Dataset for Scene-Aware Ambidextrous Bin Picking via Physics-based Metaverse Synthesis

no code implementations • 8 Aug 2022 • Maximilian Gilles, Yuhao Chen, Tim Robin Winter, E. Zhixuan Zeng, Alexander Wong

Autonomous bin picking poses significant challenges to vision-driven robotic systems given the complexity of the problem, ranging from various sensor modalities, to highly entangled object layouts, to diverse item properties and gripper types.

Keypoint Detection Object +2

Demo: low-power communications based on RIS and AI for 6G

no code implementations • 21 May 2022 • Mingyao Cui, Zidong Wu, Yuhao Chen, Shenheng Xu, Fan Yang, Linglong Dai

By jointly designing the hardware and software, this prototype can realize real-time 4K video transmission with much reduced power consumption.

MetaGraspNet_v0: A Large-Scale Benchmark Dataset for Vision-driven Robotic Grasping via Physics-based Metaverse Synthesis

1 code implementation • 29 Dec 2021 • Yuhao Chen, E. Zhixuan Zeng, Maximilian Gilles, Alexander Wong

We also propose a new layout-weighted performance metric alongside the dataset for evaluating object detection and segmentation performance in a manner that is more appropriate for robotic grasp applications compared to existing general-purpose performance metrics.

Object object-detection +3

FTPipeHD: A Fault-Tolerant Pipeline-Parallel Distributed Training Framework for Heterogeneous Edge Devices

no code implementations • 6 Oct 2021 • Yuhao Chen, Qianqian Yang, Shibo He, Zhiguo Shi, Jiming Chen

Our numerical results demonstrate that FTPipeHD is 6.8x faster in training than the state-of-the-art method when the computing capacity of the best device is 10x greater than the worst one.

Low Resolution Information Also Matters: Learning Multi-Resolution Representations for Person Re-Identification

no code implementations • 26 May 2021 • Guoqing Zhang, Yuhao Chen, Weisi Lin, Arun Chandran, Xuan Jing

As a prevailing task in the field of video surveillance and forensics, person re-identification (re-ID) aims to match person images captured from non-overlapped cameras.

Person Re-Identification Super-Resolution +1

TIPCB: A Simple but Effective Part-based Convolutional Baseline for Text-based Person Search

1 code implementation • 25 May 2021 • Yuhao Chen, Guoqing Zhang, Yujiang Lu, Zhenxing Wang, Yuhui Zheng, Ruili Wang

Text-based person search is a sub-task in the field of image retrieval, which aims to retrieve target person images according to a given textual description.

Ranked #11 on Text based Person Retrieval on CUHK-PEDES

Image Retrieval Person Search +3

Reference-Aided Part-Aligned Feature Disentangling for Video Person Re-Identification

no code implementations • 21 Mar 2021 • Guoqing Zhang, Yuhao Chen, Yang Dai, Yuhui Zheng, Yi Wu

Due to the inaccurate person detections and pose changes, pedestrian misalignment significantly increases the difficulty of feature extraction and matching.

Video-Based Person Re-Identification

Quantization in Relative Gradient Angle Domain For Building Polygon Estimation

no code implementations • 10 Jul 2020 • Yuhao Chen, Yifan Wu, Linlin Xu, Alexander Wong

In this paper, we leverage the performance of CNNs, and propose a module that uses prior knowledge of building corners to create angular and concise building polygons from CNN segmentation outputs.

Quantization

Plant Stem Segmentation Using Fast Ground Truth Generation

no code implementations • 24 Jan 2020 • Changye Yang, Sriram Baireddy, Yuhao Chen, Enyu Cai, Denise Caldwell, Valérian Méline, Anjali S. Iyer-Pascuzzi, Edward J. Delp

Analysis of the shape of plants can potentially be used to accurately quantify the degree of wilting.

A Voice Interactive Multilingual Student Support System using IBM Watson

no code implementations • 20 Dec 2019 • Kennedy Ralston, Yuhao Chen, Haruna Isah, Farhana Zulkernine

The chatbot could also be adapted for use in other application areas such as student info-centers, government kiosks, and mental health support systems.

Chatbot