Search Results for author: Yuhao Chen

Found 37 papers, 4 papers with code

AnimateZoo: Zero-shot Video Generation of Cross-Species Animation via Subject Alignment

no code implementations 7 Apr 2024 Yuanfeng Xu, Yuhao Chen, Zhongzhan Huang, Zijian He, Guangrun Wang, Philip Torr, Liang Lin

In this paper, we present AnimateZoo, a zero-shot diffusion-based video generator to address this challenging cross-species animation issue, aiming to accurately produce animal animations while preserving the background.

Video Editing Video Generation

Domain-Guided Masked Autoencoders for Unique Player Identification

no code implementations 17 Mar 2024 Bavesh Balaji, Jerrin Bright, Sirisha Rambhatla, Yuhao Chen, Alexander Wong, John Zelek, David A Clausi

We further introduce a new spatio-temporal network leveraging our novel d-MAE for unique player identification.

Sports Analytics

Distribution and Depth-Aware Transformers for 3D Human Mesh Recovery

no code implementations 14 Mar 2024 Jerrin Bright, Bavesh Balaji, Harish Prakash, Yuhao Chen, David A Clausi, John Zelek

Precise Human Mesh Recovery (HMR) with in-the-wild data is a formidable challenge and is often hindered by depth ambiguities and reduced precision.

Human Mesh Recovery

Assessing the Impact of Prompting Methods on ChatGPT's Mathematical Capabilities

no code implementations 22 Dec 2023 Yuhao Chen, Chloe Wong, Hanwen Yang, Juan Aguenza, Sai Bhujangari, Benthan Vu, Xun Lei, Amisha Prasad, Manny Fluss, Eric Phuong, Minghao Liu, Raja Kumar, Vanshika Vats, James Davis

This study critically evaluates the efficacy of prompting methods in enhancing the mathematical reasoning capability of large language models (LLMs).

Chatbot GSM8K +4

NutritionVerse-Synth: An Open Access Synthetically Generated 2D Food Scene Dataset for Dietary Intake Estimation

no code implementations 11 Dec 2023 Saeejith Nair, Chi-en Amy Tai, Yuhao Chen, Alexander Wong

As the largest open-source synthetic food dataset, NV-Synth highlights the value of physics-based simulations for enabling scalable and controllable generation of diverse photorealistic meal images to overcome data limitations and drive advancements in automated dietary assessment using computer vision.

FoodFusion: A Latent Diffusion Model for Realistic Food Image Generation

no code implementations 6 Dec 2023 Olivia Markham, Yuhao Chen, Chi-en Amy Tai, Alexander Wong

To address these limitations, we introduce FoodFusion, a Latent Diffusion model engineered specifically for the faithful synthesis of realistic food images from textual descriptions.

Image Generation

Cancer-Net PCa-Gen: Synthesis of Realistic Prostate Diffusion Weighted Imaging Data via Anatomic-Conditional Controlled Latent Diffusion

no code implementations 30 Nov 2023 Aditya Sridhar, Chi-en Amy Tai, Hayden Gunraj, Yuhao Chen, Alexander Wong

In Canada, prostate cancer is the most common form of cancer in men and accounted for 20% of new cancer cases for this demographic in 2022.

HiDiffusion: Unlocking High-Resolution Creativity and Efficiency in Low-Resolution Trained Diffusion Models

no code implementations 29 Nov 2023 Shen Zhang, Zhaowei Chen, Zhenyu Zhao, Zhenyuan Chen, Yao Tang, Yuhao Chen, Wengang Cao, Jiajun Liang

We introduce HiDiffusion, a tuning-free framework comprised of Resolution-Aware U-Net (RAU-Net) and Modified Shifted Window Multi-head Self-Attention (MSW-MSA) to enable pretrained large text-to-image diffusion models to efficiently generate high-resolution images (e.g., 1024$\times$1024) that surpass the training image resolution.

Attribute Image Generation

Confidant: Customizing Transformer-based LLMs via Collaborative Edge Training

no code implementations 22 Nov 2023 Yuhao Chen, Yuxuan Yan, Qianqian Yang, Yuanchao Shu, Shibo He, Jiming Chen

Transformer-based large language models (LLMs) have demonstrated impressive capabilities in a variety of natural language processing (NLP) tasks.

NutritionVerse-Real: An Open Access Manually Collected 2D Food Scene Dataset for Dietary Intake Estimation

no code implementations 20 Nov 2023 Chi-en Amy Tai, Saeejith Nair, Olivia Markham, Matthew Keller, Yifan Wu, Yuhao Chen, Alexander Wong

Dietary intake estimation plays a crucial role in understanding the nutritional habits of individuals and populations, aiding in the prevention and management of diet-related health issues.

Management

AccEPT: An Acceleration Scheme for Speeding Up Edge Pipeline-parallel Training

no code implementations 10 Nov 2023 Yuhao Chen, Yuxuan Yan, Qianqian Yang, Yuanchao Shu, Shibo He, Zhiguo Shi, Jiming Chen

Moreover, we propose a bit-level computation-efficient data compression scheme to compress the data to be transmitted between devices during training.

Data Compression

NAS-NeRF: Generative Neural Architecture Search for Neural Radiance Fields

no code implementations 25 Sep 2023 Saeejith Nair, Yuhao Chen, Mohammad Javad Shafiee, Alexander Wong

Thus, there is a need to dynamically optimize the neural network component of NeRFs to achieve a balance between computational complexity and specific targets for synthesis quality.

Neural Architecture Search Novel View Synthesis +1

NutritionVerse: Empirical Study of Various Dietary Intake Estimation Approaches

no code implementations 14 Sep 2023 Chi-en Amy Tai, Matthew Keller, Saeejith Nair, Yuhao Chen, Yifan Wu, Olivia Markham, Krish Parmar, Pengcheng Xi, Heather Keller, Sharon Kirkpatrick, Alexander Wong

Recent work has focused on using computer vision and machine learning to automatically estimate dietary intake from food images, but the lack of comprehensive datasets with diverse viewpoints, modalities and food annotations hinders the accuracy and realism of such methods.

Jersey Number Recognition using Keyframe Identification from Low-Resolution Broadcast Videos

no code implementations 12 Sep 2023 Bavesh Balaji, Jerrin Bright, Harish Prakash, Yuhao Chen, David A Clausi, John Zelek

To address these issues, we propose a robust keyframe identification module that extracts frames containing essential high-level information about the jersey number.

Mitigating Motion Blur for Robust 3D Baseball Player Pose Modeling for Pitch Analysis

no code implementations 2 Sep 2023 Jerrin Bright, Yuhao Chen, John Zelek

The findings highlight the effectiveness of our method in mitigating the challenges posed by motion blur, thereby enhancing the overall quality of pose estimation.

3D Pose Estimation Data Augmentation +1

The Model Inversion Eavesdropping Attack in Semantic Communication Systems

no code implementations 8 Aug 2023 Yuhao Chen, Qianqian Yang, Zhiguo Shi, Jiming Chen

In recent years, semantic communication has been a popular research topic for its superiority in communication efficiency.

Transferring Knowledge for Food Image Segmentation using Transformers and Convolutions

no code implementations 15 Jun 2023 Grant Sinha, Krish Parmar, Hilda Azimi, Amy Tai, Yuhao Chen, Alexander Wong, Pengcheng Xi

To address these issues, two models are trained and compared, one based on convolutional neural networks and the other on Bidirectional Encoder representation for Image Transformers (BEiT).

Image Segmentation Segmentation +1

Deep Joint Source-Channel Coding for Wireless Image Transmission with Entropy-Aware Adaptive Rate Control

no code implementations 5 Jun 2023 Weixuan Chen, Yuhao Chen, Qianqian Yang, Chongwen Huang, Qian Wang, Zhaoyang Zhang

Adaptive rate control for deep joint source and channel coding (JSCC) is considered an effective approach to transmit sufficient information in scenarios with limited communication resources.

Fast GraspNeXt: A Fast Self-Attention Neural Network Architecture for Multi-task Learning in Computer Vision Tasks for Robotic Grasping on the Edge

no code implementations 21 Apr 2023 Alexander Wong, Yifan Wu, Saad Abbasi, Saeejith Nair, Yuhao Chen, Mohammad Javad Shafiee

As such, the design of highly efficient multi-task deep neural network architectures tailored for computer vision tasks for robotic grasping on the edge is highly desired for widespread adoption in manufacturing environments.

Multi-Task Learning Robotic Grasping

NutritionVerse-3D: A 3D Food Model Dataset for Nutritional Intake Estimation

no code implementations 12 Apr 2023 Chi-en Amy Tai, Matthew Keller, Mattie Kerrigan, Yuhao Chen, Saeejith Nair, Pengcheng Xi, Alexander Wong

Unlike existing datasets, a collection of 3D models with nutritional information allows for view synthesis to create an infinite number of 2D images for any given viewpoint/camera angle along with the associated nutritional information.

Nutrition

NutritionVerse-Thin: An Optimized Strategy for Enabling Improved Rendering of 3D Thin Food Models

no code implementations 12 Apr 2023 Chi-en Amy Tai, Jason Li, Sriram Kumar, Saeejith Nair, Yuhao Chen, Pengcheng Xi, Alexander Wong

With the growth in capabilities of generative models, there has been growing interest in using photo-realistic renders of common 3D food items to improve downstream tasks such as food printing, nutrition prediction, or management of food wastage.

Management Nutrition

ShapeShift: Superquadric-based Object Pose Estimation for Robotic Grasping

no code implementations 10 Apr 2023 E. Zhixuan Zeng, Yuhao Chen, Alexander Wong

To address these challenges, this paper proposes ShapeShift, a superquadric-based framework for object pose estimation that predicts the object's pose relative to a primitive shape which is fitted to the object.

Object Pose Estimation +1

Boosting Semi-Supervised Learning by Exploiting All Unlabeled Data

1 code implementation CVPR 2023 Yuhao Chen, Xin Tan, Borui Zhao, Zhaowei Chen, RenJie Song, Jiajun Liang, Xuequan Lu

ANL introduces the additional negative pseudo-label for all unlabeled data to leverage low-confidence examples.

Pseudo Label

MMRNet: Improving Reliability for Multimodal Object Detection and Segmentation for Bin Picking via Multimodal Redundancy

no code implementations 19 Oct 2022 Yuhao Chen, Hayden Gunraj, E. Zhixuan Zeng, Robbie Meyer, Maximilian Gilles, Alexander Wong

We also demonstrate that our MC score is a more reliable indicator for outputs during inference time compared to the model-generated confidence scores, which are often over-confident.

Ensemble Learning object-detection +1

MetaGraspNet: A Large-Scale Benchmark Dataset for Scene-Aware Ambidextrous Bin Picking via Physics-based Metaverse Synthesis

no code implementations 8 Aug 2022 Maximilian Gilles, Yuhao Chen, Tim Robin Winter, E. Zhixuan Zeng, Alexander Wong

Autonomous bin picking poses significant challenges to vision-driven robotic systems given the complexity of the problem, ranging from various sensor modalities, to highly entangled object layouts, to diverse item properties and gripper types.

Keypoint Detection Object +2

Demo: low-power communications based on RIS and AI for 6G

no code implementations 21 May 2022 Mingyao Cui, Zidong Wu, Yuhao Chen, Shenheng Xu, Fan Yang, Linglong Dai

By jointly designing the hardware and software, this prototype can realize real-time 4K video transmission with much reduced power consumption.

4k

MetaGraspNet_v0: A Large-Scale Benchmark Dataset for Vision-driven Robotic Grasping via Physics-based Metaverse Synthesis

1 code implementation 29 Dec 2021 Yuhao Chen, E. Zhixuan Zeng, Maximilian Gilles, Alexander Wong

We also propose a new layout-weighted performance metric alongside the dataset for evaluating object detection and segmentation performance in a manner that is more appropriate for robotic grasp applications compared to existing general-purpose performance metrics.

Object object-detection +3

FTPipeHD: A Fault-Tolerant Pipeline-Parallel Distributed Training Framework for Heterogeneous Edge Devices

no code implementations 6 Oct 2021 Yuhao Chen, Qianqian Yang, Shibo He, Zhiguo Shi, Jiming Chen

Our numerical results demonstrate that FTPipeHD is 6.8x faster in training than the state-of-the-art method when the computing capacity of the best device is 10x greater than that of the worst one.

Low Resolution Information Also Matters: Learning Multi-Resolution Representations for Person Re-Identification

no code implementations 26 May 2021 Guoqing Zhang, Yuhao Chen, Weisi Lin, Arun Chandran, Xuan Jing

As a prevailing task in the video surveillance and forensics fields, person re-identification (re-ID) aims to match person images captured from non-overlapped cameras.

Person Re-Identification Super-Resolution +1

TIPCB: A Simple but Effective Part-based Convolutional Baseline for Text-based Person Search

1 code implementation 25 May 2021 Yuhao Chen, Guoqing Zhang, Yujiang Lu, Zhenxing Wang, Yuhui Zheng, Ruili Wang

Text-based person search is a sub-task in the field of image retrieval, which aims to retrieve target person images according to a given textual description.

Image Retrieval Person Search +3

Reference-Aided Part-Aligned Feature Disentangling for Video Person Re-Identification

no code implementations 21 Mar 2021 Guoqing Zhang, Yuhao Chen, Yang Dai, Yuhui Zheng, Yi Wu

Due to the inaccurate person detections and pose changes, pedestrian misalignment significantly increases the difficulty of feature extraction and matching.

Video-Based Person Re-Identification

Quantization in Relative Gradient Angle Domain For Building Polygon Estimation

no code implementations 10 Jul 2020 Yuhao Chen, Yifan Wu, Linlin Xu, Alexander Wong

In this paper, we leverage the performance of CNNs, and propose a module that uses prior knowledge of building corners to create angular and concise building polygons from CNN segmentation outputs.

Quantization

A Voice Interactive Multilingual Student Support System using IBM Watson

no code implementations 20 Dec 2019 Kennedy Ralston, Yuhao Chen, Haruna Isah, Farhana Zulkernine

The chatbot could also be adapted for use in other application areas such as student info-centers, government kiosks, and mental health support systems.

Chatbot

Locating Objects Without Bounding Boxes

6 code implementations CVPR 2019 Javier Ribera, David Güera, Yuhao Chen, Edward J. Delp

In these networks, the training procedure usually requires providing bounding boxes or the maximum number of expected objects.

Object Object Localization +1
