no code implementations • 6 Jun 2024 • Max Mueller-Eberstein, Dianna Yee, Karren Yang, Gautam Varma Mantena, Colin Lea
Parameter-efficient fine-tuning (PEFT) for personalizing automatic speech recognition (ASR) has recently shown promise for adapting general population models to atypical speech.
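The PEFT idea above can be sketched with a low-rank adapter in the style of LoRA — one common PEFT method, used here only as an illustrative instance, not necessarily the exact method the paper evaluates. The class name and shapes are hypothetical:

```python
import numpy as np

# Minimal LoRA-style adapter sketch (assumption: LoRA stands in for PEFT
# generally; the paper may study other adapter variants).
class LoRALinear:
    def __init__(self, weight, rank=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        self.weight = weight                           # frozen base weight, shape (out, in)
        out_dim, in_dim = weight.shape
        # only A and B are trained during personalization
        self.A = rng.normal(0.0, 0.02, (rank, in_dim))  # down-projection
        self.B = np.zeros((out_dim, rank))              # up-projection, zero-initialized
        self.scale = alpha / rank

    def __call__(self, x):
        # base (general-population) path plus low-rank personalized update
        return x @ self.weight.T + self.scale * (x @ self.A.T) @ self.B.T
```

Because `B` starts at zero, the adapted layer initially reproduces the general-population model exactly; personalization then only updates the small `A`/`B` matrices rather than the full weight.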
no code implementations • 8 Jun 2023 • Colin Lea, Dianna Yee, Jaya Narain, Zifang Huang, Lauren Tooley, Jeffrey P. Bigham, Leah Findlater
Many consumer speech recognition systems are not tuned for people with speech disabilities, resulting in poor recognition and user experience, especially for severe speech differences.
no code implementations • 15 Feb 2022 • Colin Lea, Zifang Huang, Dhruv Jain, Lauren Tooley, Zeinab Liaghat, Shrinath Thelapurath, Leah Findlater, Jeffrey P. Bigham
Voice assistants have become an essential tool for people with various disabilities because they enable complex phone- or tablet-based interactions without the need for fine-grained motor control, such as with touchscreens.
no code implementations • 18 Jun 2021 • Vikramjit Mitra, Zifang Huang, Colin Lea, Lauren Tooley, Sarah Wu, Darren Botten, Ashwini Palekar, Shrinath Thelapurath, Panayiotis Georgiou, Sachin Kajarekar, Jeffrey Bigham

Dysfluencies and variations in speech pronunciation can severely degrade speech recognition performance, and for many individuals with moderate-to-severe speech disorders, voice operated systems do not work.
no code implementations • 24 Feb 2021 • Colin Lea, Vikramjit Mitra, Aparna Joshi, Sachin Kajarekar, Jeffrey P. Bigham
The ability to automatically detect stuttering events in speech could help speech pathologists track an individual's fluency over time or help improve speech recognition systems for people with atypical speech patterns.
no code implementations • 11 Aug 2020 • Alexander Richard, Colin Lea, Shugao Ma, Juergen Gall, Fernando de la Torre, Yaser Sheikh
Codec Avatars are a recent class of learned, photorealistic face models that accurately represent the geometry and texture of a person in 3D (i.e., for virtual reality), and are almost indistinguishable from video.
5 code implementations • CVPR 2017 • Colin Lea, Michael D. Flynn, Rene Vidal, Austin Reiter, Gregory D. Hager
The ability to identify and temporally segment fine-grained human actions throughout a video is crucial for robotics, surveillance, education, and beyond.
1 code implementation • 29 Aug 2016 • Colin Lea, Rene Vidal, Austin Reiter, Gregory D. Hager
The dominant paradigm for video-based action segmentation is composed of two steps: first, for each frame, compute low-level features using Dense Trajectories or a Convolutional Neural Network that encode spatiotemporal information locally, and second, input these features into a classifier that captures high-level temporal relationships, such as a Recurrent Neural Network (RNN).
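The two-step paradigm above can be sketched end to end. The weights and shapes below are hypothetical stand-ins: a linear-plus-ReLU map stands in for the low-level feature extractor (Dense Trajectories / CNN), and a simple temporal smoothing window stands in for the RNN's temporal state:

```python
import numpy as np

# Sketch of the two-step action-segmentation pipeline (hypothetical
# weights; real systems learn these with a CNN and an RNN).
def frame_features(video, W_feat):
    # Step 1: per-frame low-level features encoding local information.
    return np.maximum(video @ W_feat, 0.0)           # (T, raw) -> (T, H), ReLU

def temporal_classify(feats, W_cls, window=5):
    # Step 2: a high-level temporal model; a moving-average context
    # window is a crude stand-in for recurrent temporal relationships.
    kernel = np.ones(window) / window
    smoothed = np.stack(
        [np.convolve(feats[:, h], kernel, mode="same") for h in range(feats.shape[1])],
        axis=1,
    )
    logits = smoothed @ W_cls                        # (T, n_classes)
    return logits.argmax(axis=1)                     # per-frame action labels
```

The separation matters: step 1 sees only one frame at a time, so any segmentation structure (e.g., that actions persist over many frames) must come from the temporal model in step 2.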
Ranked #6 on Action Segmentation on JIGSAWS
no code implementations • 8 Aug 2016 • William Gray Roncal, Colin Lea, Akira Baruah, Gregory D. Hager
Our automated approach improves the local subgraph score by more than four times and the full graph score by 60 percent.
3 code implementations • 20 Jun 2016 • Robert DiPietro, Colin Lea, Anand Malpani, Narges Ahmidi, S. Swaroop Vedula, Gyusung I. Lee, Mija R. Lee, Gregory D. Hager
In contrast, we work on recognizing both gestures and longer, higher-level activities, or maneuvers, and we model the mapping from kinematics to gestures/maneuvers with recurrent neural networks.
Ranked #1 on Surgical Skills Evaluation on MISTIC-SIL
no code implementations • 9 Feb 2016 • Colin Lea, Austin Reiter, Rene Vidal, Gregory D. Hager
We propose a model for action segmentation which combines low-level spatiotemporal features with a high-level segmental classifier.
Ranked #7 on Action Segmentation on JIGSAWS