Search Results for author: Nicholas J. Bryan

Found 11 papers, 4 papers with code

DITTO: Diffusion Inference-Time T-Optimization for Music Generation

no code implementations • 22 Jan 2024 • Zachary Novack, Julian McAuley, Taylor Berg-Kirkpatrick, Nicholas J. Bryan

We propose Diffusion Inference-Time T-Optimization (DITTO), a general-purpose framework for controlling pre-trained text-to-music diffusion models at inference time by optimizing initial noise latents.

Computational Efficiency · Music Generation
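A minimal PyTorch sketch of the inference-time optimization idea above, assuming hypothetical stand-ins `sample_fn` (a differentiable diffusion sampler) and `control_loss` (a differentiable control target); this illustrates the general approach, not the authors' implementation.

```python
# Minimal sketch (assumption, not the authors' code): treat the initial noise
# latent of a frozen text-to-music diffusion model as the optimization variable
# and update it so the sampled audio minimizes a differentiable control loss.
import torch

def optimize_initial_noise(model, sample_fn, control_loss, text_emb,
                           latent_shape, steps=100, lr=1e-2):
    x_T = torch.randn(latent_shape, requires_grad=True)   # initial noise latent
    opt = torch.optim.Adam([x_T], lr=lr)

    for _ in range(steps):
        opt.zero_grad()
        audio = sample_fn(model, x_T, text_emb)   # gradients flow back to x_T only
        loss = control_loss(audio)                # e.g. an intensity or melody target
        loss.backward()
        opt.step()

    with torch.no_grad():
        return sample_fn(model, x_T, text_emb)    # final generation from optimized noise
```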

Emotion Embedding Spaces for Matching Music to Stories

1 code implementation • 26 Nov 2021 • Minz Won, Justin Salamon, Nicholas J. Bryan, Gautham J. Mysore, Xavier Serra

Content creators often use music to enhance their stories, as it can be a powerful tool to convey emotion.

Cross-Modal Retrieval · Metric Learning · +1

Neural Pitch-Shifting and Time-Stretching with Controllable LPCNet

1 code implementation • 5 Oct 2021 • Max Morrison, Zeyu Jin, Nicholas J. Bryan, Juan-Pablo Caceres, Bryan Pardo

Pitch-shifting and time-stretching are fundamental audio editing operations with applications in speech manipulation, audio-visual synchronization, and singing voice editing and synthesis.

Audio-Visual Synchronization
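For orientation, both operations can be performed with conventional DSP via librosa; the snippet below is only a generic baseline illustration (file names are placeholders), not the controllable LPCNet approach of the paper.

```python
# Baseline illustration of pitch-shifting and time-stretching with librosa
# (conventional DSP, not the paper's controllable LPCNet method).
import librosa
import soundfile as sf

y, sr = librosa.load("speech.wav", sr=None)          # placeholder input path

y_shifted = librosa.effects.pitch_shift(y, sr=sr, n_steps=2)   # up 2 semitones
y_stretched = librosa.effects.time_stretch(y, rate=0.8)        # longer / slower

sf.write("speech_pitch_up.wav", y_shifted, sr)
sf.write("speech_slower.wav", y_stretched, sr)
```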

Don't Separate, Learn to Remix: End-to-End Neural Remixing with Joint Optimization

no code implementations • 28 Jul 2021 • Haici Yang, Shivani Firodiya, Nicholas J. Bryan, Minje Kim

In this work, we learn to remix music directly by repurposing Conv-TasNet, a well-known source separation model, into two neural remixing architectures.

Data Augmentation
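A rough sketch of the remix-by-separation idea, assuming a pre-trained `separator` that returns per-source estimates; the paper's contribution is to train the separation and gain stages jointly on the remixed output, which this generic sketch does not reproduce.

```python
# Rough sketch (assumption about the general idea, not the paper's joint
# Conv-TasNet-based architectures): separate a mixture into stems, rescale
# each stem by a user-specified gain, and sum back into a remix.
import torch

def remix(separator, mixture, gains):
    """mixture: (batch, samples); gains: iterable of per-source scale factors."""
    stems = separator(mixture)                         # (batch, num_sources, samples)
    g = torch.as_tensor(gains, dtype=stems.dtype).view(1, -1, 1)
    return (stems * g).sum(dim=1)                      # remixed signal, (batch, samples)

# End-to-end training would supervise this remixed output directly, so
# separation errors that do not affect the remix are not penalized.
```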

Differentiable Signal Processing With Black-Box Audio Effects

2 code implementations • 11 May 2021 • Marco A. Martínez Ramírez, Oliver Wang, Paris Smaragdis, Nicholas J. Bryan

We present a data-driven approach to automate audio signal processing by incorporating stateful, third-party audio effects as layers within a deep neural network.

Audio Signal Processing
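One way such a black-box layer can be made trainable is to approximate parameter gradients with a stochastic finite-difference scheme (SPSA) inside a custom autograd Function; the sketch below illustrates that general idea under the assumption that `effect_fn` returns a tensor, and is not the authors' released code.

```python
# Illustrative sketch: wrap a non-differentiable, third-party audio effect so it
# can sit inside a PyTorch graph, approximating parameter gradients with SPSA.
# This mirrors the general idea of gradient approximation, not the paper's code.
import torch

class BlackBoxEffect(torch.autograd.Function):
    @staticmethod
    def forward(ctx, audio, params, effect_fn, eps=1e-3):
        ctx.save_for_backward(audio, params)
        ctx.effect_fn, ctx.eps = effect_fn, eps
        return effect_fn(audio, params)                # assumed to return a tensor

    @staticmethod
    def backward(ctx, grad_out):
        audio, params = ctx.saved_tensors
        effect_fn, eps = ctx.effect_fn, ctx.eps
        # SPSA: one random simultaneous perturbation of all effect parameters.
        delta = torch.sign(torch.randn_like(params))
        y_plus = effect_fn(audio, params + eps * delta)
        y_minus = effect_fn(audio, params - eps * delta)
        # Project the directional derivative back onto each parameter.
        scale = ((y_plus - y_minus) * grad_out).sum() / (2 * eps)
        grad_params = scale * delta
        return None, grad_params, None, None

# Usage: wet = BlackBoxEffect.apply(dry_audio, effect_params, my_effect_fn)
```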

Context-Aware Prosody Correction for Text-Based Speech Editing

no code implementations • 16 Feb 2021 • Max Morrison, Lucas Rencker, Zeyu Jin, Nicholas J. Bryan, Juan-Pablo Caceres, Bryan Pardo

Text-based speech editors expedite the process of editing speech recordings by permitting editing via intuitive cut, copy, and paste operations on a speech transcript.

Denoising

Disentangled Multidimensional Metric Learning for Music Similarity

no code implementations • 9 Aug 2020 • Jongpil Lee, Nicholas J. Bryan, Justin Salamon, Zeyu Jin, Juhan Nam

For this task, it is typically necessary to define a similarity metric to compare one recording to another.

Metric Learning · Specificity · +1
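A small generic sketch of learning such a similarity metric with a triplet margin loss over embeddings; this illustrates standard metric learning, not the paper's disentangled multidimensional model, and the 128-dimensional input features are a placeholder.

```python
# Generic metric-learning illustration (not the paper's disentangled model):
# learn an embedding so that an anchor clip lies closer to a "similar" clip
# than to a "dissimilar" one by at least a margin.
import torch
import torch.nn as nn

embed = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 64))
triplet = nn.TripletMarginLoss(margin=0.2)
opt = torch.optim.Adam(embed.parameters(), lr=1e-4)

def training_step(anchor, positive, negative):
    """Each argument: (batch, 128) precomputed audio features (placeholder size)."""
    loss = triplet(embed(anchor), embed(positive), embed(negative))
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```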

Metric Learning vs Classification for Disentangled Music Representation Learning

no code implementations • 9 Aug 2020 • Jongpil Lee, Nicholas J. Bryan, Justin Salamon, Zeyu Jin, Juhan Nam

For this, we (1) outline past work on the relationship between metric learning and classification, (2) extend this relationship to multi-label data by exploring three different learning approaches and their disentangled versions, and (3) evaluate all models on four tasks (training time, similarity retrieval, auto-tagging, and triplet prediction).

Classification · Disentanglement · +6
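For the classification side of the comparison, a common recipe is to train the backbone as a multi-label tagger and reuse its activations as the similarity embedding; the sketch below is a generic assumption about that recipe, not the paper's exact models, and the feature and tag dimensions are placeholders.

```python
# Classification-based embedding (generic assumption, not the paper's models):
# train with a multi-label tagging loss, then reuse the backbone output as an
# embedding for similarity retrieval, in contrast to direct metric learning.
import torch
import torch.nn as nn
import torch.nn.functional as F

backbone = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 64))
tag_head = nn.Linear(64, 50)              # e.g. 50 music tags (placeholder count)
bce = nn.BCEWithLogitsLoss()

def tagging_loss(x, tags):                # tags: multi-hot targets, (batch, 50)
    return bce(tag_head(backbone(x)), tags)

def embedding(x):                         # used for similarity retrieval
    return F.normalize(backbone(x), dim=-1)
```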

Controllable Neural Prosody Synthesis

no code implementations • 7 Aug 2020 • Max Morrison, Zeyu Jin, Justin Salamon, Nicholas J. Bryan, Gautham J. Mysore

Speech synthesis has recently seen significant improvements in fidelity, driven by the advent of neural vocoders and neural prosody generators.

Speech Synthesis

Scene-Aware Audio Rendering via Deep Acoustic Analysis

no code implementations • 14 Nov 2019 • Zhenyu Tang, Nicholas J. Bryan, Dingzeyu Li, Timothy R. Langlois, Dinesh Manocha

We present a new method to capture the acoustic characteristics of real-world rooms using commodity devices, and use the captured characteristics to generate similar sounding sources with virtual models.

Sound · Graphics · Multimedia · Audio and Speech Processing
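Once a room's acoustics are captured or estimated as an impulse response, the final rendering step reduces to convolving a dry source with that response; the snippet below shows only that generic step (mono signals, placeholder file names), not the paper's deep acoustic analysis pipeline.

```python
# Generic final rendering step (an assumption for illustration, not the paper's
# deep acoustic-analysis method): convolve a dry, mono source with a room
# impulse response so it sounds as if recorded in the target room.
import numpy as np
import soundfile as sf
from scipy.signal import fftconvolve

dry, sr = sf.read("dry_source.wav")                  # placeholder paths, mono
rir, sr_rir = sf.read("room_impulse_response.wav")
assert sr == sr_rir, "resample so the source and RIR share one sample rate"

wet = fftconvolve(dry, rir)                          # reverberant rendering
wet /= np.max(np.abs(wet)) + 1e-9                    # simple peak normalization
sf.write("rendered_in_room.wav", wet, sr)
```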
