Search Results for author: Yi Luo

Found 63 papers, 27 papers with code

Semantic-Rearrangement-Based Multi-Level Alignment for Domain Generalized Segmentation

no code implementations • 21 Apr 2024 • Guanlong Jiao, Chenyangguang Zhang, Haonan Yin, Yu Mo, Biqing Huang, Hui Pan, Yi Luo, Jingxian Liu

SRMA first incorporates a Semantic Rearrangement Module (SRM), which conducts semantic region randomization to enhance the diversity of the source domain sufficiently.

Paper
Add Code

Reasoning on Efficient Knowledge Paths: Knowledge Graph Guides Large Language Model for Domain Question Answering

no code implementations • 16 Apr 2024 • Yuqi Wang, Boran Jiang, Yi Luo, Dawei He, Peng Cheng, Liangcai Gao

Especially for questions that require a multi-hop reasoning path, frequent calls to the LLM will consume a lot of computing power.

Hallucination Language Modelling +2

Paper
Add Code

Gull: A Generative Multifunctional Audio Codec

no code implementations • 7 Apr 2024 • Yi Luo, Jianwei Yu, Hangting Chen, Rongzhi Gu, Chao Weng

We introduce Gull, a generative multifunctional audio codec.

Audio Compression Audio Source Separation +3

Paper
Add Code

Ensuring Safe and High-Quality Outputs: A Guideline Library Approach for Language Models

1 code implementation • 18 Mar 2024 • Yi Luo, Zhenghao Lin, Yuhao Zhang, Jiashuo Sun, Chen Lin, Chengjin Xu, Xiangdong Su, Yelong Shen, Jian Guo, Yeyun Gong

Subsequently, the retrieval model correlates new inputs with relevant guidelines, which guide LLMs in response generation to ensure safe and high-quality outputs, thereby aligning with human values.

Response Generation Retrieval

Paper
Code

NewsBench: Systematic Evaluation of LLMs for Writing Proficiency and Safety Adherence in Chinese Journalistic Editorial Applications

no code implementations • 29 Feb 2024 • Miao Li, Ming-Bin Chen, Bo Tang, Shengbin Hou, Pengyu Wang, Haiying Deng, Zhiyu Li, Feiyu Xiong, Keming Mao, Peng Cheng, Yi Luo

This study presents NewsBench, a novel benchmark framework developed to evaluate the capability of Large Language Models (LLMs) in Chinese Journalistic Writing Proficiency (JWP) and their Safety Adherence (SA), addressing the gap between journalistic ethics and the risks associated with AI utilization.

Ethics

Paper
Add Code

Entire Chain Uplift Modeling with Context-Enhanced Learning for Intelligent Marketing

1 code implementation • 4 Feb 2024 • Yinqiu Huang, Shuli Wang, Min Gao, Xue Wei, Changhao Li, Chuan Luo, Yinhua Zhu, Xiong Xiao, Yi Luo

ECUP consists of two primary components: 1) the Entire Chain-Enhanced Network, which utilizes user behavior patterns to estimate ITE throughout the entire chain space, models the various impacts of treatments on each task, and integrates task prior information to enhance context awareness across all stages, capturing the impact of treatment on different tasks, and 2) the Treatment-Enhanced Network, which facilitates fine-grained treatment modeling through bit-level feature interactions, thereby enabling adaptive feature adjustment.

Marketing

Paper
Code

CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models

1 code implementation • 30 Jan 2024 • Yuanjie Lyu, Zhiyu Li, Simin Niu, Feiyu Xiong, Bo Tang, Wenjin Wang, Hao Wu, Huanyong Liu, Tong Xu, Enhong Chen, Yi Luo, Peng Cheng, Haiying Deng, Zhonghao Wang, Zijia Lu

For each of these CRUD categories, we have developed comprehensive datasets to evaluate the performance of RAG systems.

Question Answering Retrieval

112

Paper
Code

Subnetwork-to-go: Elastic Neural Network with Dynamic Training and Customizable Inference

no code implementations • 6 Dec 2023 • Kai Li, Yi Luo

Deploying neural networks to different devices or platforms is in general challenging, especially when the model size is large or model complexity is high.

Music Source Separation

Paper
Add Code

AutoPrep: An Automatic Preprocessing Framework for In-the-Wild Speech Data

no code implementations • 25 Sep 2023 • Jianwei Yu, Hangting Chen, Yanyao Bian, Xiang Li, Yi Luo, Jinchuan Tian, Mengyang Liu, Jiayi Jiang, Shuai Wang

To address this issue, we introduce an automatic in-the-wild speech data preprocessing framework (AutoPrep) in this paper, which is designed to enhance speech quality, generate speaker labels, and produce transcriptions automatically.

Automatic Speech Recognition Speech Enhancement +3

Paper
Add Code

ReZero: Region-customizable Sound Extraction

no code implementations • 31 Aug 2023 • Rongzhi Gu, Yi Luo

Being a solution to the R-SE task, the proposed ReZero framework includes (1) definitions of different types of spatial regions, (2) methods for region feature extraction and aggregation, and (3) a multi-channel extension of the band-split RNN (BSRNN) model specified for the R-SE task.

Paper
Add Code

Ultra Dual-Path Compression For Joint Echo Cancellation And Noise Suppression

1 code implementation • 21 Aug 2023 • Hangting Chen, Jianwei Yu, Yi Luo, Rongzhi Gu, Weihua Li, Zhuocheng Lu, Chao Weng

Echo cancellation and noise reduction are essential for full-duplex communication, yet most existing neural networks have high computational costs and are inflexible in tuning model complexity.

Dimensionality Reduction

Paper
Code

The Sound Demixing Challenge 2023 – Cinematic Demixing Track

1 code implementation • 14 Aug 2023 • Stefan Uhlich, Giorgio Fabbro, Masato Hirano, Shusuke Takahashi, Gordon Wichern, Jonathan Le Roux, Dipam Chakraborty, Sharada Mohanty, Kai Li, Yi Luo, Jianwei Yu, Rongzhi Gu, Roman Solovyev, Alexander Stempkovskiy, Tatiana Habruseva, Mikhail Sukhovei, Yuki Mitsufuji

A significant source of this improvement was making the simulated data better match real cinematic audio, which we further investigate in detail.

Paper
Code

The Sound Demixing Challenge 2023 – Music Demixing Track

2 code implementations • 14 Aug 2023 • Giorgio Fabbro, Stefan Uhlich, Chieh-Hsin Lai, Woosung Choi, Marco Martínez-Ramírez, WeiHsiang Liao, Igor Gadelha, Geraldo Ramos, Eddie Hsu, Hugo Rodrigues, Fabian-Robert Stöter, Alexandre Défossez, Yi Luo, Jianwei Yu, Dipam Chakraborty, Sharada Mohanty, Roman Solovyev, Alexander Stempkovskiy, Tatiana Habruseva, Nabarun Goswami, Tatsuya Harada, Minseok Kim, Jun Hyung Lee, Yuanliang Dong, Xinran Zhang, Jiafeng Liu, Yuki Mitsufuji

We propose a formalization of the errors that can occur in the design of a training dataset for MSS systems and introduce two new datasets that simulate such errors: SDXDB23_LabelNoise and SDXDB23_Bleeding.

Music Source Separation

486

Paper
Code

LAW-Diffusion: Complex Scene Generation by Diffusion with Layouts

no code implementations • ICCV 2023 • BinBin Yang, Yi Luo, Ziliang Chen, Guangrun Wang, Xiaodan Liang, Liang Lin

Thanks to the rapid development of diffusion models, unprecedented progress has been witnessed in image synthesis.

Layout-to-Image Generation Object +1

Paper
Add Code

Graph Entropy Minimization for Semi-supervised Node Classification

1 code implementation • 31 May 2023 • Yi Luo, Guangchun Luo, Ke Qin, Aiguo Chen

Node classifiers are required to comprehensively reduce prediction errors, training resources, and inference latency in the industry.

Ranked #9 on Node Classification on CiteSeer with Public Split: fixed 20 nodes per class

Classification Knowledge Distillation +1

Paper
Code

Enhancing Chain-of-Thoughts Prompting with Iterative Bootstrapping in Large Language Models

1 code implementation • 23 Apr 2023 • Jiashuo Sun, Yi Luo, Yeyun Gong, Chen Lin, Yelong Shen, Jian Guo, Nan Duan

By utilizing iterative bootstrapping, our approach enables LLMs to autonomously rectify errors, resulting in more precise and comprehensive reasoning chains.

Paper
Code

High Fidelity Speech Enhancement with Band-split RNN

1 code implementation • 1 Dec 2022 • Jianwei Yu, Yi Luo, Hangting Chen, Rongzhi Gu, Chao Weng

Despite the rapid progress in speech enhancement (SE) research, enhancing the quality of desired speech in environments with strong noise and interfering speakers remains challenging.

Speech Enhancement Vocal Bursts Intensity Prediction

Paper
Code

Unifying Label-inputted Graph Neural Networks with Deep Equilibrium Models

2 code implementations • 19 Nov 2022 • Yi Luo, Guiduo Duan, Guangchun Luo, Aiguo Chen

The unification facilitates the exchange between the two subdomains and inspires more studies.

Node Classification

Paper
Code

3D Matting: A Benchmark Study on Soft Segmentation Method for Pulmonary Nodules Applied in Computed Tomography

no code implementations • 11 Oct 2022 • Lin Wang, Xiufen Ye, Donghao Zhang, Wanji He, Lie Ju, Yi Luo, Huan Luo, Xin Wang, Wei Feng, Kaimin Song, Xin Zhao, ZongYuan Ge

In this work, we introduce image matting into 3D scenes and use the alpha matte, i.e., a soft mask, to describe lesions in a 3D medical image.

Binarization Image Matting

Paper
Add Code

Music Source Separation with Band-split RNN

4 code implementations • 30 Sep 2022 • Yi Luo, Jianwei Yu

The performance of music source separation (MSS) models has been greatly improved in recent years thanks to the development of novel neural network architectures and training pipelines.

Ranked #3 on Music Source Separation on MUSDB18 (using extra training data)

Music Source Separation

121

Paper
Code

Virtual impactor-based label-free bio-aerosol detection using holography and deep learning

no code implementations • 30 Aug 2022 • Yi Luo, Yijie Zhang, Tairan Liu, Alan Yu, Yichen Wu, Aydogan Ozcan

To address this need, we present a mobile and cost-effective label-free bio-aerosol sensor that takes holographic images of flowing particulate matter concentrated by a virtual impactor, which selectively slows down and guides particles larger than ~6 microns to fly through an imaging window.

Paper
Add Code

Massively Parallel Universal Linear Transformations using a Wavelength-Multiplexed Diffractive Optical Network

no code implementations • 13 Aug 2022 • Jingxi Li, Bijie Bai, Yi Luo, Aydogan Ozcan

We report deep learning-based design of a massively parallel broadband diffractive neural network for all-optically performing a large group of arbitrarily-selected, complex-valued linear transformations between an input and output field-of-view, each with N_i and N_o pixels, respectively.

Paper
Add Code

FRA-RIR: Fast Random Approximation of the Image-source Method

2 code implementations • 8 Aug 2022 • Yi Luo, Jianwei Yu

The training of modern speech processing systems often requires a large amount of simulated room impulse response (RIR) data in order to allow the systems to generalize well in real-world, reverberant environments.

Denoising Room Impulse Response (RIR) +1

155

Paper
Code

All-optical image classification through unknown random diffusers using a single-pixel diffractive network

no code implementations • 8 Aug 2022 • Yi Luo, Bijie Bai, Yuhang Li, Ege Cetintas, Aydogan Ozcan

Classification of an object behind a random and unknown scattering medium sets a challenging task for computational imaging and machine vision fields.

Autonomous Driving Image Classification +1

Paper
Add Code

Super-resolution image display using diffractive decoders

no code implementations • 15 Jun 2022 • Cagatay Isil, Deniz Mengu, Yifan Zhao, Anika Tabassum, Jingxi Li, Yi Luo, Mona Jarrahi, Aydogan Ozcan

We report a deep learning-enabled diffractive display design that is based on a jointly-trained pair of an electronic encoder and a diffractive optical decoder to synthesize/project super-resolved images using low-resolution wavefront modulators.

Super-Resolution

Paper
Add Code

To image, or not to image: Class-specific diffractive cameras with all-optical erasure of undesired objects

no code implementations • 26 May 2022 • Bijie Bai, Yi Luo, Tianyi Gan, Jingtian Hu, Yuhang Li, Yifan Zhao, Deniz Mengu, Mona Jarrahi, Aydogan Ozcan

Here, we demonstrate a camera design that performs class-specific imaging of target objects with instantaneous all-optical erasure of other classes of objects.

Privacy Preserving

Paper
Add Code

Analysis of Diffractive Neural Networks for Seeing Through Random Diffusers

no code implementations • 1 May 2022 • Yuhang Li, Yi Luo, Bijie Bai, Aydogan Ozcan

During its training, random diffusers with a range of correlation lengths were used to improve the diffractive network's generalization performance.

Autonomous Driving Image Reconstruction

Paper
Add Code

Inferring from References with Differences for Semi-Supervised Node Classification on Graphs

1 code implementation • Mathematics 2022 • Yi Luo, Guangchun Luo, Ke Yan, Aiguo Chen

Following the application of Deep Learning to graphic data, Graph Neural Networks (GNNs) have become the dominant method for Node Classification on graphs in recent years.

Ranked #1 on Node Classification on Amazon Photo

Node Classification

Paper
Code

Node Representation Learning in Graph via Node-to-Neighbourhood Mutual Information Maximization

1 code implementation • CVPR 2022 • Wei Dong, Junsheng Wu, Yi Luo, ZongYuan Ge, Peng Wang

In this work, we present a simple-yet-effective self-supervised node representation learning strategy via directly maximizing the mutual information between the hidden representations of nodes and their neighbourhood, which can be theoretically justified by its link to graph smoothing.

Node Classification Representation Learning

Paper
Code

A Time-domain Real-valued Generalized Wiener Filter for Multi-channel Neural Separation Systems

1 code implementation • 7 Dec 2021 • Yi Luo

Frequency-domain beamformers have been successful in a wide range of multi-channel neural separation systems in the past years.

Speech Separation

234

Paper
Code

Cascadable all-optical NAND gates using diffractive networks

no code implementations • 2 Nov 2021 • Yi Luo, Deniz Mengu, Aydogan Ozcan

Based on this architecture, we numerically optimized the design of a diffractive neural network composed of 4 passive layers to all-optically perform NAND operation using the diffraction of light, and cascaded these diffractive NAND gates to perform complex logical functions by successively feeding the output of one diffractive NAND gate into another.

Paper
Add Code

Distilling Self-Knowledge From Contrastive Links to Classify Graph Nodes Without Passing Messages

2 code implementations • 16 Jun 2021 • Yi Luo, Aiguo Chen, Ke Yan, Ling Tian

Nowadays, Graph Neural Networks (GNNs) following the Message Passing paradigm have become the dominant way to learn on graphic data.

Ranked #1 on Node Classification on Cora Full

Node Classification Node Property Prediction

Paper
Code

Dynamic imaging and characterization of volatile aerosols in e-cigarette emissions using deep learning-based holographic microscopy

no code implementations • 31 Mar 2021 • Yi Luo, Yichen Wu, Liqiao Li, Yuening Guo, Ege Cetintas, Yifang Zhu, Aydogan Ozcan

To evaluate the effects of e-liquid composition on aerosol dynamics, we measured the volatility of the particles generated by flavorless, nicotine-free e-liquids with various PG/VG volumetric ratios, revealing a negative correlation between the particles' volatility and the volumetric ratio of VG in the e-liquid.

Paper
Add Code

Dual-Path Modeling for Long Recording Speech Separation in Meetings

no code implementations • 23 Feb 2021 • Chenda Li, Zhuo Chen, Yi Luo, Cong Han, Tianyan Zhou, Keisuke Kinoshita, Marc Delcroix, Shinji Watanabe, Yanmin Qian

A transformer-based dual-path system is proposed, which integrates transform layers for global modeling.

Speech Separation

Paper
Add Code

Holographic image reconstruction with phase recovery and autofocusing using recurrent neural networks

no code implementations • 12 Feb 2021 • Luzhe Huang, Tairan Liu, Xilin Yang, Yi Luo, Yair Rivenson, Aydogan Ozcan

Digital holography is one of the most widely used label-free microscopy techniques in biomedical imaging.

Image Reconstruction

Paper
Add Code

Memory-Associated Differential Learning

2 code implementations • 10 Feb 2021 • Yi Luo, Aiguo Chen, Bei Hui, Ke Yan

Conventional Supervised Learning approaches focus on the mapping from input features to output labels.

Ranked #1 on Link Property Prediction on ogbl-ddi

Link Prediction

Paper
Code

Continuous Speech Separation Using Speaker Inventory for Long Multi-talker Recording

no code implementations • 17 Dec 2020 • Cong Han, Yi Luo, Chenda Li, Tianyan Zhou, Keisuke Kinoshita, Shinji Watanabe, Marc Delcroix, Hakan Erdogan, John R. Hershey, Nima Mesgarani, Zhuo Chen

Leveraging additional speaker information to facilitate speech separation has received increasing attention in recent years.

Clustering Speech Separation

Paper
Add Code

Group Communication with Context Codec for Lightweight Source Separation

1 code implementation • 14 Dec 2020 • Yi Luo, Cong Han, Nima Mesgarani

A context codec module, containing a context encoder and a context decoder, is designed as a learnable downsampling and upsampling module to decrease the length of a sequential feature processed by the separation module.

Speech Enhancement Speech Separation

Paper
Code

Drone LAMS: A Drone-based Face Detection Dataset with Large Angles and Many Scenarios

no code implementations • 16 Nov 2020 • Yi Luo, Siyi Chen, X.-G. Ma

This work presents a new drone-based face detection dataset, Drone LAMS, to address the low performance of drone-based face detection in scenarios such as large angles, which are a predominant working condition when a drone flies high.

Face Detection

Paper
Add Code

Integration of speech separation, diarization, and recognition for multi-speaker meetings: System description, comparison, and analysis

no code implementations • 3 Nov 2020 • Desh Raj, Pavel Denisov, Zhuo Chen, Hakan Erdogan, Zili Huang, Maokui He, Shinji Watanabe, Jun Du, Takuya Yoshioka, Yi Luo, Naoyuki Kanda, Jinyu Li, Scott Wisdom, John R. Hershey

Multi-speaker speech recognition of unsegmented recordings has diverse applications such as meeting transcription and automatic subtitle generation.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Paper
Add Code

Integrated Gallium Nitride Nonlinear Photonics

no code implementations • 30 Oct 2020 • Yanzhen Zheng, Changzheng Sun, Bing Xiong, Lai Wang, Zhibiao Hao, Jian Wang, Yanjun Han, Hongtao Li, Jiadong Yu, Yi Luo

Thanks to its high nonlinearity and high refractive index contrast, GaN-on-insulator (GaNOI) is also a promising platform for nonlinear optical applications.

Optics Applied Physics

Paper
Add Code

An End-to-end Architecture of Online Multi-channel Speech Separation

no code implementations • 7 Sep 2020 • Jian Wu, Zhuo Chen, Jinyu Li, Takuya Yoshioka, Zhili Tan, Ed Lin, Yi Luo, Lei Xie

Previously, we introduced a system, called unmixing, fixed-beamformer and extraction (UFE), that was shown to be effective in addressing the speech overlap problem in conversation transcription.

speech-recognition Speech Recognition +1

Paper
Add Code

Deep learning-based holographic polarization microscopy

no code implementations • 1 Jul 2020 • Tairan Liu, Kevin de Haan, Bijie Bai, Yair Rivenson, Yi Luo, Hongda Wang, David Karalli, Hongxiang Fu, Yibo Zhang, John FitzGerald, Aydogan Ozcan

Our analysis shows that a trained deep neural network can extract the birefringence information using both the sample specific morphological features as well as the holographic amplitude and phase distribution.

Medical Diagnosis

Paper
Add Code

Terahertz Pulse Shaping Using Diffractive Surfaces

no code implementations • 30 Jun 2020 • Muhammed Veli, Deniz Mengu, Nezih T. Yardimci, Yi Luo, Jingxi Li, Yair Rivenson, Mona Jarrahi, Aydogan Ozcan

Recent advances in deep learning have been providing non-intuitive solutions to various inverse problems in optics.

Transfer Learning

Paper
Add Code

Spectrally-Encoded Single-Pixel Machine Vision Using Diffractive Networks

no code implementations • 15 May 2020 • Jingxi Li, Deniz Mengu, Nezih T. Yardimci, Yi Luo, Xurong Li, Muhammed Veli, Yair Rivenson, Mona Jarrahi, Aydogan Ozcan

3D engineering of matter has opened up new avenues for designing systems that can perform various computational tasks through light-matter interaction.

General Classification

Paper
Add Code

Separating Varying Numbers of Sources with Auxiliary Autoencoding Loss

no code implementations • 27 Mar 2020 • Yi Luo, Nima Mesgarani

Many recent source separation systems are designed to separate a fixed number of sources out of a mixture.

valid

Paper
Add Code

Continuous speech separation: dataset and analysis

1 code implementation • 30 Jan 2020 • Zhuo Chen, Takuya Yoshioka, Liang Lu, Tianyan Zhou, Zhong Meng, Yi Luo, Jian Wu, Xiong Xiao, Jinyu Li

In this paper, we define continuous speech separation (CSS) as a task of generating a set of non-overlapped speech signals from a continuous audio stream that contains multiple utterances that are partially overlapped by a varying degree.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

129

Paper
Code

End-to-end Microphone Permutation and Number Invariant Multi-channel Speech Separation

2 code implementations • 30 Oct 2019 • Yi Luo, Zhuo Chen, Nima Mesgarani, Takuya Yoshioka

An important problem in ad-hoc microphone speech separation is how to guarantee the robustness of a system with respect to the locations and numbers of microphones.

Speech Separation

234

Paper
Code

Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation

7 code implementations • 14 Oct 2019 • Yi Luo, Zhuo Chen, Takuya Yoshioka

Recent studies in deep learning-based speech separation have proven the superiority of time-domain approaches to conventional time-frequency-based methods.

Ranked #20 on Speech Separation on WSJ0-2mix

Speech Separation

2,105

Paper
Code

FaSNet: Low-latency Adaptive Beamforming for Multi-microphone Audio Processing

1 code implementation • 29 Sep 2019 • Yi Luo, Enea Ceolini, Cong Han, Shih-Chii Liu, Nima Mesgarani

Beamforming has been extensively investigated for multi-channel audio processing tasks.

Speech Enhancement speech-recognition +1

234

Paper
Code

Design of Task-Specific Optical Systems Using Broadband Diffractive Neural Networks

no code implementations • 14 Sep 2019 • Yi Luo, Deniz Mengu, Nezih T. Yardimci, Yair Rivenson, Muhammed Veli, Mona Jarrahi, Aydogan Ozcan

We report a broadband diffractive optical neural network design that simultaneously processes a continuum of wavelengths generated by a temporally-incoherent broadband source to all-optically perform a specific task learned using deep learning.

Paper
Add Code

Class-specific Differential Detection in Diffractive Optical Neural Networks Improves Inference Accuracy

no code implementations • 8 Jun 2019 • Jingxi Li, Deniz Mengu, Yi Luo, Yair Rivenson, Aydogan Ozcan

Similar to ensemble methods practiced in machine learning, we also independently optimized multiple differential diffractive networks that optically project their light onto a common detector plane, and achieved testing accuracies of 98.59%, 91.06% and 51.44% for MNIST, Fashion-MNIST and grayscale CIFAR-10, respectively.

BIG-bench Machine Learning General Classification

Paper
Add Code

Demand Prediction for Electric Vehicle Sharing

no code implementations • 10 Mar 2019 • Man Luo, Hongkai Wen, Yi Luo, Bowen Du, Konstantin Klemmer, Hong-Ming Zhu

Electric Vehicle (EV) sharing systems have recently experienced unprecedented growth across the globe.

Decision Making

Paper
Add Code

Adversarial Defense of Image Classification Using a Variational Auto-Encoder

1 code implementation • 7 Dec 2018 • Yi Luo, Henry Pfister

Deep neural networks are known to be vulnerable to adversarial attacks.

Adversarial Defense General Classification +1

Paper
Code

Response to Comment on "All-optical machine learning using diffractive deep neural networks"

no code implementations • 10 Oct 2018 • Deniz Mengu, Yi Luo, Yair Rivenson, Xing Lin, Muhammed Veli, Aydogan Ozcan

In their Comment, Wei et al. (arXiv:1809.08360v1 [cs.LG]) claim that our original interpretation of Diffractive Deep Neural Networks (D2NN) represents a mischaracterization of the system due to linearity and passivity.

BIG-bench Machine Learning valid

Paper
Add Code

Analysis of Diffractive Optical Neural Networks and Their Integration with Electronic Neural Networks

no code implementations • 3 Oct 2018 • Deniz Mengu, Yi Luo, Yair Rivenson, Aydogan Ozcan

Furthermore, we report the integration of D2NNs with electronic neural networks to create hybrid-classifiers that significantly reduce the number of input pixels into an electronic network using an ultra-compact front-end D2NN with a layer-to-layer distance of a few wavelengths, also reducing the complexity of the successive electronic network.

BIG-bench Machine Learning General Classification

Paper
Add Code

Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation

16 code implementations • 20 Sep 2018 • Yi Luo, Nima Mesgarani

The majority of the previous methods have formulated the separation problem through the time-frequency representation of the mixed signal, which has several drawbacks, including the decoupling of the phase and magnitude of the signal, the suboptimality of time-frequency representation for speech separation, and the long latency in calculating the spectrograms.

Ranked #2 on Multi-task Audio Source Separation on MTASS

Multi-task Audio Source Separation Music Source Separation +3

7,635

Paper
Code

Real-time Single-channel Dereverberation and Separation with Time-domain Audio Separation Network

1 code implementation • ISCA Interspeech 2018 • Yi Luo, Nima Mesgarani

We investigate the recently proposed Time-domain Audio Separation Network (TasNet) in the task of real-time single-channel speech dereverberation.

Ranked #28 on Speech Separation on WSJ0-2mix

Denoising Speech Dereverberation +1

2,105

Paper
Code

TasNet: time-domain audio separation network for real-time, single-channel speech separation

3 code implementations • 1 Nov 2017 • Yi Luo, Nima Mesgarani

We directly model the signal in the time-domain using an encoder-decoder framework and perform the source separation on nonnegative encoder outputs.

Ranked #30 on Speech Separation on WSJ0-2mix

Speech Separation

2,105

Paper
Code

Point Set Registration With Global-Local Correspondence and Transformation Estimation

no code implementations • ICCV 2017 • Su Zhang, Yang Yang, Kun Yang, Yi Luo, Sim-Heng Ong

We present a new point set registration method with global-local correspondence and transformation estimation (GL-CATE).

Paper
Add Code

Speaker-independent Speech Separation with Deep Attractor Network

no code implementations • 12 Jul 2017 • Yi Luo, Zhuo Chen, Nima Mesgarani

A reference point attractor is created in the embedding space to represent each speaker which is defined as the centroid of the speaker in the embedding space.

Speech Separation

Paper
Add Code

Deep attractor network for single-microphone speaker separation

1 code implementation • 27 Nov 2016 • Zhuo Chen, Yi Luo, Nima Mesgarani

We propose a novel deep learning framework for single channel speech separation by creating attractor points in high dimensional embedding space of the acoustic signals which pull together the time-frequency bins corresponding to each source.

Speaker Separation Speech Separation

Paper
Code

Deep Clustering and Conventional Networks for Music Separation: Stronger Together

no code implementations • 18 Nov 2016 • Yi Luo, Zhuo Chen, John R. Hershey, Jonathan Le Roux, Nima Mesgarani

Deep clustering is the first method to handle general audio separation scenarios with multiple sources of the same type and an arbitrary number of sources, performing impressively in speaker-independent speech separation tasks.

Clustering Deep Clustering +3

Paper
Add Code
