Search Results for author: Han Zhang

Found 145 papers, 63 papers with code

QKFormer: Hierarchical Spiking Transformer using Q-K Attention

1 code implementation25 Mar 2024 Chenlin Zhou, Han Zhang, Zhaokun Zhou, Liutao Yu, Liwei Huang, Xiaopeng Fan, Li Yuan, Zhengyu Ma, Huihui Zhou, Yonghong Tian

ii) We incorporate the hierarchical structure, which significantly benefits the performance of both the brain and artificial neural networks, into spiking transformers to obtain multi-scale spiking representation.

FluoroSAM: A Language-aligned Foundation Model for X-ray Image Segmentation

1 code implementation12 Mar 2024 Benjamin D. Killeen, Liam J. Wang, Han Zhang, Mehran Armand, Russell H. Taylor, Greg Osgood, Mathias Unberath

Recently, foundation models (FMs) -- machine learning models trained on large amounts of highly variable data thus enabling broad applicability -- have emerged as promising tools for automated image analysis.

Image Segmentation Semantic Segmentation +1

MUC: Mixture of Uncalibrated Cameras for Robust 3D Human Body Reconstruction

no code implementations8 Mar 2024 Yitao Zhu, Sheng Wang, Mengjie Xu, Zixu Zhuang, Zhixin Wang, Kaidong Wang, Han Zhang, Qian Wang

Next, instead of simply averaging models across views, we train a network to determine the weights of individual views for their fusion, based on the parameters estimated for joints and hands of human body as well as camera positions.

QIS : Interactive Segmentation via Quasi-Conformal Mappings

no code implementations22 Feb 2024 Han Zhang, Daoping Zhang, Lok Ming Lui

In this paper, we propose the quasi-conformal interactive segmentation (QIS) model, which incorporates user input in the form of positive and negative clicks.

Image Segmentation Interactive Segmentation +2

COPR: Continual Human Preference Learning via Optimal Policy Regularization

no code implementations22 Feb 2024 Han Zhang, Lin Gui, Yu Lei, Yuanzhao Zhai, Yehong Zhang, Yulan He, Hui Wang, Yue Yu, Kam-Fai Wong, Bin Liang, Ruifeng Xu

Reinforcement Learning from Human Feedback (RLHF) is commonly utilized to improve the alignment of Large Language Models (LLMs) with human preferences.

Continual Learning

Federated Learning with Dual Attention for Robust Modulation Classification under Attacks

no code implementations19 Jan 2024 Han Zhang, Medhat Elsayed, Majid Bavand, Raimundas Gaigalas, Yigit Ozcan, Melike Erol-Kantarci

To this end, we leverage attention mechanisms as a defense against attacks in FL and propose a robust FL algorithm by integrating the attention mechanisms into the global model aggregation step.

Data Poisoning Federated Learning

Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation

no code implementations11 Jan 2024 Seung Hyun Lee, Yinxiao Li, Junjie Ke, Innfarn Yoo, Han Zhang, Jiahui Yu, Qifei Wang, Fei Deng, Glenn Entis, Junfeng He, Gang Li, Sangpil Kim, Irfan Essa, Feng Yang

Additionally, Parrot employs a joint optimization approach for the T2I model and the prompt expansion network, facilitating the generation of quality-aware text prompts, thus further enhancing the final image quality.

Reinforcement Learning (RL) Text-to-Image Generation

Fundamental Limitation of Semantic Communications: Neural Estimation for Rate-Distortion

no code implementations2 Jan 2024 Dongxu Li, Jianhao Huang, Chuan Huang, Xiaoqi Qin, Han Zhang, Ping Zhang

For the case with unknown semantic source distribution, while only a set of the source samples is available, we propose a neural-network-based method by leveraging the generative networks to learn the semantic source distribution.

Uncertainty-Penalized Reinforcement Learning from Human Feedback with Diverse Reward LoRA Ensembles

no code implementations30 Dec 2023 Yuanzhao Zhai, Han Zhang, Yu Lei, Yue Yu, Kele Xu, Dawei Feng, Bo Ding, Huaimin Wang

Reinforcement learning from human feedback (RLHF) emerges as a promising paradigm for aligning large language models (LLMs).

Uncertainty Quantification

Gemini: A Family of Highly Capable Multimodal Models

no code implementations The Keyword 2023 Gemini Team, Rohan Anil, Sebastian Borgeaud, Yonghui Wu, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Slav Petrov, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, Fabio Viola, Malcolm Reynolds, Yuanzhong Xu, Ryan Doherty, Eli Collins, Clemens Meyer, Eliza Rutherford, Erica Moreira, Kareem Ayoub, Megha Goel, George Tucker, Enrique Piqueras, Maxim Krikun, Iain Barr, Nikolay Savinov, Ivo Danihelka, Becca Roelofs, Anaïs White, Anders Andreassen, Tamara von Glehn, Lakshman Yagati, Mehran Kazemi, Lucas Gonzalez, Misha Khalman, Jakub Sygnowski, Alexandre Frechette, Charlotte Smith, Laura Culp, Lev Proleev, Yi Luan, Xi Chen, James Lottes, Nathan Schucher, Federico Lebron, Alban Rrustemi, Natalie Clay, Phil Crone, Tomas Kocisky, Jeffrey Zhao, Bartek Perz, Dian Yu, Heidi Howard, Adam Bloniarz, Jack W. Rae, Han Lu, Laurent SIfre, Marcello Maggioni, Fred Alcober, Dan Garrette, Megan Barnes, Shantanu Thakoor, Jacob Austin, Gabriel Barth-Maron, William Wong, Rishabh Joshi, Rahma Chaabouni, Deeni Fatiha, Arun Ahuja, Ruibo Liu, Yunxuan Li, Sarah Cogan, Jeremy Chen, Chao Jia, Chenjie Gu, Qiao Zhang, Jordan Grimstad, Ale Jakse Hartman, Martin Chadwick, Gaurav Singh Tomar, Xavier Garcia, Evan Senter, Emanuel Taropa, Thanumalayan Sankaranarayana Pillai, Jacob Devlin, Michael Laskin, Diego de Las Casas, Dasha Valter, Connie Tao, Lorenzo Blanco, Adrià Puigdomènech Badia, David Reitter, Mianna Chen, Jenny Brennan, Clara Rivera, Sergey Brin, Shariq Iqbal, Gabriela Surita, Jane Labanowski, Abhi Rao, Stephanie Winkler, Emilio Parisotto, Yiming Gu, Kate Olszewska, Yujing Zhang, Ravi Addanki, Antoine Miech, Annie Louis, Laurent El Shafey, Denis Teplyashin, Geoff Brown, Elliot Catt, Nithya Attaluri, Jan Balaguer, Jackie Xiang, Pidong Wang, Zoe Ashwood, Anton Briukhov, Albert Webson, Sanjay Ganapathy, Smit Sanghavi, Ajay Kannan, Ming-Wei Chang, Axel Stjerngren, Josip Djolonga, Yuting Sun, Ankur Bapna, Matthew Aitchison, Pedram Pejman, Henryk Michalewski, Tianhe Yu, Cindy Wang, Juliette Love, Junwhan Ahn, Dawn Bloxwich, Kehang Han, Peter Humphreys, Thibault Sellam, James Bradbury, Varun Godbole, Sina Samangooei, Bogdan Damoc, Alex Kaskasoli, Sébastien M. R. Arnold, Vijay Vasudevan, Shubham Agrawal, Jason Riesa, Dmitry Lepikhin, Richard Tanburn, Srivatsan Srinivasan, Hyeontaek Lim, Sarah Hodkinson, Pranav Shyam, Johan Ferret, Steven Hand, Ankush Garg, Tom Le Paine, Jian Li, Yujia Li, Minh Giang, Alexander Neitz, Zaheer Abbas, Sarah York, Machel Reid, Elizabeth Cole, Aakanksha Chowdhery, Dipanjan Das, Dominika Rogozińska, Vitaly Nikolaev, Pablo Sprechmann, Zachary Nado, Lukas Zilka, Flavien Prost, Luheng He, Marianne Monteiro, Gaurav Mishra, Chris Welty, Josh Newlan, Dawei Jia, Miltiadis Allamanis, Clara Huiyi Hu, Raoul de Liedekerke, Justin Gilmer, Carl Saroufim, Shruti Rijhwani, Shaobo Hou, Disha Shrivastava, Anirudh Baddepudi, Alex Goldin, Adnan Ozturel, Albin Cassirer, Yunhan Xu, Daniel Sohn, Devendra Sachan, Reinald Kim Amplayo, Craig Swanson, Dessie Petrova, Shashi Narayan, Arthur Guez, Siddhartha Brahma, Jessica Landon, Miteyan Patel, Ruizhe Zhao, Kevin Villela, Luyu Wang, Wenhao Jia, Matthew Rahtz, Mai Giménez, Legg Yeung, Hanzhao Lin, James Keeling, Petko Georgiev, Diana Mincu, Boxi Wu, Salem Haykal, Rachel Saputro, Kiran Vodrahalli, James Qin, Zeynep Cankara, Abhanshu Sharma, Nick Fernando, Will Hawkins, Behnam Neyshabur, Solomon Kim, Adrian Hutter, Priyanka Agrawal, Alex Castro-Ros, George van den Driessche, Tao Wang, Shuo-Yiin Chang, Paul Komarek, Ross Mcilroy, Mario Lučić, Guodong Zhang, Wael Farhan, Michael Sharman, Paul Natsev, Paul Michel, Yong Cheng, Yamini Bansal, Siyuan Qiao, Kris Cao, Siamak Shakeri, Christina Butterfield, Justin Chung, Paul Kishan Rubenstein, Shivani Agrawal, Arthur Mensch, Kedar Soparkar, Karel Lenc, Timothy Chung, Aedan Pope, Loren Maggiore, Jackie Kay, Priya Jhakra, Shibo Wang, Joshua Maynez, Mary Phuong, Taylor Tobin, Andrea Tacchetti, Maja Trebacz, Kevin Robinson, Yash Katariya, Sebastian Riedel, Paige Bailey, Kefan Xiao, Nimesh Ghelani, Lora Aroyo, Ambrose Slone, Neil Houlsby, Xuehan Xiong, Zhen Yang, Elena Gribovskaya, Jonas Adler, Mateo Wirth, Lisa Lee, Music Li, Thais Kagohara, Jay Pavagadhi, Sophie Bridgers, Anna Bortsova, Sanjay Ghemawat, Zafarali Ahmed, Tianqi Liu, Richard Powell, Vijay Bolina, Mariko Iinuma, Polina Zablotskaia, James Besley, Da-Woon Chung, Timothy Dozat, Ramona Comanescu, Xiance Si, Jeremy Greer, Guolong Su, Martin Polacek, Raphaël Lopez Kaufman, Simon Tokumine, Hexiang Hu, Elena Buchatskaya, Yingjie Miao, Mohamed Elhawaty, Aditya Siddhant, Nenad Tomasev, Jinwei Xing, Christina Greer, Helen Miller, Shereen Ashraf, Aurko Roy, Zizhao Zhang, Ada Ma, Angelos Filos, Milos Besta, Rory Blevins, Ted Klimenko, Chih-Kuan Yeh, Soravit Changpinyo, Jiaqi Mu, Oscar Chang, Mantas Pajarskas, Carrie Muir, Vered Cohen, Charline Le Lan, Krishna Haridasan, Amit Marathe, Steven Hansen, Sholto Douglas, Rajkumar Samuel, Mingqiu Wang, Sophia Austin, Chang Lan, Jiepu Jiang, Justin Chiu, Jaime Alonso Lorenzo, Lars Lowe Sjösund, Sébastien Cevey, Zach Gleicher, Thi Avrahami, Anudhyan Boral, Hansa Srinivasan, Vittorio Selo, Rhys May, Konstantinos Aisopos, Léonard Hussenot, Livio Baldini Soares, Kate Baumli, Michael B. Chang, Adrià Recasens, Ben Caine, Alexander Pritzel, Filip Pavetic, Fabio Pardo, Anita Gergely, Justin Frye, Vinay Ramasesh, Dan Horgan, Kartikeya Badola, Nora Kassner, Subhrajit Roy, Ethan Dyer, Víctor Campos, Alex Tomala, Yunhao Tang, Dalia El Badawy, Elspeth White, Basil Mustafa, Oran Lang, Abhishek Jindal, Sharad Vikram, Zhitao Gong, Sergi Caelles, Ross Hemsley, Gregory Thornton, Fangxiaoyu Feng, Wojciech Stokowiec, Ce Zheng, Phoebe Thacker, Çağlar Ünlü, Zhishuai Zhang, Mohammad Saleh, James Svensson, Max Bileschi, Piyush Patil, Ankesh Anand, Roman Ring, Katerina Tsihlas, Arpi Vezer, Marco Selvi, Toby Shevlane, Mikel Rodriguez, Tom Kwiatkowski, Samira Daruki, Keran Rong, Allan Dafoe, Nicholas FitzGerald, Keren Gu-Lemberg, Mina Khan, Lisa Anne Hendricks, Marie Pellat, Vladimir Feinberg, James Cobon-Kerr, Tara Sainath, Maribeth Rauh, Sayed Hadi Hashemi, Richard Ives, Yana Hasson, Yaguang Li, Eric Noland, Yuan Cao, Nathan Byrd, Le Hou, Qingze Wang, Thibault Sottiaux, Michela Paganini, Jean-Baptiste Lespiau, Alexandre Moufarek, Samer Hassan, Kaushik Shivakumar, Joost van Amersfoort, Amol Mandhane, Pratik Joshi, Anirudh Goyal, Matthew Tung, Andrew Brock, Hannah Sheahan, Vedant Misra, Cheng Li, Nemanja Rakićević, Mostafa Dehghani, Fangyu Liu, Sid Mittal, Junhyuk Oh, Seb Noury, Eren Sezener, Fantine Huot, Matthew Lamm, Nicola De Cao, Charlie Chen, Gamaleldin Elsayed, Ed Chi, Mahdis Mahdieh, Ian Tenney, Nan Hua, Ivan Petrychenko, Patrick Kane, Dylan Scandinaro, Rishub Jain, Jonathan Uesato, Romina Datta, Adam Sadovsky, Oskar Bunyan, Dominik Rabiej, Shimu Wu, John Zhang, Gautam Vasudevan, Edouard Leurent, Mahmoud Alnahlawi, Ionut Georgescu, Nan Wei, Ivy Zheng, Betty Chan, Pam G Rabinovitch, Piotr Stanczyk, Ye Zhang, David Steiner, Subhajit Naskar, Michael Azzam, Matthew Johnson, Adam Paszke, Chung-Cheng Chiu, Jaume Sanchez Elias, Afroz Mohiuddin, Faizan Muhammad, Jin Miao, Andrew Lee, Nino Vieillard, Sahitya Potluri, Jane Park, Elnaz Davoodi, Jiageng Zhang, Jeff Stanway, Drew Garmon, Abhijit Karmarkar, Zhe Dong, Jong Lee, Aviral Kumar, Luowei Zhou, Jonathan Evens, William Isaac, Zhe Chen, Johnson Jia, Anselm Levskaya, Zhenkai Zhu, Chris Gorgolewski, Peter Grabowski, Yu Mao, Alberto Magni, Kaisheng Yao, Javier Snaider, Norman Casagrande, Paul Suganthan, Evan Palmer, Geoffrey Irving, Edward Loper, Manaal Faruqui, Isha Arkatkar, Nanxin Chen, Izhak Shafran, Michael Fink, Alfonso Castaño, Irene Giannoumis, Wooyeol Kim, Mikołaj Rybiński, Ashwin Sreevatsa, Jennifer Prendki, David Soergel, Adrian Goedeckemeyer, Willi Gierke, Mohsen Jafari, Meenu Gaba, Jeremy Wiesner, Diana Gage Wright, Yawen Wei, Harsha Vashisht, Yana Kulizhskaya, Jay Hoover, Maigo Le, Lu Li, Chimezie Iwuanyanwu, Lu Liu, Kevin Ramirez, Andrey Khorlin, Albert Cui, Tian Lin, Marin Georgiev, Marcus Wu, Ricardo Aguilar, Keith Pallo, Abhishek Chakladar, Alena Repina, Xihui Wu, Tom van der Weide, Priya Ponnapalli, Caroline Kaplan, Jiri Simsa, Shuangfeng Li, Olivier Dousse, Jeff Piper, Nathan Ie, Minnie Lui, Rama Pasumarthi, Nathan Lintz, Anitha Vijayakumar, Lam Nguyen Thiet, Daniel Andor, Pedro Valenzuela, Cosmin Paduraru, Daiyi Peng, Katherine Lee, Shuyuan Zhang, Somer Greene, Duc Dung Nguyen, Paula Kurylowicz, Sarmishta Velury, Sebastian Krause, Cassidy Hardin, Lucas Dixon, Lili Janzer, Kiam Choo, Ziqiang Feng, Biao Zhang, Achintya Singhal, Tejasi Latkar, Mingyang Zhang, Quoc Le, Elena Allica Abellan, Dayou Du, Dan McKinnon, Natasha Antropova, Tolga Bolukbasi, Orgad Keller, David Reid, Daniel Finchelstein, Maria Abi Raad, Remi Crocker, Peter Hawkins, Robert Dadashi, Colin Gaffney, Sid Lall, Ken Franko, Egor Filonov, Anna Bulanova, Rémi Leblond, Vikas Yadav, Shirley Chung, Harry Askham, Luis C. Cobo, Kelvin Xu, Felix Fischer, Jun Xu, Christina Sorokin, Chris Alberti, Chu-Cheng Lin, Colin Evans, Hao Zhou, Alek Dimitriev, Hannah Forbes, Dylan Banarse, Zora Tung, Jeremiah Liu, Mark Omernick, Colton Bishop, Chintu Kumar, Rachel Sterneck, Ryan Foley, Rohan Jain, Swaroop Mishra, Jiawei Xia, Taylor Bos, Geoffrey Cideron, Ehsan Amid, Francesco Piccinno, Xingyu Wang, Praseem Banzal, Petru Gurita, Hila Noga, Premal Shah, Daniel J. Mankowitz, Alex Polozov, Nate Kushman, Victoria Krakovna, Sasha Brown, Mohammadhossein Bateni, Dennis Duan, Vlad Firoiu, Meghana Thotakuri, Tom Natan, Anhad Mohananey, Matthieu Geist, Sidharth Mudgal, Sertan Girgin, Hui Li, Jiayu Ye, Ofir Roval, Reiko Tojo, Michael Kwong, James Lee-Thorp, Christopher Yew, Quan Yuan, Sumit Bagri, Danila Sinopalnikov, Sabela Ramos, John Mellor, Abhishek Sharma, Aliaksei Severyn, Jonathan Lai, Kathy Wu, Heng-Tze Cheng, David Miller, Nicolas Sonnerat, Denis Vnukov, Rory Greig, Jennifer Beattie, Emily Caveness, Libin Bai, Julian Eisenschlos, Alex Korchemniy, Tomy Tsai, Mimi Jasarevic, Weize Kong, Phuong Dao, Zeyu Zheng, Frederick Liu, Fan Yang, Rui Zhu, Mark Geller, Tian Huey Teh, Jason Sanmiya, Evgeny Gladchenko, Nejc Trdin, Andrei Sozanschi, Daniel Toyama, Evan Rosen, Sasan Tavakkol, Linting Xue, Chen Elkind, Oliver Woodman, John Carpenter, George Papamakarios, Rupert Kemp, Sushant Kafle, Tanya Grunina, Rishika Sinha, Alice Talbert, Abhimanyu Goyal, Diane Wu, Denese Owusu-Afriyie, Cosmo Du, Chloe Thornton, Jordi Pont-Tuset, Pradyumna Narayana, Jing Li, Sabaer Fatehi, John Wieting, Omar Ajmeri, Benigno Uria, Tao Zhu, Yeongil Ko, Laura Knight, Amélie Héliou, Ning Niu, Shane Gu, Chenxi Pang, Dustin Tran, Yeqing Li, Nir Levine, Ariel Stolovich, Norbert Kalb, Rebeca Santamaria-Fernandez, Sonam Goenka, Wenny Yustalim, Robin Strudel, Ali Elqursh, Balaji Lakshminarayanan, Charlie Deck, Shyam Upadhyay, Hyo Lee, Mike Dusenberry, Zonglin Li, Xuezhi Wang, Kyle Levin, Raphael Hoffmann, Dan Holtmann-Rice, Olivier Bachem, Summer Yue, Sho Arora, Eric Malmi, Daniil Mirylenka, Qijun Tan, Christy Koh, Soheil Hassas Yeganeh, Siim Põder, Steven Zheng, Francesco Pongetti, Mukarram Tariq, Yanhua Sun, Lucian Ionita, Mojtaba Seyedhosseini, Pouya Tafti, Ragha Kotikalapudi, Zhiyu Liu, Anmol Gulati, Jasmine Liu, Xinyu Ye, Bart Chrzaszcz, Lily Wang, Nikhil Sethi, Tianrun Li, Ben Brown, Shreya Singh, Wei Fan, Aaron Parisi, Joe Stanton, Chenkai Kuang, Vinod Koverkathu, Christopher A. Choquette-Choo, Yunjie Li, TJ Lu, Abe Ittycheriah, Prakash Shroff, Pei Sun, Mani Varadarajan, Sanaz Bahargam, Rob Willoughby, David Gaddy, Ishita Dasgupta, Guillaume Desjardins, Marco Cornero, Brona Robenek, Bhavishya Mittal, Ben Albrecht, Ashish Shenoy, Fedor Moiseev, Henrik Jacobsson, Alireza Ghaffarkhah, Morgane Rivière, Alanna Walton, Clément Crepy, Alicia Parrish, YuAn Liu, Zongwei Zhou, Clement Farabet, Carey Radebaugh, Praveen Srinivasan, Claudia van der Salm, Andreas Fidjeland, Salvatore Scellato, Eri Latorre-Chimoto, Hanna Klimczak-Plucińska, David Bridson, Dario de Cesare, Tom Hudson, Piermaria Mendolicchio, Lexi Walker, Alex Morris, Ivo Penchev, Matthew Mauger, Alexey Guseynov, Alison Reid, Seth Odoom, Lucia Loher, Victor Cotruta, Madhavi Yenugula, Dominik Grewe, Anastasia Petrushkina, Tom Duerig, Antonio Sanchez, Steve Yadlowsky, Amy Shen, Amir Globerson, Adam Kurzrok, Lynette Webb, Sahil Dua, Dong Li, Preethi Lahoti, Surya Bhupatiraju, Dan Hurt, Haroon Qureshi, Ananth Agarwal, Tomer Shani, Matan Eyal, Anuj Khare, Shreyas Rammohan Belle, Lei Wang, Chetan Tekur, Mihir Sanjay Kale, Jinliang Wei, Ruoxin Sang, Brennan Saeta, Tyler Liechty, Yi Sun, Yao Zhao, Stephan Lee, Pandu Nayak, Doug Fritz, Manish Reddy Vuyyuru, John Aslanides, Nidhi Vyas, Martin Wicke, Xiao Ma, Taylan Bilal, Evgenii Eltyshev, Daniel Balle, Nina Martin, Hardie Cate, James Manyika, Keyvan Amiri, Yelin Kim, Xi Xiong, Kai Kang, Florian Luisier, Nilesh Tripuraneni, David Madras, Mandy Guo, Austin Waters, Oliver Wang, Joshua Ainslie, Jason Baldridge, Han Zhang, Garima Pruthi, Jakob Bauer, Feng Yang, Riham Mansour, Jason Gelman, Yang Xu, George Polovets, Ji Liu, Honglong Cai, Warren Chen, XiangHai Sheng, Emily Xue, Sherjil Ozair, Adams Yu, Christof Angermueller, Xiaowei Li, Weiren Wang, Julia Wiesinger, Emmanouil Koukoumidis, Yuan Tian, Anand Iyer, Madhu Gurumurthy, Mark Goldenson, Parashar Shah, MK Blake, Hongkun Yu, Anthony Urbanowicz, Jennimaria Palomaki, Chrisantha Fernando, Kevin Brooks, Ken Durden, Harsh Mehta, Nikola Momchev, Elahe Rahimtoroghi, Maria Georgaki, Amit Raul, Sebastian Ruder, Morgan Redshaw, Jinhyuk Lee, Komal Jalan, Dinghua Li, Ginger Perng, Blake Hechtman, Parker Schuh, Milad Nasr, Mia Chen, Kieran Milan, Vladimir Mikulik, Trevor Strohman, Juliana Franco, Tim Green, Demis Hassabis, Koray Kavukcuoglu, Jeffrey Dean, Oriol Vinyals

This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding.

Arithmetic Reasoning Code Generation +3

VideoLCM: Video Latent Consistency Model

2 code implementations14 Dec 2023 Xiang Wang, Shiwei Zhang, Han Zhang, Yu Liu, Yingya Zhang, Changxin Gao, Nong Sang

Consistency models have demonstrated powerful capability in efficient image generation and allowed synthesis within a few sampling steps, alleviating the high computational cost in diffusion models.

Computational Efficiency Image Generation +1

An Incentive Mechanism for Federated Learning Based on Multiple Resource Exchange

no code implementations13 Dec 2023 Ruonan Dong, Hui Xu, Han Zhang, Guopeng Zhang

We formulate the interaction between MO and DOs as an optimization problem, and the objective is to effectively utilize the communication and computing resource of the MO and DOs to minimize the time to complete an FL task.

Federated Learning

CCM: Adding Conditional Controls to Text-to-Image Consistency Models

no code implementations12 Dec 2023 Jie Xiao, Kai Zhu, Han Zhang, Zhiheng Liu, Yujun Shen, Yu Liu, Xueyang Fu, Zheng-Jun Zha

Consistency Models (CMs) have showed a promise in creating visual content efficiently and with high quality.

GFS: Graph-based Feature Synthesis for Prediction over Relational Databases

no code implementations4 Dec 2023 Han Zhang, Quan Gan, David Wipf, Weinan Zhang

Consequently, the prevalent approach for training machine learning models on data stored in relational databases involves performing feature engineering to merge the data from multiple tables into a single table and subsequently applying single table models.

Feature Engineering Inductive Bias

COPR: Continual Learning Human Preference through Optimal Policy Regularization

no code implementations24 Oct 2023 Han Zhang, Lin Gui, Yuanzhao Zhai, Hui Wang, Yu Lei, Ruifeng Xu

The technique of Reinforcement Learning from Human Feedback (RLHF) is a commonly employed method to improve pre-trained Language Models (LM), enhancing their ability to conform to human preferences.

Continual Learning reinforcement-learning

BatteryML:An Open-source platform for Machine Learning on Battery Degradation

1 code implementation23 Oct 2023 Han Zhang, Xiaofan Gui, Shun Zheng, Ziheng Lu, Yuqi Li, Jiang Bian

Battery degradation remains a pivotal concern in the energy storage domain, with machine learning emerging as a potent tool to drive forward insights and solutions.

Towards Foundation Models for Learning on Tabular Data

no code implementations11 Oct 2023 Han Zhang, Xumeng Wen, Shun Zheng, Wei Xu, Jiang Bian

Despite considerable efforts in developing effective learning models for tabular data, current transferable tabular models remain in their infancy, limited by either the lack of support for direct instruction following in new tasks or the neglect of acquiring foundational knowledge and capabilities from diverse tabular datasets.

Instruction Following Language Modelling +1

Accurate battery lifetime prediction across diverse aging conditions with deep learning

no code implementations8 Oct 2023 Han Zhang, Yuqi Li, Shun Zheng, Ziheng Lu, Xiaofan Gui, Wei Xu, Jiang Bian

Here we introduce a universal deep learning approach that is capable of accommodating various aging conditions and facilitating effective learning under low-resource conditions by leveraging data from rich conditions.

Single-cell modeling

Leveraging Unpaired Data for Vision-Language Generative Models via Cycle Consistency

no code implementations5 Oct 2023 Tianhong Li, Sangnie Bhardwaj, Yonglong Tian, Han Zhang, Jarred Barber, Dina Katabi, Guillaume Lajoie, Huiwen Chang, Dilip Krishnan

We demonstrate image generation and captioning performance on par with state-of-the-art text-to-image and image-to-text models with orders of magnitude fewer (only 3M) paired image-text data.

Text-to-Image Generation

Deformation-Invariant Neural Network and Its Applications in Distorted Image Restoration and Analysis

no code implementations4 Oct 2023 Han Zhang, Qiguang Chen, Lok Ming Lui

The QCTN is a deep neural network that outputs a quasiconformal map, which can be used to transform a geometrically distorted image into an improved version that is closer to the distribution of natural or good images.

Image Classification Image Restoration +1

Chat2Brain: A Method for Mapping Open-Ended Semantic Queries to Brain Activation Maps

no code implementations10 Sep 2023 Yaonai Wei, Tuo Zhang, Han Zhang, Tianyang Zhong, Lin Zhao, Zhengliang Liu, Chong Ma, Songyao Zhang, Muheng Shang, Lei Du, Xiao Li, Tianming Liu, Junwei Han

In this study, we propose a method called Chat2Brain that combines LLMs to basic text-2-image model, known as Text2Brain, to map open-ended semantic queries to brain activation maps in data-scarce and complex query environments.

Price-Discrimination Game for Distributed Resource Management in Federated Learning

no code implementations26 Aug 2023 Han Zhang, Halvin Yang, Guopeng Zhang

In vanilla federated learning (FL) such as FedAvg, the parameter server (PS) and multiple distributed clients can form a typical buyer's market, where the number of PS/buyers of FL services is far less than the number of clients/sellers.

Federated Learning Management

Differentiable Retrieval Augmentation via Generative Language Modeling for E-commerce Query Intent Classification

no code implementations18 Aug 2023 Chenyu Zhao, Yunjiang Jiang, Yiming Qiu, Han Zhang, Wen-Yun Yang

Retrieval augmentation, which enhances downstream models by a knowledge retriever and an external corpus instead of by merely increasing the number of model parameters, has been successfully applied to many natural language processing (NLP) tasks such as text classification, question answering and so on.

intent-classification Intent Classification +5

Errors Dynamics in Affine Group Systems

no code implementations31 Jul 2023 Xinghan Li, Jianqi Chen, Han Zhang, Jieqiang Wei, Junfeng Wu

In this paper, we focus on the error dynamics analysis for an affine group system under external disturbances or random noises.

NEON: Living Needs Prediction System in Meituan

no code implementations31 Jul 2023 Xiaochong Lan, Chen Gao, Shiqi Wen, Xiuqi Chen, Yingge Che, Han Zhang, Huazhou Wei, Hengliang Luo, Yong Li

To address these two challenges, we design a system of living NEeds predictiON named NEON, consisting of three phases: feature mining, feature fusion, and multi-task prediction.

ArrayBot: Reinforcement Learning for Generalizable Distributed Manipulation through Touch

no code implementations29 Jun 2023 Zhengrong Xue, Han Zhang, Jingwen Cheng, Zhengmao He, Yuanchen Ju, Changyi Lin, Gu Zhang, Huazhe Xu

We present ArrayBot, a distributed manipulation system consisting of a $16 \times 16$ array of vertically sliding pillars integrated with tactile sensors, which can simultaneously support, perceive, and manipulate the tabletop objects.

reinforcement-learning Reinforcement Learning (RL)

Eliminating Lipschitz Singularities in Diffusion Models

no code implementations20 Jun 2023 Zhantao Yang, Ruili Feng, Han Zhang, Yujun Shen, Kai Zhu, Lianghua Huang, Yifei Zhang, Yu Liu, Deli Zhao, Jingren Zhou, Fan Cheng

Diffusion models, which employ stochastic differential equations to sample images through integrals, have emerged as a dominant class of generative models.

Learning Disentangled Prompts for Compositional Image Synthesis

no code implementations1 Jun 2023 Kihyuk Sohn, Albert Shaw, Yuan Hao, Han Zhang, Luisa Polania, Huiwen Chang, Lu Jiang, Irfan Essa

We study domain-adaptive image synthesis, the problem of teaching pretrained image generative models a new style or concept from as few as one image to synthesize novel images, to better understand the compositional image synthesis.

Domain Adaptation Image Generation +1

Diversify Your Vision Datasets with Automatic Diffusion-Based Augmentation

1 code implementation NeurIPS 2023 Lisa Dunlap, Alyssa Umino, Han Zhang, Jiezhi Yang, Joseph E. Gonzalez, Trevor Darrell

As such, we explore how natural language descriptions of the domains seen in training data can be used with large vision models trained on diverse pretraining datasets to generate useful variations of the training data.

Domain Generalization Image Augmentation

Mining fMRI Dynamics with Parcellation Prior for Brain Disease Diagnosis

no code implementations4 May 2023 Xiaozhao Liu, Mianxin Liu, Lang Mei, Yuyao Zhang, Feng Shi, Han Zhang, Dinggang Shen

To characterize atypical brain dynamics under diseases, prevalent studies investigate functional magnetic resonance imaging (fMRI).

Graph Learning Multiple Instance Learning

Spikingformer: Spike-driven Residual Learning for Transformer-based Spiking Neural Network

1 code implementation24 Apr 2023 Chenlin Zhou, Liutao Yu, Zhaokun Zhou, Zhengyu Ma, Han Zhang, Huihui Zhou, Yonghong Tian

Based on this residual design, we develop Spikingformer, a pure transformer-based spiking neural network.

Identity Encoder for Personalized Diffusion

no code implementations14 Apr 2023 Yu-Chuan Su, Kelvin C. K. Chan, Yandong Li, Yang Zhao, Han Zhang, Boqing Gong, Huisheng Wang, Xuhui Jia

Our approach greatly reduces the overhead for personalized image generation and is more applicable in many potential applications.

Image Enhancement Image Generation

SVDiff: Compact Parameter Space for Diffusion Fine-Tuning

1 code implementation ICCV 2023 Ligong Han, Yinxiao Li, Han Zhang, Peyman Milanfar, Dimitris Metaxas, Feng Yang

Diffusion models have achieved remarkable success in text-to-image generation, enabling the creation of high-quality images from text prompts or other modalities.

Data Augmentation Efficient Diffusion Personalization +1

Delayed and Indirect Impacts of Link Recommendations

no code implementations17 Mar 2023 Han Zhang, Shangen Lu, Yixin Wang, Mihaela Curmei

Moreover, we show that the effects of recommendations can persist in networks, in part due to their indirect impacts on natural dynamics even after recommendations are turned off.

counterfactual

Steering Prototypes with Prompt-tuning for Rehearsal-free Continual Learning

2 code implementations16 Mar 2023 Zhuowei Li, Long Zhao, Zizhao Zhang, Han Zhang, Di Liu, Ting Liu, Dimitris N. Metaxas

In the context of continual learning, prototypes-as representative class embeddings-offer advantages in memory conservation and the mitigation of catastrophic forgetting.

Class Incremental Learning Contrastive Learning +1

Streaming Kernel PCA Algorithm With Small Space

no code implementations8 Mar 2023 Yichuan Deng, Zhao Song, Zifan Wang, Han Zhang

The kernel method, which is commonly used in learning algorithms such as Support Vector Machines (SVMs), has also been applied in PCA algorithms.

StraIT: Non-autoregressive Generation with Stratified Image Transformer

no code implementations1 Mar 2023 Shengju Qian, Huiwen Chang, Yuanzhen Li, Zizhao Zhang, Jiaya Jia, Han Zhang

We propose Stratified Image Transformer(StraIT), a pure non-autoregressive(NAR) generative model that demonstrates superiority in high-quality image synthesis over existing autoregressive(AR) and diffusion models(DMs).

Image Generation

VQ3D: Learning a 3D-Aware Generative Model on ImageNet

no code implementations ICCV 2023 Kyle Sargent, Jing Yu Koh, Han Zhang, Huiwen Chang, Charles Herrmann, Pratul Srinivasan, Jiajun Wu, Deqing Sun

Recent work has shown the possibility of training generative models of 3D content from 2D image collections on small datasets corresponding to a single object class, such as human faces, animal faces, or cars.

Position

Evolution of Filter Bubbles and Polarization in News Recommendation

no code implementations26 Jan 2023 Han Zhang, Ziwei Zhu, James Caverlee

However, most existing work focuses on a static setting or over a short-time window, leaving open questions about the long-term and dynamic impacts of news recommendations.

News Recommendation Recommendation Systems

Muse: Text-To-Image Generation via Masked Generative Transformers

4 code implementations2 Jan 2023 Huiwen Chang, Han Zhang, Jarred Barber, AJ Maschinot, Jose Lezama, Lu Jiang, Ming-Hsuan Yang, Kevin Murphy, William T. Freeman, Michael Rubinstein, Yuanzhen Li, Dilip Krishnan

Compared to pixel-space diffusion models, such as Imagen and DALL-E 2, Muse is significantly more efficient due to the use of discrete tokens and requiring fewer sampling iterations; compared to autoregressive models, such as Parti, Muse is more efficient due to the use of parallel decoding.

Language Modelling Large Language Model +1

Enhanced Training of Query-Based Object Detection via Selective Query Recollection

2 code implementations CVPR 2023 Fangyi Chen, Han Zhang, Kai Hu, Yu-Kai Huang, Chenchen Zhu, Marios Savvides

This paper investigates a phenomenon where query-based object detectors mispredict at the last decoding stage while predicting correctly at an intermediate stage.

Attribute Object +2

FineDance: A Fine-grained Choreography Dataset for 3D Full Body Dance Generation

1 code implementation ICCV 2023 Ronghui Li, Junfan Zhao, Yachao Zhang, Mingyang Su, Zeping Ren, Han Zhang, Yansong Tang, Xiu Li

To address these problems, we propose FineDance, which contains 14. 6 hours of music-dance paired data, with fine-grained hand motions, fine-grained genres (22 dance genres), and accurate posture.

Motion Synthesis Retrieval

Dimensionality-Varying Diffusion Process

no code implementations CVPR 2023 Han Zhang, Ruili Feng, Zhantao Yang, Lianghua Huang, Yu Liu, Yifei Zhang, Yujun Shen, Deli Zhao, Jingren Zhou, Fan Cheng

Diffusion models, which learn to reverse a signal destruction process to generate new data, typically require the signal at each step to have the same dimension.

Image Generation

Cost Splitting for Multi-Objective Conflict-Based Search

no code implementations23 Nov 2022 Cheng Ge, Han Zhang, Jiaoyang Li, Sven Koenig

Our theoretical results show that, when combined with either of these two new splitting strategies, MO-CBS maintains its completeness and optimality guarantees.

Multi-Agent Path Finding

Using Language to Extend to Unseen Domains

1 code implementation18 Oct 2022 Lisa Dunlap, Clara Mohri, Devin Guillory, Han Zhang, Trevor Darrell, Joseph E. Gonzalez, aditi raghunathan, Anja Rohrbach

It is expensive to collect training data for every possible domain that a vision model may encounter when deployed.

Domain Adaptation

Topology-Preserving Segmentation Network

no code implementations7 Oct 2022 Han Zhang, Lok Ming Lui

Comparing to the segmentation framework based on pixel-wise classification, deformation-based segmentation models that warp a template to enclose the regions are more convenient to enforce geometric constraints.

Image Segmentation Medical Image Segmentation +2

Towards Semi-automatic Detection and Localization of Indoor Accessibility Issues using Mobile Depth Scanning and Computer Vision

no code implementations5 Oct 2022 Xia Su, Kaiming Cheng, Han Zhang, Jaewook Lee, Jon E. Froehlich

To help improve the safety and accessibility of indoor spaces, researchers and health professionals have created assessment instruments that enable homeowners and trained experts to audit and improve homes.

Visual Prompt Tuning for Generative Transfer Learning

1 code implementation CVPR 2023 Kihyuk Sohn, Yuan Hao, José Lezama, Luisa Polania, Huiwen Chang, Han Zhang, Irfan Essa, Lu Jiang

We base our framework on state-of-the-art generative vision transformers that represent an image as a sequence of visual tokens to the autoregressive or non-autoregressive transformers.

Image Generation Transfer Learning +1

Conv-Adapter: Exploring Parameter Efficient Transfer Learning for ConvNets

no code implementations15 Aug 2022 Hao Chen, Ran Tao, Han Zhang, Yidong Wang, Wei Ye, Jindong Wang, Guosheng Hu, Marios Savvides

Beyond classification, Conv-Adapter can generalize to detection and segmentation tasks with more than 50% reduction of parameters but comparable performance to the traditional full fine-tuning.

Transfer Learning

Pre-training Tasks for User Intent Detection and Embedding Retrieval in E-commerce Search

1 code implementation12 Aug 2022 Yiming Qiu, Chenyu Zhao, Han Zhang, Jingwei Zhuo, TianHao Li, Xiaowei Zhang, Songlin Wang, Sulong Xu, Bo Long, Wen-Yun Yang

BERT-style models pre-trained on the general corpus (e. g., Wikipedia) and fine-tuned on specific task corpus, have recently emerged as breakthrough techniques in many NLP tasks: question answering, text classification, sequence labeling and so on.

Intent Detection Question Answering +3

Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

2 code implementations22 Jun 2022 Jiahui Yu, Yuanzhong Xu, Jing Yu Koh, Thang Luong, Gunjan Baid, ZiRui Wang, Vijay Vasudevan, Alexander Ku, Yinfei Yang, Burcu Karagol Ayan, Ben Hutchinson, Wei Han, Zarana Parekh, Xin Li, Han Zhang, Jason Baldridge, Yonghui Wu

We present the Pathways Autoregressive Text-to-Image (Parti) model, which generates high-fidelity photorealistic images and supports content-rich synthesis involving complex compositions and world knowledge.

Machine Translation Text-to-Image Generation +1

Faithful Explanations for Deep Graph Models

no code implementations24 May 2022 Zifan Wang, Yuhang Yao, Chaoran Zhang, Han Zhang, Youjie Kang, Carlee Joe-Wong, Matt Fredrikson, Anupam Datta

Second, our analytical and empirical results demonstrate that feature attribution methods cannot capture the nonlinear effect of edge features, while existing subgraph explanation methods are not faithful.

Anomaly Detection

Dimension Reduction for Efficient Dense Retrieval via Conditional Autoencoder

1 code implementation6 May 2022 Zhenghao Liu, Han Zhang, Chenyan Xiong, Zhiyuan Liu, Yu Gu, Xiaohua LI

These embeddings need to be high-dimensional to fit training signals and guarantee the retrieval effectiveness of dense retrievers.

Dimensionality Reduction Information Retrieval +1

Powering Finetuning in Few-Shot Learning: Domain-Agnostic Bias Reduction with Selected Sampling

no code implementations7 Apr 2022 Ran Tao, Han Zhang, Yutong Zheng, Marios Savvides

Class-agnostic bias is defined as the distribution shifting introduced by domain difference, which we propose Distribution Calibration Module(DCM) to reduce.

Few-Shot Learning

MaxViT: Multi-Axis Vision Transformer

14 code implementations4 Apr 2022 Zhengzhong Tu, Hossein Talebi, Han Zhang, Feng Yang, Peyman Milanfar, Alan Bovik, Yinxiao Li

We also show that our proposed model expresses strong generative modeling capability on ImageNet, demonstrating the superior potential of MaxViT blocks as a universal vision module.

Image Classification object-detection +1

Unitail: Detecting, Reading, and Matching in Retail Scene

no code implementations1 Apr 2022 Fangyi Chen, Han Zhang, Zaiwang Li, Jiachen Dou, Shentong Mo, Hao Chen, Yongxin Zhang, Uzair Ahmed, Chenchen Zhu, Marios Savvides

To make full use of computer vision technology in stores, it is required to consider the actual needs that fit the characteristics of the retail scene.

Benchmarking Dense Object Detection +2

Givens Coordinate Descent Methods for Rotation Matrix Learning in Trainable Embedding Indexes

no code implementations ICLR 2022 Yunjiang Jiang, Han Zhang, Yiming Qiu, Yun Xiao, Bo Long, Wen-Yun Yang

Product quantization (PQ) coupled with a space rotation, is widely used in modern approximate nearest neighbor (ANN) search systems to significantly compress the disk storage for embeddings and speed up the inner product computation.

Quantization

Topology-Preserving Segmentation Network: A Deep Learning Segmentation Framework for Connected Component

no code implementations27 Feb 2022 Han Zhang, Lok Ming Lui

TPSN is a deformation-based model that yields a deformation map through a UNet, which takes the medical image and a template mask as inputs.

Image Segmentation Medical Image Segmentation +2

StyleBERT: Chinese pretraining by font style information

no code implementations21 Feb 2022 Chao Lv, Han Zhang, Xinkai Du, Yunhao Zhang, Ying Huang, Wenhao Li, Jia Han, Shanshan Gu

With the success of down streaming task using English pre-trained language model, the pre-trained Chinese language model is also necessary to get a better performance of Chinese NLP task.

Language Modelling

MaskGIT: Masked Generative Image Transformer

6 code implementations CVPR 2022 Huiwen Chang, Han Zhang, Lu Jiang, Ce Liu, William T. Freeman

At inference time, the model begins with generating all tokens of an image simultaneously, and then refines the image iteratively conditioned on the previous generation.

Image Manipulation Image Outpainting +1

MAXIM: Multi-Axis MLP for Image Processing

1 code implementation CVPR 2022 Zhengzhong Tu, Hossein Talebi, Han Zhang, Feng Yang, Peyman Milanfar, Alan Bovik, Yinxiao Li

In this work, we present a multi-axis MLP based architecture called MAXIM, that can serve as an efficient and flexible general-purpose vision backbone for image processing tasks.

 Ranked #1 on Deblurring on RealBlur-J (using extra training data)

Deblurring Image Deblurring +6

ERNIE-ViLG: Unified Generative Pre-training for Bidirectional Vision-Language Generation

2 code implementations31 Dec 2021 Han Zhang, Weichong Yin, Yewei Fang, Lanxin Li, Boqiang Duan, Zhihua Wu, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

To explore the landscape of large-scale pre-training for bidirectional text-image generation, we train a 10-billion parameter ERNIE-ViLG model on a large-scale dataset of 145 million (Chinese) image-text pairs which achieves state-of-the-art performance for both text-to-image and image-to-text tasks, obtaining an FID of 7. 9 on MS-COCO for text-to-image synthesis and best results on COCO-CN and AIC-ICC for image captioning.

Image Captioning Quantization +2

Learning to Prompt for Continual Learning

4 code implementations CVPR 2022 Zifeng Wang, Zizhao Zhang, Chen-Yu Lee, Han Zhang, Ruoxi Sun, Xiaoqi Ren, Guolong Su, Vincent Perot, Jennifer Dy, Tomas Pfister

The mainstream paradigm behind continual learning has been to adapt the model parameters to non-stationary data distributions, where catastrophic forgetting is the central challenge.

Class Incremental Learning Image Classification

Temporal Transformer Networks with Self-Supervision for Action Recognition

no code implementations14 Dec 2021 Yongkang Zhang, Jun Li, Guoming Wu, Han Zhang, Zhiping Shi, Zhaoxun Liu, Zizhang Wu, Na Jiang

The temporal sequence self-supervision module we employ unprecedentedly adopts the streamlined strategy of "random batch random channel" to reverse the sequence of video frames, allowing robust extractions of motion information representation from inversed temporal dimensions and improving the generalization capability of the model.

Action Recognition Temporal Action Localization

BLT: Bidirectional Layout Transformer for Controllable Layout Generation

1 code implementation9 Dec 2021 Xiang Kong, Lu Jiang, Huiwen Chang, Han Zhang, Yuan Hao, Haifeng Gong, Irfan Essa

During inference, BLT first generates a draft layout from the input and then iteratively refines it into a high-quality layout by masking out low-confident attributes.

You Ought to Look Around: Precise, Large Span Action Detection

no code implementations 25th International Conference on Pattern Recognition (ICPR) 2021 Ge Pan, Han Zhang, Fan Yu, Yonghong Song, Yuanlin Zhang, Han Yuan

In this paper, we propose a method called YOLA (You Ought to Look Around) which includes three parts: 1) a robust backbone SPN-I3D for extracting spatio-temporal features.

Action Detection Action Localization

Vector-quantized Image Modeling with Improved VQGAN

5 code implementations ICLR 2022 Jiahui Yu, Xin Li, Jing Yu Koh, Han Zhang, Ruoming Pang, James Qin, Alexander Ku, Yuanzhong Xu, Jason Baldridge, Yonghui Wu

Motivated by this success, we explore a Vector-quantized Image Modeling (VIM) approach that involves pretraining a Transformer to predict rasterized image tokens autoregressively.

Image Generation Representation Learning +1

DeepAID: Interpreting and Improving Deep Learning-based Anomaly Detection in Security Applications

1 code implementation23 Sep 2021 Dongqi Han, Zhiliang Wang, Wenqi Chen, Ying Zhong, Su Wang, Han Zhang, Jiahai Yang, Xingang Shi, Xia Yin

Experimental results show that DeepAID can provide high-quality interpretations for unsupervised DL models while meeting the special requirements of security domains.

Anomaly Detection

Deep Image Synthesis from Intuitive User Input: A Review and Perspectives

no code implementations9 Jul 2021 Yuan Xue, Yuan-Chen Guo, Han Zhang, Tao Xu, Song-Hai Zhang, Xiaolei Huang

In many applications of computer graphics, art and design, it is desirable for a user to provide intuitive non-image input, such as text, sketch, stroke, graph or layout, and have a computer system automatically generate photo-realistic images that adhere to the input content.

Image Generation Image Retrieval +1

ViTGAN: Training GANs with Vision Transformers

3 code implementations ICLR 2022 Kwonjoon Lee, Huiwen Chang, Lu Jiang, Han Zhang, Zhuowen Tu, Ce Liu

Recently, Vision Transformers (ViTs) have shown competitive performance on image recognition while requiring less vision-specific inductive biases.

Image Generation

SearchGCN: Powering Embedding Retrieval by Graph Convolution Networks for E-Commerce Search

no code implementations1 Jul 2021 Xinlin Xia, Shang Wang, Han Zhang, Songlin Wang, Sulong Xu, Yun Xiao, Bo Long, Wen-Yun Yang

Graph convolution networks (GCN), which recently becomes new state-of-the-art method for graph node classification, recommendation and other applications, has not been successfully applied to industrial-scale search engine yet.

Node Classification Retrieval

Improved Transformer for High-Resolution GANs

1 code implementation NeurIPS 2021 Long Zhao, Zizhao Zhang, Ting Chen, Dimitris N. Metaxas, Han Zhang

Attention-based models, exemplified by the Transformer, can effectively model long range dependency, but suffer from the quadratic complexity of self-attention operation, making them difficult to be adopted for high-resolution image generation based on Generative Adversarial Networks (GANs).

Ranked #2 on Image Generation on CelebA 256x256 (FID metric)

Image Generation Vocal Bursts Intensity Prediction

Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding

6 code implementations26 May 2021 Zizhao Zhang, Han Zhang, Long Zhao, Ting Chen, Sercan O. Arik, Tomas Pfister

Hierarchical structures are popular in recent vision transformers, however, they require sophisticated designs and massive datasets to work well.

Image Classification Image Generation

Joint Learning of Deep Retrieval Model and Product Quantization based Embedding Index

1 code implementation9 May 2021 Han Zhang, Hongwei Shen, Yiming Qiu, Yunjiang Jiang, Songlin Wang, Sulong Xu, Yun Xiao, Bo Long, Wen-Yun Yang

Embedding index that enables fast approximate nearest neighbor(ANN) search, serves as an indispensable component for state-of-the-art deep retrieval systems.

Quantization Retrieval

Learning Hamiltonian dynamics by reservoir computer

no code implementations24 Apr 2021 Han Zhang, Huawei Fan, Liang Wang, Xingang Wang

Reconstructing the KAM dynamics diagram of Hamiltonian system from the time series of a limited number of parameters is an outstanding question in nonlinear science, especially when the Hamiltonian governing the system dynamics are unknown.

Time Series Time Series Analysis

Transfer training from smaller language model

no code implementations23 Apr 2021 Han Zhang

We initialize a larger target model from a smaller source model by copy weight values from source model and padding with zeros or small initialization values on it to make the source and target model have approximate outputs, which is valid due to block matrix multiplication and residual connection in transformer structure.

Language Modelling Large Language Model +1

Revisiting Hierarchical Approach for Persistent Long-Term Video Prediction

1 code implementation ICLR 2021 Wonkwang Lee, Whie Jung, Han Zhang, Ting Chen, Jing Yu Koh, Thomas Huang, Hyungsuk Yoon, Honglak Lee, Seunghoon Hong

Despite the recent advances in the literature, existing approaches are limited to moderately short-term prediction (less than a few seconds), while extrapolating it to a longer future quickly leads to destruction in structure and content.

Translation Video Prediction

A Multiscale Graph Convolutional Network for Change Detection in Homogeneous and Heterogeneous Remote Sensing Images

no code implementations16 Feb 2021 Junzheng Wu, Biao Li, Yao Qin, Weiping Ni, Han Zhang, Yuli Sun

In this paper, a novel CD method based on the graph convolutional network (GCN) and multiscale object-based technique is proposed for both homogeneous and heterogeneous images.

Change Detection

Modeling Heterogeneous Statistical Patterns in High-dimensional Data by Adversarial Distributions: An Unsupervised Generative Framework

1 code implementation15 Dec 2020 Han Zhang, Wenhao Zheng, Charley Chen, Kevin Gao, Yao Hu, Ling Huang, Wei Xu

Meanwhile, such applications usually require modeling the intrinsic clusters in high-dimensional data, which usually displays heterogeneous statistical patterns as the patterns of different clusters may appear in different dimensions.

Anomaly Detection Fraud Detection

Transfer learning of chaotic systems

no code implementations15 Nov 2020 Yali Guo, Han Zhang, Liang Wang, Huawei Fan, Xingang Wang

Here we investigate transfer learning of chaotic systems from the perspective of synchronization-based state inference, in which a reservoir computer trained by chaotic system A is used to infer the unmeasured variables of chaotic system B, while A is different from B in either parameter or dynamics.

General Knowledge Time Series +2

Co-evolution of Functional Brain Network at Multiple Scales during Early Infancy

no code implementations15 Sep 2020 Xuyun Wen, Liming Hsu, Weili Lin, Han Zhang, Dinggang Shen

By applying our proposed methodological framework on the collected longitudinal infant dataset, we provided the first evidence that, in the first 2 years of life, the brain functional network is co-evolved at different scales, where each scale displays the unique reconfiguration pattern in terms of modular organization.

GloDyNE: Global Topology Preserving Dynamic Network Embedding

2 code implementations5 Aug 2020 Chengbin Hou, Han Zhang, Shan He, Ke Tang

The main and common objective of Dynamic Network Embedding (DNE) is to efficiently update node embeddings while preserving network topology at each time step.

Graph Reconstruction Incremental Learning +1

Improving NER's Performance with Massive financial corpus

1 code implementation31 Jul 2020 Han Zhang

Training large deep neural networks needs massive high quality annotation data, but the time and labor costs are too expensive for small business.

Language Modelling

From Spectrum Wavelet to Vertex Propagation: Graph Convolutional Networks Based on Taylor Approximation

no code implementations1 Jul 2020 Songyang Zhang, Han Zhang, Shuguang Cui, Zhi Ding

Graph convolutional networks (GCN) have been recently utilized to extract the underlying structures of datasets with some labeled data and high-dimensional features.

Node Classification

A Hybrid Evolutionary Algorithm for Reliable Facility Location Problem

no code implementations27 Jun 2020 Han Zhang, Jialin Liu, Xin Yao

The reliable facility location problem (RFLP) is an important research topic of operational research and plays a vital role in the decision-making and management of modern supply chain and logistics.

Decision Making Management

Image Augmentations for GAN Training

no code implementations4 Jun 2020 Zhengli Zhao, Zizhao Zhang, Ting Chen, Sameer Singh, Han Zhang

We provide new state-of-the-art results for conditional generation on CIFAR-10 with both consistency loss and contrastive loss as additional regularizations.

Image Augmentation Image Generation

Towards Personalized and Semantic Retrieval: An End-to-End Solution for E-commerce Search via Embedding Learning

no code implementations3 Jun 2020 Han Zhang, Songlin Wang, Kang Zhang, Zhiling Tang, Yunjiang Jiang, Yun Xiao, Weipeng Yan, Wen-Yun Yang

Two critical challenges stay in today's e-commerce search: how to retrieve items that are semantically relevant but not exact matching to query terms, and how to retrieve items that are more personalized to different users for the same search query.

Retrieval Semantic Retrieval

A Simple Semi-Supervised Learning Framework for Object Detection

7 code implementations10 May 2020 Kihyuk Sohn, Zizhao Zhang, Chun-Liang Li, Han Zhang, Chen-Yu Lee, Tomas Pfister

Semi-supervised learning (SSL) has a potential to improve the predictive performance of machine learning models using unlabeled data.

Ranked #13 on Semi-Supervised Object Detection on COCO 100% labeled data (using extra training data)

Data Augmentation Image Classification +4

ReMixMatch: Semi-Supervised Learning with Distribution Matching and Augmentation Anchoring

1 code implementation ICLR 2020 David Berthelot, Nicholas Carlini, Ekin D. Cubuk, Alex Kurakin, Kihyuk Sohn, Han Zhang, Colin Raffel

We improve the recently-proposed ``MixMatch semi-supervised learning algorithm by introducing two new techniques: distribution alignment and augmentation anchoring.

Solving Missing-Annotation Object Detection with Background Recalibration Loss

2 code implementations12 Feb 2020 Han Zhang, Fangyi Chen, Zhiqiang Shen, Qiqi Hao, Chenchen Zhu, Marios Savvides

In this paper, we introduce a superior solution called Background Recalibration Loss (BRL) that can automatically re-calibrate the loss signals according to the pre-defined IoU threshold and input image.

Object object-detection +1

Improved Consistency Regularization for GANs

no code implementations11 Feb 2020 Zhengli Zhao, Sameer Singh, Honglak Lee, Zizhao Zhang, Augustus Odena, Han Zhang

Recent work has increased the performance of Generative Adversarial Networks (GANs) by enforcing a consistency cost on the discriminator.

Image Generation

ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation

4 code implementations26 Jan 2020 Dongling Xiao, Han Zhang, Yukun Li, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

Current pre-training works in natural language generation pay little attention to the problem of exposure bias on downstream tasks.

 Ranked #1 on Question Generation on SQuAD1.1 (using extra training data)

Abstractive Text Summarization Dialogue Generation +3

MANELA: A Multi-Agent Algorithm for Learning Network Embeddings

no code implementations1 Dec 2019 Han Zhang, Hong Xu

On the other hand, learning network embeddings on distributively stored networks still remained understudied: To the best of our knowledge, all existing algorithms for learning network embeddings have hitherto been exclusively centralized and thus cannot be applied to these networks.

BIG-bench Machine Learning Network Embedding

Multimodal, Multilingual Grapheme-to-Phoneme Conversion for Low-Resource Languages

no code implementations WS 2019 James Route, Steven Hillis, Isak Czeresnia Etinger, Han Zhang, Alan W. black

Grapheme-to-phoneme conversion (g2p) is the task of predicting the pronunciation of words from their orthographic representation.

Small-GAN: Speeding Up GAN Training Using Core-sets

no code implementations ICML 2020 Samarth Sinha, Han Zhang, Anirudh Goyal, Yoshua Bengio, Hugo Larochelle, Augustus Odena

Recent work by Brock et al. (2018) suggests that Generative Adversarial Networks (GANs) benefit disproportionately from large mini-batch sizes.

Active Learning Anomaly Detection +1

Differentiable Combinatorial Losses through Generalized Gradients of Linear Programs

no code implementations18 Oct 2019 Xi Gao, Han Zhang, Aliakbar Panahi, Tom Arodz

When samples have internal structure, we often see a mismatch between the objective optimized during training and the model's goal during inference.

Combinatorial Optimization Graph Matching +1

Distilling Effective Supervision from Severe Label Noise

2 code implementations CVPR 2020 Zizhao Zhang, Han Zhang, Sercan O. Arik, Honglak Lee, Tomas Pfister

For instance, on CIFAR100 with a $40\%$ uniform noise ratio and only 10 trusted labeled data per class, our method achieves $80. 2{\pm}0. 3\%$ classification accuracy, where the error rate is only $1. 4\%$ higher than a neural network trained without label noise.

Image Classification

Distributed Equivalent Substitution Training for Large-Scale Recommender Systems

no code implementations10 Sep 2019 Haidong Rong, Yangzihao Wang, Feihu Zhou, Junjie Zhai, Haiyang Wu, Rui Lan, Fan Li, Han Zhang, Yuekui Yang, Zhenyu Guo, Di Wang

We present Distributed Equivalent Substitution (DES) training, a novel distributed training framework for large-scale recommender systems with dynamic sparse features.

Recommendation Systems

Multiple instance dense connected convolution neural network for aerial image scene classification

no code implementations22 Aug 2019 Qi Bi, Kun Qin, Zhili Li, Han Zhang, Kai Xu

While the current convolution neural network tends to extract global features and global semantic information in a scene, the geo-spatial objects can be located at anywhere in an aerial image scene and their spatial arrangement tends to be more complicated.

General Classification Scene Classification

Building change detection based on multi-scale filtering and grid partition

no code implementations22 Aug 2019 Qi Bi, Kun Qin, Han Zhang, Wenjun Han, Zhili Li, Kai Xu

Exhaustive experiments indicate that the proposed method can detect building change types directly and outperform the current multi-index learning method.

Change Detection

Approximation Capabilities of Neural ODEs and Invertible Residual Networks

no code implementations ICML 2020 Han Zhang, Xi Gao, Jacob Unterman, Tom Arodz

Neural ODEs and i-ResNet are recently proposed methods for enforcing invertibility of residual neural models.

DynWalks: Global Topology and Recent Changes Awareness Dynamic Network Embedding

2 code implementations arXiv 2019 Chengbin Hou, Han Zhang, Ke Tang, Shan He

Dynamic network embedding aims to learn low dimensional embeddings for unseen and seen nodes by using any currently available snapshots of a dynamic network.

Graph Reconstruction Link Prediction +1

Brain Network Construction and Classification Toolbox (BrainNetClass)

1 code implementation17 Jun 2019 Zhen Zhou, Xiaobo Chen, Yu Zhang, Lishan Qiao, Renping Yu, Gang Pan, Han Zhang, Dinggang Shen

The goal of this work is to introduce a toolbox namely "Brain Network Construction and Classification" (BrainNetClass) to the field to promote more advanced brain network construction methods.

Classification General Classification

Deep Learning for Signal Demodulation in Physical Layer Wireless Communications: Prototype Platform, Open Dataset, and Analytics

no code implementations8 Mar 2019 Hongmei Wang, Zhenzhen Wu, Shuai Ma, Songtao Lu, Han Zhang, Guoru Ding, Shiyin Li

In this paper, we investigate deep learning (DL)-enabled signal demodulation methods and establish the first open dataset of real modulated signals for wireless communication systems.

Multi-Antenna Channel Interpolation via Tucker Decomposed Extreme Learning Machine

no code implementations26 Dec 2018 Han Zhang, Bo Ai, Wenjun Xu, Li Xu, Shuguang Cui

Channel interpolation is an essential technique for providing high-accuracy estimation of the channel state information (CSI) for wireless systems design where the frequency-space structural correlations of multi-antenna channel are typically hidden in matrix or tensor forms.

A Teacher-Student Framework for Maintainable Dialog Manager

no code implementations EMNLP 2018 Weikang Wang, Jiajun Zhang, Han Zhang, Mei-Yuh Hwang, Cheng-qing Zong, Zhifei Li

Specifically, the {``}student{''} is an extended dialog manager based on a new ontology, and the {``}teacher{''} is existing resources used for guiding the learning process of the {``}student{''}.

Reinforcement Learning (RL)

A Unified Mammogram Analysis Method via Hybrid Deep Supervision

1 code implementation31 Aug 2018 Rongzhao Zhang, Han Zhang, Albert C. S. Chung

In this work, we present a unified mammogram analysis framework for both whole-mammogram classification and segmentation.

Classification General Classification +3

Deep Chronnectome Learning via Full Bidirectional Long Short-Term Memory Networks for MCI Diagnosis

no code implementations30 Aug 2018 Weizheng Yan, Han Zhang, Jing Sui, Dinggang Shen

Dynamic functional connectivity (dFC), consisting of time-varying spatiotemporal dynamics, may characterize "chronnectome" diagnostic information for improving MCI classification.

Clustering General Classification +1

Self-Attention Generative Adversarial Networks

49 code implementations arXiv 2018 Han Zhang, Ian Goodfellow, Dimitris Metaxas, Augustus Odena

In this paper, we propose the Self-Attention Generative Adversarial Network (SAGAN) which allows attention-driven, long-range dependency modeling for image generation tasks.

Conditional Image Generation Generative Adversarial Network

Improving GANs Using Optimal Transport

2 code implementations ICLR 2018 Tim Salimans, Han Zhang, Alec Radford, Dimitris Metaxas

We present Optimal Transport GAN (OT-GAN), a variant of generative adversarial nets minimizing a new metric measuring the distance between the generator distribution and the data distribution.

Image Generation

AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks

19 code implementations CVPR 2018 Tao Xu, Pengchuan Zhang, Qiuyuan Huang, Han Zhang, Zhe Gan, Xiaolei Huang, Xiaodong He

In this paper, we propose an Attentional Generative Adversarial Network (AttnGAN) that allows attention-driven, multi-stage refinement for fine-grained text-to-image generation.

Generative Adversarial Network Image-text matching +2

Link the head to the "beak": Zero Shot Learning from Noisy Text Description at Part Precision

no code implementations CVPR 2017 Mohamed Elhoseiny, Yizhe Zhu, Han Zhang, Ahmed Elgammal

We propose a learning framework that is able to connect text terms to its relevant parts and suppress connections to non-visual text terms without any part-text annotations.

Zero-Shot Learning

SegAN: Adversarial Network with Multi-scale $L_1$ Loss for Medical Image Segmentation

2 code implementations6 Jun 2017 Yuan Xue, Tao Xu, Han Zhang, Rodney Long, Xiaolei Huang

Extensive experimental results demonstrate the effectiveness of the proposed SegAN with multi-scale loss: on BRATS 2013 SegAN gives performance comparable to the state-of-the-art for whole tumor and tumor core segmentation while achieves better precision and sensitivity for Gd-enhance tumor core segmentation; on BRATS 2015 SegAN achieves better performance than the state-of-the-art in both dice score and precision.

Brain Tumor Segmentation Image Segmentation +2

Multi-lingual Geoparsing based on Machine Translation

no code implementations6 Nov 2015 Xu Chen, Han Zhang, Judith Gelernter

Our results for geoparsing Chinese and Arabic text using our multi-lingual geoparsing method are comparable to our results for geoparsing English text with our English tools.

Machine Translation Translation

Cannot find the paper you are looking for? You can Submit a new open access paper.