Search Results for author: James T. Kwok

Found 55 papers, 15 papers with code

Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation

no code implementations 14 Mar 2024 Yunhao Gou, Kai Chen, Zhili Liu, Lanqing Hong, Hang Xu, Zhenguo Li, Dit-yan Yeung, James T. Kwok, Yu Zhang

Multimodal large language models (MLLMs) have shown impressive reasoning abilities, but they are also more vulnerable to jailbreak attacks than their LLM predecessors.

Optical Character Recognition (OCR)

KICGPT: Large Language Model with Knowledge in Context for Knowledge Graph Completion

1 code implementation 4 Feb 2024 Yanbin Wei, Qiushi Huang, James T. Kwok, Yu Zhang

Knowledge Graph Completion (KGC) is crucial for addressing knowledge graph incompleteness and supporting downstream applications.

In-Context Learning Language Modelling +1

Rendering Graphs for Graph Reasoning in Multimodal Large Language Models

no code implementations 3 Feb 2024 Yanbin Wei, Shuai Fu, Weisen Jiang, James T. Kwok, Yu Zhang

In this paper, we take the first step in incorporating visual information into graph reasoning tasks and propose a new benchmark GITQA, where each sample is a tuple (graph, image, textual description).

Common Sense Reasoning Knowledge Graph Completion
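
A minimal sketch of what such a (graph, image, textual description) sample might look like as a data structure; the field names below are illustrative assumptions, not the benchmark's actual schema:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class GITQASample:
    """One benchmark-style sample: a graph, its rendered image, and a textual
    description. Field names are illustrative guesses, not the GITQA schema."""
    edges: List[Tuple[int, int]]   # graph structure as an edge list
    image_path: str                # rendering of the graph as an image file
    description: str               # textual description of the same graph
    question: str                  # reasoning question about the graph
    answer: str                    # ground-truth answer

# Example usage with a toy 3-node triangle graph.
sample = GITQASample(
    edges=[(0, 1), (1, 2), (0, 2)],
    image_path="triangle.png",
    description="An undirected graph with nodes 0, 1, 2 and edges 0-1, 1-2, 0-2.",
    question="Is the graph connected?",
    answer="Yes",
)
```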

Mixture of Cluster-conditional LoRA Experts for Vision-language Instruction Tuning

no code implementations 19 Dec 2023 Yunhao Gou, Zhili Liu, Kai Chen, Lanqing Hong, Hang Xu, Aoxue Li, Dit-yan Yeung, James T. Kwok, Yu Zhang

Instruction tuning of Large Vision-language Models (LVLMs) has revolutionized the development of versatile models with zero-shot generalization across a wide range of downstream vision-language tasks.

Instruction Following Zero-shot Generalization

Aggregation Weighting of Federated Learning via Generalization Bound Estimation

no code implementations 10 Nov 2023 Mingwei Xu, Xiaofeng Cao, Ivor W. Tsang, James T. Kwok

In this paper, we replace the aforementioned weighting method with a new strategy that considers the generalization bounds of each local model.

Federated Learning Generalization Bounds

BYOM: Building Your Own Multi-Task Model For Free

no code implementations 3 Oct 2023 Weisen Jiang, Baijiong Lin, Han Shi, Yu Zhang, Zhenguo Li, James T. Kwok

Recently, various merging methods have been proposed to build a multi-task model from task-specific finetuned models without retraining.

Domain-Guided Conditional Diffusion Model for Unsupervised Domain Adaptation

no code implementations 23 Sep 2023 Yulong Zhang, Shuhao Chen, Weisen Jiang, Yu Zhang, Jiangang Lu, James T. Kwok

However, the performance of existing UDA methods is constrained by the large domain shift and limited target domain data.

Unsupervised Domain Adaptation

MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models

1 code implementation 21 Sep 2023 Longhui Yu, Weisen Jiang, Han Shi, Jincheng Yu, Zhengying Liu, Yu Zhang, James T. Kwok, Zhenguo Li, Adrian Weller, Weiyang Liu

Our MetaMath-7B model achieves 66.4% on GSM8K and 19.4% on MATH, exceeding the state-of-the-art models of the same size by 11.5% and 8.7%.

Ranked #53 on Arithmetic Reasoning on GSM8K (using extra training data)

Arithmetic Reasoning GSM8K +4

Dual-Balancing for Multi-Task Learning

1 code implementation 23 Aug 2023 Baijiong Lin, Weisen Jiang, Feiyang Ye, Yu Zhang, Pengguang Chen, Ying-Cong Chen, Shu Liu, James T. Kwok

Multi-task learning (MTL), a learning paradigm to learn multiple related tasks simultaneously, has achieved great success in various fields.

Multi-Task Learning

Forward-Backward Reasoning in Large Language Models for Mathematical Verification

no code implementations 15 Aug 2023 Weisen Jiang, Han Shi, Longhui Yu, Zhengying Liu, Yu Zhang, Zhenguo Li, James T. Kwok

Instead of using forward or backward reasoning alone, we propose FOBAR to combine FOrward and BAckward Reasoning for verification.

Mathematical Reasoning
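
A rough sketch of the forward-backward verification idea: forward reasoning proposes candidate answers, and each candidate is checked backward by masking a number in the question and testing whether the model can recover it given that candidate. The `ask_llm` callable is a hypothetical stand-in and the voting rule is simplified, so this is only a sketch of the combination, not the paper's method.

```python
import re
from collections import Counter
from typing import Callable, List

def fobar_verify(question: str, candidates: List[str],
                 ask_llm: Callable[[str], str]) -> str:
    """Score each forward-reasoning candidate with backward checks and
    return the candidate with the most successful verifications."""
    numbers = re.findall(r"\d+", question)
    scores = Counter()
    for cand in candidates:
        scores[cand] += 0  # make sure every candidate appears in the tally
        for masked in numbers:
            # Backward pass: hide one number and ask the model to recover it,
            # assuming the candidate answer is correct.
            backward_prompt = (
                question.replace(masked, "x", 1)
                + f" If the answer is {cand}, what is x?"
            )
            if masked in re.findall(r"\d+", ask_llm(backward_prompt)):
                scores[cand] += 1
    return scores.most_common(1)[0][0]

# Toy usage with a fake model that always answers "x = 4".
print(fobar_verify("Tom has 4 apples and eats 1. How many are left?",
                   ["3", "5"], ask_llm=lambda prompt: "x = 4"))
```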

Effective Structured Prompting by Meta-Learning and Representative Verbalizer

1 code implementation 1 Jun 2023 Weisen Jiang, Yu Zhang, James T. Kwok

Combining meta-learning the prompt pool and RepVerb, we propose MetaPrompter for effective structured prompting.

Meta-Learning

A Survey on Time-Series Pre-Trained Models

1 code implementation 18 May 2023 Qianli Ma, Zhen Liu, Zhenjing Zheng, Ziyang Huang, Siying Zhu, Zhongzhong Yu, James T. Kwok

Time-Series Mining (TSM) is an important research area since it shows great potential in practical applications.

Time Series Transfer Learning

A Survey of Learning on Small Data: Generalization, Optimization, and Challenge

no code implementations 29 Jul 2022 Xiaofeng Cao, Weixin Bu, Shengjun Huang, MinLing Zhang, Ivor W. Tsang, Yew Soon Ong, James T. Kwok

In the future, learning on small data that approximates the generalization ability of learning on big data will be one of the ultimate goals of AI; it requires machines to recognize objectives and scenarios from small data, as humans do.

Active Learning Contrastive Learning +4

Black-box Generalization of Machine Teaching

no code implementations 30 Jun 2022 Xiaofeng Cao, Yaming Guo, Ivor W. Tsang, James T. Kwok

An inherent assumption is that this learning scheme can drive those updates toward the optimal hypothesis.

Active Learning

Dropout's Dream Land: Generalization from Learned Simulators to Reality

1 code implementation 17 Sep 2021 Zac Wellmer, James T. Kwok

By training the World Model with dropout, the learned dream environment can generate a nearly infinite number of different dream environments.
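
A minimal sketch of the mechanism, assuming a toy dynamics network rather than the paper's model: keeping dropout active while the learned World Model generates dream rollouts means every forward pass samples a fresh dropout mask, i.e., a slightly different dream environment.

```python
import torch
import torch.nn as nn

class ToyWorldModel(nn.Module):
    """Tiny dynamics model; dropout stays active at dream time."""
    def __init__(self, state_dim=8, action_dim=2, hidden=64, p=0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden),
            nn.ReLU(),
            nn.Dropout(p),          # the source of dream-environment diversity
            nn.Linear(hidden, state_dim),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

model = ToyWorldModel()
model.train()  # keep dropout ON even while dreaming

state = torch.zeros(1, 8)
action = torch.zeros(1, 2)
# Two predictions from the same (state, action) differ because each forward
# pass samples a fresh dropout mask, i.e., a different dream environment.
print(model(state, action))
print(model(state, action))
```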

Feedback Pyramid Attention Networks for Single Image Super-Resolution

no code implementations 13 Jun 2021 Huapeng Wu, Jie Gui, Jun Zhang, James T. Kwok, Zhihui Wei

Recently, convolutional neural network (CNN) based image super-resolution (SR) methods have achieved significant performance improvement.

Image Super-Resolution

Pyramidal Dense Attention Networks for Lightweight Image Super-Resolution

no code implementations 13 Jun 2021 Huapeng Wu, Jie Gui, Jun Zhang, James T. Kwok, Zhihui Wei

Recently, deep convolutional neural network methods have achieved excellent performance in image super-resolution (SR), but they cannot be easily deployed on embedded devices due to their large memory cost.

Image Super-Resolution

TOHAN: A One-step Approach towards Few-shot Hypothesis Adaptation

1 code implementation NeurIPS 2021 Haoang Chi, Feng Liu, Wenjing Yang, Long Lan, Tongliang Liu, Bo Han, William K. Cheung, James T. Kwok

To this end, we propose a target orientated hypothesis adaptation network (TOHAN) to solve the FHA problem, where we generate highly compatible unlabeled data (i.e., an intermediate domain) to help train a target-domain classifier.

Domain Adaptation

SparseBERT: Rethinking the Importance Analysis in Self-attention

1 code implementation 25 Feb 2021 Han Shi, Jiahui Gao, Xiaozhe Ren, Hang Xu, Xiaodan Liang, Zhenguo Li, James T. Kwok

A surprising result is that diagonal elements in the attention map are the least important compared with other attention positions.
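
A generic illustration of that observation (not the SparseBERT implementation): scaled dot-product self-attention with the diagonal of the score matrix masked out before the softmax, so tokens never attend to themselves.

```python
import torch

def self_attention_no_diag(q, k, v):
    """Scaled dot-product attention whose diagonal (each token attending to
    itself) is masked out, reflecting the finding that diagonal attention
    positions are the least important."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5        # (..., n, n)
    n = scores.size(-1)
    diag = torch.eye(n, dtype=torch.bool, device=scores.device)
    scores = scores.masked_fill(diag, float("-inf"))   # drop self-attention
    return torch.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(2, 5, 16)   # (batch, tokens, dim)
out = self_attention_no_diag(q, k, v)
print(out.shape)                    # torch.Size([2, 5, 16])
```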

A Survey of Label-noise Representation Learning: Past, Present and Future

1 code implementation 9 Nov 2020 Bo Han, Quanming Yao, Tongliang Liu, Gang Niu, Ivor W. Tsang, James T. Kwok, Masashi Sugiyama

Classical machine learning implicitly assumes that labels of the training data are sampled from a clean distribution, which can be too restrictive for real-world scenarios.

BIG-bench Machine Learning Learning Theory +1

A Scalable, Adaptive and Sound Nonconvex Regularizer for Low-rank Matrix Completion

no code implementations 14 Aug 2020 Yaqing Wang, Quanming Yao, James T. Kwok

Extensive low-rank matrix completion experiments on a number of synthetic and real-world data sets show that the proposed method obtains state-of-the-art recovery performance while being the fastest in comparison to existing low-rank matrix learning methods.

Collaborative Filtering Low-Rank Matrix Completion

Effective Decoding in Graph Auto-Encoder using Triadic Closure

no code implementations 26 Nov 2019 Han Shi, Haozheng Fan, James T. Kwok

We propose the triad decoder, which considers and predicts the three edges involved in a local triad together.

Clustering Graph Generation +4
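
A toy sketch of the triad idea, with assumed layer sizes rather than the paper's architecture: the three edges of a node triple are scored jointly from the concatenated node embeddings instead of one edge at a time.

```python
import torch
import torch.nn as nn

class ToyTriadDecoder(nn.Module):
    """Predicts the three edges (i,j), (j,k), (i,k) of a triad together
    from the three node embeddings, rather than one edge at a time."""
    def __init__(self, emb_dim=16, hidden=32):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 * emb_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 3),   # one logit per edge in the triad
        )

    def forward(self, zi, zj, zk):
        return torch.sigmoid(self.mlp(torch.cat([zi, zj, zk], dim=-1)))

decoder = ToyTriadDecoder()
z = torch.randn(4, 3, 16)                   # 4 triads, 3 node embeddings each
probs = decoder(z[:, 0], z[:, 1], z[:, 2])  # (4, 3) edge probabilities
print(probs.shape)
```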

Bridging the Gap between Sample-based and One-shot Neural Architecture Search with BONAS

1 code implementation NeurIPS 2020 Han Shi, Renjie Pi, Hang Xu, Zhenguo Li, James T. Kwok, Tong Zhang

In this work, we propose BONAS (Bayesian Optimized Neural Architecture Search), a sample-based NAS framework which is accelerated using weight-sharing to evaluate multiple related architectures simultaneously.

Bayesian Optimization Neural Architecture Search

Multi-objective Neural Architecture Search via Predictive Network Performance Optimization

no code implementations 25 Sep 2019 Han Shi, Renjie Pi, Hang Xu, Zhenguo Li, James T. Kwok, Tong Zhang

Inspired by the nature of the graph structure of a neural network, we propose BOGCN-NAS, a NAS algorithm using Bayesian Optimization with Graph Convolutional Network (GCN) predictor.

Bayesian Optimization Neural Architecture Search

Communication-Efficient Distributed Blockwise Momentum SGD with Error-Feedback

1 code implementation NeurIPS 2019 Shuai Zheng, Ziyue Huang, James T. Kwok

In particular, on distributed ResNet training with 7 workers on ImageNet, the proposed algorithm achieves the same testing accuracy as momentum SGD using full-precision gradients, but with $46\%$ less wall clock time.

Quantization
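
A simplified, single-worker sketch of the compression idea, assuming sign compression with per-block scaling: each block of the update is reduced to its sign, scaled by the block's mean magnitude, and the compression error is carried forward and added back on the next step (error-feedback). The distributed aggregation from the paper is not reproduced here.

```python
import numpy as np

def compress_with_error_feedback(grad_blocks, residuals):
    """Sign-compress each gradient block, scaled by the block's mean absolute
    value, and accumulate the compression error into `residuals` so it is
    corrected on later steps."""
    compressed = []
    for i, g in enumerate(grad_blocks):
        corrected = g + residuals[i]                 # add carried-over error
        scale = np.mean(np.abs(corrected))
        c = scale * np.sign(corrected)               # 1 bit per entry + 1 scale
        residuals[i] = corrected - c                 # error feedback
        compressed.append(c)
    return compressed

blocks = [np.array([0.3, -0.1, 0.05]), np.array([1.0, -2.0])]
residuals = [np.zeros_like(b) for b in blocks]
print(compress_with_error_feedback(blocks, residuals))
print(residuals)   # what will be added back at the next step
```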

Blockwise Adaptivity: Faster Training and Better Generalization in Deep Learning

no code implementations 23 May 2019 Shuai Zheng, James T. Kwok

Stochastic methods with coordinate-wise adaptive stepsize (such as RMSprop and Adam) have been widely used in training deep neural networks.

Analysis of Quantized Models

no code implementations ICLR 2019 Lu Hou, Ruiliang Zhang, James T. Kwok

We show that (i) weight-quantized networks converge to an error related to the weight quantization resolution and weight dimension; (ii) quantizing gradients slows convergence by a factor related to the gradient quantization resolution and dimension; and (iii) clipping the gradient before quantization renders this factor dimension-free, thus allowing the use of fewer bits for gradient quantization.

Quantization
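
A small numerical sketch of point (iii), using a generic k-bit stochastic quantizer rather than the paper's exact scheme: clipping the gradient before quantization keeps the quantization grid spacing small even when a few entries are large.

```python
import numpy as np

def quantize_gradient(g, bits=4, clip=None):
    """Stochastically round a gradient to a k-bit uniform grid.
    Clipping first keeps the grid spacing small despite outliers."""
    if clip is not None:
        g = np.clip(g, -clip, clip)
    levels = 2 ** bits - 1
    scale = np.max(np.abs(g)) + 1e-12
    x = (g / scale + 1) / 2 * levels            # map to [0, levels]
    x = np.floor(x + np.random.rand(*g.shape))  # stochastic rounding
    return (2 * x / levels - 1) * scale

rng = np.random.default_rng(0)
grad = np.concatenate([rng.normal(0, 0.1, 1000), [5.0]])  # one large outlier
for c in (None, 0.5):
    err = np.mean((quantize_gradient(grad, bits=4, clip=c) - grad) ** 2)
    print(f"clip={c}: mean squared quantization error {err:.4f}")
```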

General Convolutional Sparse Coding with Unknown Noise

no code implementations 8 Mar 2019 Yaqing Wang, James T. Kwok, Lionel M. Ni

However, existing CSC methods can only model noise drawn from a Gaussian distribution, which is restrictive and unrealistic.

Differential Private Stack Generalization with an Application to Diabetes Prediction

no code implementations 23 Nov 2018 Quanming Yao, Xiawei Guo, James T. Kwok, WeiWei Tu, Yuqiang Chen, Wenyuan Dai, Qiang Yang

To meet the standard of differential privacy, noise is usually added to the original data, which inevitably degrades the prediction performance of subsequent learning algorithms.

Diabetes Prediction Ensemble Learning +3

FasTer: Fast Tensor Completion with Nonconvex Regularization

1 code implementation 23 Jul 2018 Quanming Yao, James T. Kwok, Bo Han

Because it is easy to optimize, the convex overlapping nuclear norm has been widely used for tensor completion.

Lightweight Stochastic Optimization for Minimizing Finite Sums with Infinite Data

no code implementations ICML 2018 Shuai Zheng, James T. Kwok

The memory cost of SSAG does not depend on the sample size, while that of S-SAGA is the same as those of variance reduction methods on unperturbed data.

Data Augmentation Stochastic Optimization

Power Law in Sparsified Deep Neural Networks

no code implementations 4 May 2018 Lu Hou, James T. Kwok

The power law has been observed in the degree distributions of many biological neural networks.

Continual Learning

Online Convolutional Sparse Coding with Sample-Dependent Dictionary

no code implementations ICML 2018 Yaqing Wang, Quanming Yao, James T. Kwok, Lionel M. Ni

Convolutional sparse coding (CSC) has been popularly used for the learning of shift-invariant dictionaries in image and signal processing.

Large-Scale Low-Rank Matrix Learning with Nonconvex Regularizers

no code implementations 1 Aug 2017 Quanming Yao, James T. Kwok, Taifeng Wang, Tie-Yan Liu

Based on it, we develop a proximal gradient algorithm (and its accelerated variant) with inexact proximal splitting, and prove a convergence rate of O(1/T), where T is the number of iterations.

Matrix Completion
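
For context, a bare-bones proximal gradient loop for matrix completion; the convex nuclear norm and its proximal step (singular value soft-thresholding) stand in here for the paper's nonconvex regularizer, and the inexact, accelerated machinery is omitted.

```python
import numpy as np

def svt(X, tau):
    """Singular value soft-thresholding: the proximal operator of tau*||X||_*."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0)) @ Vt

def prox_grad_completion(M, mask, lam=0.1, step=1.0, iters=100):
    """Proximal gradient on 0.5*||mask*(X - M)||_F^2 + lam*||X||_*."""
    X = np.zeros_like(M)
    for _ in range(iters):
        grad = mask * (X - M)              # gradient of the smooth data-fit term
        X = svt(X - step * grad, step * lam)
    return X

rng = np.random.default_rng(0)
M = rng.normal(size=(20, 3)) @ rng.normal(size=(3, 20))  # rank-3 ground truth
mask = rng.random(M.shape) < 0.5                         # observed entries
X_hat = prox_grad_completion(M, mask)
print(np.linalg.norm(X_hat - M) / np.linalg.norm(M))     # relative recovery error
```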

Scalable Online Convolutional Sparse Coding

no code implementations 21 Jun 2017 Yaqing Wang, Quanming Yao, James T. Kwok, Lionel M. Ni

Convolutional sparse coding (CSC) improves sparse coding by learning a shift-invariant dictionary from the data.

Multi-Label Learning with Global and Local Label Correlation

no code implementations 4 Apr 2017 Yue Zhu, James T. Kwok, Zhi-Hua Zhou

In fact, in real-world applications both cases may occur: some label correlations are globally applicable, while others are shared only within a local group of instances.

Multi-Label Learning

Efficient Inexact Proximal Gradient Algorithm for Nonconvex Problems

no code implementations 29 Dec 2016 Quanming Yao, James T. Kwok, Fei Gao, Wei Chen, Tie-Yan Liu

The proximal gradient algorithm has been widely used for convex optimization.

Optimization and Control

Loss-aware Binarization of Deep Networks

1 code implementation 5 Nov 2016 Lu Hou, Quanming Yao, James T. Kwok

Deep neural network models, though very powerful and highly successful, are computationally expensive in terms of space and time.

Binarization

Fast Learning with Nonconvex L1-2 Regularization

no code implementations 29 Oct 2016 Quanming Yao, James T. Kwok, Xiawei Guo

In this paper, we show that a closed-form solution can be derived for the proximal step associated with this regularizer.

Sparse Learning

Efficient Learning with a Family of Nonconvex Regularizers by Redistributing Nonconvexity

no code implementations 13 Jun 2016 Quanming Yao, James T. Kwok

The nonconvex regularizer is then transformed into a familiar convex regularizer, while the resultant loss function can still be guaranteed to be smooth.
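
A hedged sketch of that transformation for a generic problem with a separable nonconvex penalty r: split r into a convex surrogate plus a remainder and fold the remainder into the loss. It assumes r(x) minus a scaled l1 norm is smooth, which holds for common penalties but is stated here only to illustrate the abstract's claim.

```latex
% Illustration only: assumes r(x) - \lambda\|x\|_1 is smooth
% (e.g., penalties whose one-sided derivative at 0^+ equals \lambda).
\min_x \; f(x) + r(x)
  \;=\; \min_x \;
  \underbrace{f(x) + \bigl(r(x) - \lambda\|x\|_1\bigr)}_{\text{smooth loss } \bar f(x)}
  \;+\; \underbrace{\lambda\|x\|_1}_{\text{convex regularizer}} .
```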

Stochastic Variance-Reduced ADMM

no code implementations 24 Apr 2016 Shuai Zheng, James T. Kwok

The alternating direction method of multipliers (ADMM) is a powerful optimization solver in machine learning.

Fast Nonsmooth Regularized Risk Minimization with Continuation

no code implementations 25 Feb 2016 Shuai Zheng, Ruiliang Zhang, James T. Kwok

In regularized risk minimization, the associated optimization problem becomes particularly difficult when both the loss and regularizer are nonsmooth.

Asynchronous Distributed Semi-Stochastic Gradient Optimization

no code implementations 7 Aug 2015 Ruiliang Zhang, Shuai Zheng, James T. Kwok

With the recent proliferation of large-scale learning problems, there has been a lot of interest in distributed machine learning algorithms, particularly those based on stochastic gradient descent (SGD) and its variants.

Cloud Computing

Fast Stochastic Alternating Direction Method of Multipliers

no code implementations 16 Aug 2013 Leon Wenliang Zhong, James T. Kwok

This matches the convergence rate of the batch ADMM algorithm, but without the need to visit all the samples in each iteration.

Convex and Scalable Weakly Labeled SVMs

no code implementations 6 Mar 2013 Yu-Feng Li, Ivor W. Tsang, James T. Kwok, Zhi-Hua Zhou

In this paper, we study the problem of learning from weakly labeled data, where labels of the training examples are incomplete.

Clustering Information Retrieval +1

Priors for Diversity in Generative Latent Variable Models

no code implementations NeurIPS 2012 James T. Kwok, Ryan P. Adams

We show how to perform MAP inference with DPP priors in latent Dirichlet allocation and in mixture models, leading to better intuition for the latent variable representation and quantitatively improved unsupervised feature extraction, without compromising the generative aspects of the model.

Mandatory Leaf Node Prediction in Hierarchical Multilabel Classification

no code implementations NeurIPS 2012 Wei Bi, James T. Kwok

However, while there have been a lot of MLNP methods in hierarchical multiclass classification, performing MLNP in hierarchical multilabel classification is much more difficult.

Classification General Classification

Accelerated Gradient Methods for Stochastic Optimization and Online Learning

no code implementations NeurIPS 2009 Chonghai Hu, Weike Pan, James T. Kwok

Regularized risk minimization often involves non-smooth optimization, either because of the loss function (e.g., hinge loss) or the regularizer (e.g., the $\ell_1$-regularizer).

Stochastic Optimization
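
A compact, generic sketch of that setting (not the paper's accelerated method): a stochastic gradient step on the smooth part of the regularized risk followed by the proximal map of the non-smooth $\ell_1$ regularizer, i.e., soft-thresholding.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal map of t*||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def stochastic_prox_grad_lasso(A, b, lam=0.1, step=0.01, epochs=50, seed=0):
    """Stochastic proximal gradient for 0.5*mean((A x - b)^2) + lam*||x||_1:
    a sampled gradient step on the smooth loss, then soft-thresholding
    for the non-smooth l1 regularizer."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            grad_i = (A[i] @ x - b[i]) * A[i]          # gradient on one sample
            x = soft_threshold(x - step * grad_i, step * lam)
    return x

rng = np.random.default_rng(1)
A = rng.normal(size=(200, 10))
x_true = np.zeros(10)
x_true[:3] = [2.0, -1.0, 0.5]
b = A @ x_true + 0.01 * rng.normal(size=200)
print(np.round(stochastic_prox_grad_lasso(A, b), 2))    # recovers a sparse x
```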
