Search Results for author: Pan Zhang

Found 54 papers, 30 papers with code

Long-CLIP: Unlocking the Long-Text Capability of CLIP

1 code implementation • 22 Mar 2024 • Beichen Zhang, Pan Zhang, Xiaoyi Dong, Yuhang Zang, Jiaqi Wang

Contrastive Language-Image Pre-training (CLIP) has been the cornerstone for zero-shot classification, text-image retrieval, and text-image generation by aligning image and text modalities.

Image Retrieval Language Modelling +3

RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition

1 code implementation • 20 Mar 2024 • Ziyu Liu, Zeyi Sun, Yuhang Zang, Wei Li, Pan Zhang, Xiaoyi Dong, Yuanjun Xiong, Dahua Lin, Jiaqi Wang

Notably, our approach demonstrates a significant improvement in performance on 5 fine-grained visual recognition benchmarks, 11 few-shot image recognition datasets, and 2 object detection datasets under the zero-shot recognition setting.

Contrastive Learning Fine-Grained Visual Recognition +3

DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models

1 code implementation • 22 Feb 2024 • Yuhang Cao, Pan Zhang, Xiaoyi Dong, Dahua Lin, Jiaqi Wang

We present DualFocus, a novel framework for integrating macro and micro perspectives within multi-modal large language models (MLLMs) to enhance vision-language task performance.

Hallucination

Concealed Object Segmentation with Hierarchical Coherence Modeling

no code implementations • 22 Jan 2024 • Fengyang Xiao, Pan Zhang, Chunming He, Runze Hu, Yutao Liu

Concealed object segmentation (COS) is a challenging task that involves localizing and segmenting those concealed objects that are visually blended with their surrounding environments.

Image Segmentation Object +5

HyperDreamer: Hyper-Realistic 3D Content Generation and Editing from a Single Image

no code implementations • 7 Dec 2023 • Tong Wu, Zhibing Li, Shuai Yang, Pan Zhang, Xinggang Pan, Jiaqi Wang, Dahua Lin, Ziwei Liu

Extensive experiments demonstrate the effectiveness of HyperDreamer in modeling region-aware materials with high-resolution textures and enabling user-friendly editing.

Semantic Segmentation

Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

1 code implementation • 6 Dec 2023 • Zeyi Sun, Ye Fang, Tong Wu, Pan Zhang, Yuhang Zang, Shu Kong, Yuanjun Xiong, Dahua Lin, Jiaqi Wang

Alpha-CLIP not only preserves the visual recognition ability of CLIP but also enables precise control over the emphasis of image contents.

OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation

1 code implementation • 29 Nov 2023 • Qidong Huang, Xiaoyi Dong, Pan Zhang, Bin Wang, Conghui He, Jiaqi Wang, Dahua Lin, Weiming Zhang, Nenghai Yu

Based on the observation, OPERA introduces a penalty term on the model logits during beam-search decoding to mitigate the over-trust issue, along with a rollback strategy that retrospects the presence of summary tokens in the previously generated tokens and re-allocates the token selection if necessary.

Hallucination

ShareGPT4V: Improving Large Multi-Modal Models with Better Captions

1 code implementation • 21 Nov 2023 • Lin Chen, Jinsong Li, Xiaoyi Dong, Pan Zhang, Conghui He, Jiaqi Wang, Feng Zhao, Dahua Lin

In the realm of large multi-modal models (LMMs), efficient modality alignment is crucial yet often constrained by the scarcity of high-quality image-text data.

Descriptive visual instruction following +2

MLLM-DataEngine: An Iterative Refinement Approach for MLLM

1 code implementation • 25 Aug 2023 • Zhiyuan Zhao, Linke Ouyang, Bin Wang, Siyuan Huang, Pan Zhang, Xiaoyi Dong, Jiaqi Wang, Conghui He

Despite the great advances of Multimodal Large Language Models (MLLMs) in both instruction dataset building and benchmarking, the independence of training and evaluation makes it hard for current MLLMs to further improve their capability under the guidance of evaluation results at a relatively low human cost.

Benchmarking

VIGC: Visual Instruction Generation and Correction

2 code implementations • 24 Aug 2023 • Bin Wang, Fan Wu, Xiao Han, Jiahui Peng, Huaping Zhong, Pan Zhang, Xiaoyi Dong, Weijia Li, Wei Li, Jiaqi Wang, Conghui He

A practical solution to this problem would be to utilize the available multimodal large language models (MLLMs) to generate instruction data for vision-language tasks.

Hallucination Image Captioning +1

HGDNet: A Height-Hierarchy Guided Dual-Decoder Network for Single View Building Extraction and Height Estimation

no code implementations • 10 Aug 2023 • Chaoran Lu, Ningning Cao, Pan Zhang, Ting Liu, Baochai Peng, Guozhang Liu, Mengke Yuan, Sen Zhang, Simin Huang, Tao Wang

Unifying the correlated tasks of single-view satellite image building extraction and height estimation is a promising way to share representations and acquire a generalist model for large-scale urban 3D reconstruction.

3D Reconstruction

Fine-grained building roof instance segmentation based on domain adapted pretraining and composite dual-backbone

no code implementations • 10 Aug 2023 • Guozhang Liu, Baochai Peng, Ting Liu, Pan Zhang, Mengke Yuan, Chaoran Lu, Ningning Cao, Sen Zhang, Simin Huang, Tao Wang

The diversity of building architecture styles of global cities situated on various landforms, the degraded optical imagery affected by clouds and shadows, and the significant inter-class imbalance of roof types pose challenges for designing a robust and accurate building roof instance segmentor.

Data Augmentation Instance Segmentation +1

qecGPT: decoding Quantum Error-correcting Codes with Generative Pre-trained Transformers

1 code implementation • 18 Jul 2023 • Hanyan Cao, Feng Pan, Yijia Wang, Pan Zhang

Our framework is general and can be applied to any error model and quantum codes with different topologies such as surface codes and quantum LDPC codes.

FreeDrag: Feature Dragging for Reliable Point-based Image Editing

1 code implementation • 10 Jul 2023 • Pengyang Ling, Lin Chen, Pan Zhang, Huaian Chen, Yi Jin, Jinjin Zheng

To serve the intricate and varied demands of image editing, precise and flexible manipulation of image content is indispensable.

Point Tracking

V3Det: Vast Vocabulary Visual Detection Dataset

no code implementations • ICCV 2023 • Jiaqi Wang, Pan Zhang, Tao Chu, Yuhang Cao, Yujie Zhou, Tong Wu, Bin Wang, Conghui He, Dahua Lin

2) Hierarchical Category Organization: The vast vocabulary of V3Det is organized by a hierarchical category tree which annotates the inclusion relationship among categories, encouraging the exploration of category relationships in vast and open vocabulary object detection.

Chatbot Object +2

Neural-network solutions to stochastic reaction networks

1 code implementation • 29 Sep 2022 • Ying Tang, Jiayu Weng, Pan Zhang

The stochastic reaction network in which chemical species evolve through a set of reactions is widely used to model stochastic processes in physics, chemistry and biology.

Real-Time Neural Character Rendering with Pose-Guided Multiplane Images

1 code implementation • 25 Apr 2022 • Hao Ouyang, Bo Zhang, Pan Zhang, Hao Yang, Jiaolong Yang, Dong Chen, Qifeng Chen, Fang Wen

We propose pose-guided multiplane image (MPI) synthesis which can render an animatable character in real scenes with photorealistic quality.

Image-to-Image Translation Neural Rendering +1

Semi-Supervised Image-to-Image Translation using Latent Space Mapping

no code implementations • 29 Mar 2022 • Pan Zhang, Jianmin Bao, Ting Zhang, Dong Chen, Fang Wen

Thanks to the low dimensional feature space, it is easier to find the desired mapping function, resulting in improved quality of translation results as well as the stability of the translation model.

Image-to-Image Translation Translation

Tensor networks for unsupervised machine learning

1 code implementation • 24 Jun 2021 • Jing Liu, Sujie Li, Jiang Zhang, Pan Zhang

Despite the great potential, however, existing tensor network models for unsupervised machine learning only work as a proof of principle, as their performance is much worse than the standard models such as restricted Boltzmann machines and neural networks.

BIG-bench Machine Learning Tensor Networks

Robust Mutual Learning for Semi-supervised Semantic Segmentation

no code implementations • 1 Jun 2021 • Pan Zhang, Bo Zhang, Ting Zhang, Dong Chen, Fang Wen

The proposed robust mutual learning demonstrates state-of-the-art performance on semantic segmentation in the low-data regime.

Pseudo Label Semi-Supervised Semantic Segmentation

Boltzmann machines as two-dimensional tensor networks

no code implementations • 10 May 2021 • Sujie Li, Feng Pan, Pengfei Zhou, Pan Zhang

Using numerical experiments, we demonstrate that the proposed algorithm is much more accurate than the state-of-the-art machine learning methods in estimating the partition function of restricted Boltzmann machines and deep Boltzmann machines, and has potential applications in training deep Boltzmann machines for general machine learning tasks.

BIG-bench Machine Learning Tensor Networks +1

Old Photo Restoration via Deep Latent Space Translation

8 code implementations • 14 Sep 2020 • Zi-Yu Wan, Bo Zhang, Dong-Dong Chen, Pan Zhang, Dong Chen, Jing Liao, Fang Wen

Unlike conventional restoration tasks that can be solved through supervised learning, the degradation in real photos is complex and the domain gap between synthetic images and real old photos makes the network fail to generalize.

Image Restoration Translation

Supervised Learning with Projected Entangled Pair States

no code implementations • 12 Sep 2020 • Song Cheng, Lei Wang, Pan Zhang

Tensor networks, which originated in quantum physics, have been gradually generalized as efficient models in machine learning in recent years.

BIG-bench Machine Learning Tensor Networks

Contact Area Detector using Cross View Projection Consistency for COVID-19 Projects

no code implementations • 18 Aug 2020 • Pan Zhang, Wilfredo Torres Calderon, Bokyung Lee, Alex Tessier, Jacky Bibliowicz, Liviu Calin, Michael Lee

Instead of doing 3D scene reconstruction or transfer learning from deep networks, a mapping from the surface in the two camera views to the surface space is the only requirement.

3D Reconstruction 3D Scene Reconstruction +2

Tropical Tensor Network for Ground States of Spin Glasses

1 code implementation • 16 Aug 2020 • Jin-Guo Liu, Lei Wang, Pan Zhang

We present a unified exact tensor network approach to compute the ground state energy, identify the optimal configuration, and count the number of solutions for spin glasses.

Statistical Mechanics Quantum Physics Computation

Bringing Old Photos Back to Life

7 code implementations • CVPR 2020 • Zi-Yu Wan, Bo Zhang, Dong-Dong Chen, Pan Zhang, Dong Chen, Jing Liao, Fang Wen

Unlike conventional restoration tasks that can be solved through supervised learning, the degradation in real photos is complex and the domain gap between synthetic images and real old photos makes the network fail to generalize.

Image Restoration Translation

Solving Quantum Statistical Mechanics with Variational Autoregressive Networks and Quantum Circuits

1 code implementation • 24 Dec 2019 • Jin-Guo Liu, Liang Mao, Pan Zhang, Lei Wang

We extend the ability of unitary quantum circuits by interfacing them with classical autoregressive neural networks.

Quantum Physics

Yao.jl: Extensible, Efficient Framework for Quantum Algorithm Design

1 code implementation • 23 Dec 2019 • Xiu-Zhe Luo, Jin-Guo Liu, Pan Zhang, Lei Wang

We introduce Yao, an extensible, efficient open-source framework for quantum algorithm design.

Quantum Physics Strongly Correlated Electrons Computational Physics

Contracting Arbitrary Tensor Networks: general approximate algorithm and applications in graphical models and quantum circuit simulations

1 code implementation • 6 Dec 2019 • Feng Pan, Pengfei Zhou, Sujie Li, Pan Zhang

We present a general method for approximately contracting tensor networks with an arbitrary connectivity.

Computational Physics Statistical Mechanics Strongly Correlated Electrons Quantum Physics

Phase transitions and optimal algorithms for semi-supervised classifications on graphs: from belief propagation to graph convolution network

no code implementations • 1 Nov 2019 • Pengfei Zhou, Tianyi Li, Pan Zhang

For the first time, well-controlled benchmark datasets with asymptotically exact properties and optimal solutions could be produced for the evaluation of graph convolution neural networks, and for the theoretical understanding of their strengths and weaknesses.

Bayesian Inference Clustering +2

A streaming feature-based compression method for data from instrumented infrastructure

no code implementations • 12 Apr 2019 • Alastair Gregory, Din-Houn Lau, Alex Tessier, Pan Zhang

An increasing number of civil engineering applications are utilising data acquired from infrastructure instrumented with sensing devices.

Tree Tensor Networks for Generative Modeling

no code implementations • 8 Jan 2019 • Song Cheng, Lei Wang, Tao Xiang, Pan Zhang

Matrix product states (MPS), a tensor network designed for one-dimensional quantum systems, has been recently proposed for generative modeling of natural data (such as images) in terms of the "Born machine".

BIG-bench Machine Learning Tensor Networks

Shortcut Matrix Product States and its applications

no code implementations • 13 Dec 2018 • Zhuan Li, Pan Zhang

Matrix Product States (MPS), also known as Tensor Train (TT) decomposition in mathematics, was originally proposed for describing an (especially one-dimensional) quantum system, and has recently found use in various applications such as compressing high-dimensional data, supervised kernel linear classifiers, and unsupervised generative modeling.

Computational Efficiency

Weighted Community Detection and Data Clustering Using Message Passing

no code implementations • 30 Jan 2018 • Cheng Shi, Yanchen Liu, Pan Zhang

In the community detection problem in weighted and directed networks, we show that our algorithm significantly outperforms existing algorithms.

Bayesian Inference Clustering +1

Spectral estimation of the percolation transition in clustered networks

no code implementations • 4 Oct 2017 • Pan Zhang

There have been several spectral bounds for the percolation transition in networks, using the spectrum of matrices associated with the network such as the adjacency matrix and the non-backtracking matrix.

Clustering

Unsupervised Generative Modeling Using Matrix Product States

1 code implementation • 6 Sep 2017 • Zhao-Yu Han, Jun Wang, Heng Fan, Lei Wang, Pan Zhang

Generative modeling, which learns a joint probability distribution from data and generates samples according to it, is an important task in machine learning and artificial intelligence.

BIG-bench Machine Learning

Evaluating accuracy of community detection using the relative normalized mutual information

no code implementations • 15 Jan 2015 • Pan Zhang

The Normalized Mutual Information (NMI) has been widely used to evaluate the accuracy of community detection algorithms.

Community Detection

Phase transitions in semisupervised clustering of sparse networks

no code implementations • 30 Apr 2014 • Pan Zhang, Cristopher Moore, Lenka Zdeborová

For larger $k$ where a hard but detectable regime exists, we find that the easy/hard transition (the point at which efficient algorithms can do better than chance) becomes a line of transitions where the accuracy jumps discontinuously at a critical value of $\alpha$.

Clustering Stochastic Block Model

Scalable detection of statistically significant communities and hierarchies, using message-passing for modularity

1 code implementation • 23 Mar 2014 • Pan Zhang, Cristopher Moore

We address this problem by using the modularity as a Hamiltonian at finite temperature, and using an efficient Belief Propagation algorithm to obtain the consensus of many partitions with high modularity, rather than looking for a single partition that maximizes it.

Stochastic Block Model

Model Selection for Degree-corrected Block Models

no code implementations • 17 Jul 2012 • Xiaoran Yan, Cosma Rohilla Shalizi, Jacob E. Jensen, Florent Krzakala, Cristopher Moore, Lenka Zdeborova, Pan Zhang, Yaojia Zhu

We present the first principled and tractable approach to model selection between standard and degree-corrected block models, based on new large-graph asymptotics for the distribution of log-likelihood ratios under the stochastic block model, finding substantial departures from classical results for sparse graphs.

Model Selection Stochastic Block Model
