Code Search
48 papers with code • 5 benchmarks • 10 datasets
The goal of Code Search is to retrieve code fragments from a large code corpus that most closely match a developer’s intent, which is expressed in natural language.
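The task above can be sketched as nearest-neighbor retrieval: embed the natural-language query and every code fragment into a common vector space, then rank fragments by similarity. The sketch below is a minimal, illustrative version that substitutes a bag-of-words count vector for a learned neural encoder; the tokenization and corpus are invented for the example.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words embedding (a stand-in for a learned neural encoder)."""
    return Counter(text.lower().replace("_", " ").split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(query, corpus):
    """Rank code fragments by similarity to the natural-language query."""
    q = embed(query)
    return sorted(corpus, key=lambda code: cosine(q, embed(code)), reverse=True)

# Tiny illustrative corpus of code fragments.
corpus = [
    "def read file path: open path return contents",
    "def sort list items: return sorted items",
    "def parse json string: return json loads string",
]
print(search("sort a list of items", corpus)[0])
```

Real code-search systems replace `embed` with a pretrained code encoder (such as the models listed below), but the retrieval loop is the same shape.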
Latest papers
CodeT5+: Open Code Large Language Models for Code Understanding and Generation
To address the limitations of prior code LLMs, we propose "CodeT5+", a family of encoder-decoder LLMs for code whose component modules can be flexibly combined to suit a wide range of downstream code tasks.
The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation
We present The Vault, a dataset of high-quality code-text pairs in multiple programming languages for training large language models to understand and generate code.
Code Execution with Pre-trained Language Models
Code execution is a fundamental aspect of programming language semantics that reflects the exact behavior of the code.
REINFOREST: Reinforcing Semantic Code Similarity for Cross-Lingual Code Search Models
This paper introduces a novel code-to-code search technique that enhances the performance of Large Language Models (LLMs) by including both static and dynamic features as well as utilizing both similar and dissimilar examples during training.
One Adapter for All Programming Languages? Adapter Tuning for Code Search and Summarization
To alleviate the potential catastrophic-forgetting issue in multilingual models, we freeze all pre-trained model parameters, insert a parameter-efficient adapter structure, and fine-tune only the adapter.
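The freeze-and-adapt idea above is commonly realized as a bottleneck adapter: a small down-projection, a nonlinearity, an up-projection, and a residual connection, inserted into a frozen backbone. The sketch below is a generic illustration of that structure, not the paper's exact implementation; the dimensions and zero-initialization convention are assumptions.

```python
import random

random.seed(0)

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def relu(x):
    return [max(0.0, v) for v in x]

class Adapter:
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual.
    Only these weights are updated during fine-tuning; the backbone stays frozen.
    Dimensions here are illustrative."""
    def __init__(self, d_model, d_bottleneck):
        self.down = [[random.gauss(0, 0.1) for _ in range(d_model)]
                     for _ in range(d_bottleneck)]
        # Zero-initialized up-projection: the adapter starts as the identity map,
        # so inserting it does not disturb the pretrained model's behavior.
        self.up = [[0.0] * d_bottleneck for _ in range(d_model)]

    def __call__(self, h):
        delta = matvec(self.up, relu(matvec(self.down, h)))
        return [hi + di for hi, di in zip(h, delta)]

adapter = Adapter(d_model=4, d_bottleneck=2)
h = [1.0, -2.0, 0.5, 3.0]
print(adapter(h))  # identity at initialization
```

Because only the adapter's parameters receive gradients, per-language adapters can be trained and swapped without touching the shared multilingual backbone.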
Global Contrastive Batch Sampling via Optimization on Sample Permutations
Contrastive Learning has recently achieved state-of-the-art performance in a wide range of tasks.
Exploring Representation-Level Augmentation for Code Search
In this paper, we explore augmentation methods that augment data (both code and query) at the representation level, which requires no additional data processing or training. Based on this, we propose a general format of representation-level augmentation that unifies existing methods.
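One representative representation-level augmentation is linear interpolation: a new "virtual" representation is formed as a convex combination of two real embedding vectors, with no extra data processing or retraining. The sketch below is a generic illustration of that idea (the vectors and mixing coefficient are invented for the example), not the paper's unified formulation.

```python
def interpolate(r1, r2, alpha=0.5):
    """Linear-interpolation augmentation at the representation level:
    returns alpha * r1 + (1 - alpha) * r2, a convex combination of two
    existing representations used as an extra training example."""
    return [alpha * a + (1 - alpha) * b for a, b in zip(r1, r2)]

q = [1.0, 0.0, 2.0]   # illustrative query representation
c = [0.0, 2.0, 0.0]   # illustrative code representation
print(interpolate(q, c, alpha=0.25))  # -> [0.25, 1.5, 0.5]
```

Because the operation acts on embeddings rather than raw code or text, it composes with any encoder and adds negligible cost to a contrastive training loop.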
XLCoST: A Benchmark Dataset for Cross-lingual Code Intelligence
To the best of our knowledge, it is the largest parallel dataset for source code both in terms of size and the number of languages.
NS3: Neuro-Symbolic Semantic Code Search
We compare our model - NS3 (Neuro-Symbolic Semantic Search) - to a number of baselines, including state-of-the-art semantic code retrieval methods, and evaluate on two datasets - CodeSearchNet and Code Search and Question Answering.
UniXcoder: Unified Cross-Modal Pre-training for Code Representation
Furthermore, we propose to utilize multi-modal content to learn representations of code fragments with contrastive learning, and then align representations across programming languages using a cross-modal generation task.
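Contrastive objectives like the one above are typically instantiated as an InfoNCE loss: a softmax over similarities between an anchor and a batch of candidates, penalizing the negative log-probability of the true positive. The sketch below is a generic single-anchor version (temperature and similarity values are illustrative), not the exact UniXcoder training code.

```python
import math

def info_nce(sim_row, pos_index, temperature=0.05):
    """InfoNCE loss for one anchor: softmax over similarities to all
    candidates, then negative log-probability of the positive candidate."""
    logits = [s / temperature for s in sim_row]
    m = max(logits)                          # subtract max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    return -math.log(exps[pos_index] / sum(exps))

# The loss is low when the positive outscores the negatives, high otherwise.
easy = info_nce([0.9, 0.1, 0.0], pos_index=0)
hard = info_nce([0.3, 0.4, 0.5], pos_index=0)
print(easy, hard)
```

Minimizing this loss pulls matched (query, code) or (code, code) pairs together in embedding space while pushing in-batch negatives apart, which is exactly the property code search exploits at retrieval time.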