Code Search
49 papers with code • 5 benchmarks • 10 datasets
The goal of Code Search is to retrieve code fragments from a large code corpus that most closely match a developer’s intent, which is expressed in natural language.
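As a rough illustration of the task definition above, the following is a minimal sketch of a purely lexical baseline: it ranks snippets by cosine similarity between bag-of-words vectors of the query and of the code. It is not the method of any paper listed on this page, and the names `tokenize`, `cosine`, `search`, and `CORPUS` are hypothetical, chosen only for this example. Neural approaches replace the token counts with learned embeddings, but the retrieve-and-rank structure is the same.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase tokens, breaking camelCase and snake_case."""
    # Insert a space at lower-to-upper boundaries so "readJson" -> "read Json".
    text = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    # Underscores and punctuation are not matched, so they act as separators.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, corpus, top_k=3):
    """Return the top_k (name, score) pairs, best match first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(code))), name)
              for name, code in corpus.items()]
    # Sort by descending score; break ties alphabetically by name.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [(name, round(score, 3)) for score, name in scored[:top_k]]


# Hypothetical toy corpus: snippet name -> source text.
CORPUS = {
    "read_json": "def read_json_file(path):\n"
                 "    with open(path) as f:\n"
                 "        return json.load(f)",
    "reverse_list": "def reverse_list(items):\n    return items[::-1]",
    "http_get": "def http_get(url):\n"
                "    return urllib.request.urlopen(url).read()",
}

results = search("read a json file from a path", CORPUS, top_k=2)
for name, score in results:
    print(name, score)
```

A lexical baseline like this only matches when the query and the code share vocabulary, which is the gap that learned query and code representations aim to close.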
Latest papers with no code
Neuro-symbolic Zero-Shot Code Cloning with Cross-Language Intermediate Representation
We further fine-tune UnixCoder, the best-performing model for zero-shot cross-programming language code search, for the Code Cloning task with the SBT IRs of C code-pairs, available in the CodeNet dataset.
Unveiling Code Pre-Trained Models: Investigating Syntax and Semantics Capacities
These structures are fundamental to understanding code.
You Don't Know Search: Helping Users Find Code by Automatically Evaluating Alternative Queries
Our main result shows that relative to the control group, users are on average 22% more likely to click on a search result at all on any given day when AQE is active.
ContraCLM: Contrastive Learning For Causal Language Model
Specifically, we attain 44% relative improvement on the Semantic Textual Similarity tasks and 34% on Code-to-Code Search tasks.
CodeDSI: Differentiable Code Search
In an effort to improve the performance of code search, we have investigated docid representation strategies, impact of tokenization on docid structure, and dataset sizes on overall code search performance.
CSSAM: Code Search via Attention Matching of Code Semantics and Structures
By leveraging the residual interaction, a matching module is designed to preserve more code semantics and descriptive features, which enhances the adhesion between the code and its corresponding query text.
CoCoSoDa: Effective Contrastive Learning for Code Search
However, there is still a lot of room for improvement in using contrastive learning for code search.
On the Transferability of Pre-trained Language Models for Low-Resource Programming Languages
Furthermore, some programming languages are inherently different, and code written in one language usually cannot be interchanged with another; e.g., Ruby and Java code have very different structures.
Accelerating Code Search with Deep Hashing and Code Classification
Code search retrieves reusable code snippets from a source code corpus based on natural language queries.
AstBERT: Enabling Language Model for Financial Code Understanding with Abstract Syntax Trees
Specifically, we collect a large amount of source code (both Java and Python) from the Alipay code repository and incorporate both syntactic and semantic code knowledge into our model with the help of code parsers, through which the AST information of the source code can be interpreted and integrated.