Code Generation
331 papers with code • 15 benchmarks • 43 datasets
Code Generation is an important field to predict explicit code or program structure from multimodal data sources such as incomplete code, programs in another programming language, natural language descriptions or execution examples. Code Generation tools can assist the development of automatic programming tools to improve programming productivity.
Source: Deep Learning for Source Code Modeling and Generation
Image source: Measuring Coding Challenge Competence With APPS
Libraries
Use these libraries to find Code Generation models and implementationsSubtasks
Latest papers
Towards smaller, faster decoder-only transformers: Architectural variants and their implications
Research on Large Language Models (LLMs) has recently seen exponential growth, largely focused on transformer-based architectures, as introduced by [1] and further advanced by the decoder-only variations in [2].
MMCode: Evaluating Multi-Modal Code Large Language Models with Visually Rich Programming Problems
Programming often involves converting detailed and complex specifications into code, a process during which developers typically utilize visual aids to more effectively convey concepts.
Comments as Natural Logic Pivots: Improve Code Generation via Comment Perspective
In this paper, we suggest that code comments are the natural logic pivot between natural language and code language and propose using comments to boost the code generation ability of code LLMs.
Analyzing the Performance of Large Language Models on Code Summarization
We show that for the task of code summarization, the performance of these models on individual examples often depends on the amount of (subword) token overlap between the code and the corresponding reference natural language descriptions in the dataset.
Xiwu: A Basis Flexible and Learnable LLM for High Energy Physics
To address this challenge, a sophisticated large language model system named as Xiwu has been developed, allowing you switch between the most advanced foundation models and quickly teach the model domain knowledge.
Advancing LLM Reasoning Generalists with Preference Trees
We introduce Eurus, a suite of large language models (LLMs) optimized for reasoning.
Self-Organized Agents: A LLM Multi-Agent Framework toward Ultra Large-Scale Code Generation and Optimization
To tackle this challenge, we propose Self-Organized multi-Agent framework (SoA), a novel multi-agent framework that enables the scalable and efficient generation and optimization of large-scale code.
EvoCodeBench: An Evolving Code Generation Benchmark Aligned with Real-World Code Repositories
Existing benchmarks demonstrate poor alignment with real-world code repositories and are insufficient to evaluate the coding abilities of LLMs.
Top Leaderboard Ranking = Top Coding Proficiency, Always? EvoEval: Evolving Coding Benchmarks via LLM
Such limitations inevitably lead us to inquire: Is the leaderboard performance on existing benchmarks reliable and comprehensive enough to measure the program synthesis ability of LLMs?
CYCLE: Learning to Self-Refine the Code Generation
Pre-trained code language models have achieved promising performance in code generation and improved the programming efficiency of human developers.