Granite Code Models: A Family of Open Foundation Models for Code Intelligence

ibm-granite/granite-code-models 7 May 2024

Increasingly, code LLMs are being integrated into software development environments to improve the productivity of human programmers, and LLM-based agents are beginning to show promise for handling complex tasks autonomously.

Code Generation, Decoder

763
0.49 stars / hour

EfficientTrain++: Generalized Curriculum Learning for Efficient Visual Backbone Training

leaplabthu/efficienttrain 14 May 2024

These patterns, when observed through the frequency and spatial domains, incorporate lower-frequency components and the natural image content without distortion or data augmentation.

Data Augmentation, Self-Supervised Learning

102
0.49 stars / hour

SceneTracker: Long-term Scene Flow Estimation Network

wwsource/scenetracker 29 Mar 2024

Scene flow estimation offers fine-grained focus in the spatial domain, while 3D object tracking offers coherence in the temporal domain. Building on this complementarity, this study addresses a comprehensive new task that simultaneously captures fine-grained and long-term 3D motion in an online manner: long-term scene flow estimation (LSFE).

3D Object Tracking, Object Tracking +1

77
0.48 stars / hour

VILA: On Pre-training for Visual Language Models

efficient-large-model/vila 12 Dec 2023

Visual language models (VLMs) have progressed rapidly with the recent success of large language models.

In-Context Learning, Language Modelling +2

677
0.46 stars / hour

InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models

tencentarc/instantmesh 10 Apr 2024

We present InstantMesh, a feed-forward framework for instant 3D mesh generation from a single image, featuring state-of-the-art generation quality and significant training scalability.

Image to 3D

2,109
0.44 stars / hour

WavCraft: Audio Editing and Generation with Large Language Models

jinhualiang/wavcraft 14 Mar 2024

We introduce WavCraft, a collective system that leverages large language models (LLMs) to connect diverse task-specific models for audio content creation and editing.

In-Context Learning

426
0.41 stars / hour

Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model

xiaoduoailab/xmodelvlm 15 May 2024

We introduce Xmodel-VLM, a cutting-edge multimodal vision language model.

Language Modelling

29
0.40 stars / hour

Vidur: A Large-Scale Simulation Framework For LLM Inference

microsoft/vidur 8 May 2024

Vidur models the performance of LLM operators using a combination of experimental profiling and predictive modeling, and evaluates the end-to-end inference performance for different workloads by estimating several metrics of interest such as latency and throughput.

Scheduling

62
0.38 stars / hour

PHUDGE: Phi-3 as Scalable Judge

deshwalmahesh/PHUDGE 12 May 2024

In this paper and technical report, we present PHUDGE, a fine-tuned Phi-3 model that achieved SOTA results on four tasks (Feedback Test, Feedback OOD, MT Human, and Preference Test), surpassing every existing model in latency and throughput.

Data Augmentation

31
0.38 stars / hour

Improving Diffusion Models for Virtual Try-on

yisol/IDM-VTON 8 Mar 2024

Finally, we present a customization method using a pair of person-garment images, which significantly improves fidelity and authenticity.

Virtual Try-on

2,452
0.37 stars / hour