Recent studies have drawn attention to the untapped potential of the "star operation" (element-wise multiplication) in network design.
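The star operation described here can be illustrated with a minimal sketch: two linear projections of the same input fused by element-wise multiplication. The function and variable names below (`star_block`, `w1`, `w2`) are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def star_block(x, w1, w2):
    """Fuse two linear projections of x via element-wise multiplication:
    f(x) = (W1 x) * (W2 x)."""
    return (w1 @ x) * (w2 @ x)

d_in, d_out = 4, 8
w1 = rng.standard_normal((d_out, d_in))
w2 = rng.standard_normal((d_out, d_in))
x = rng.standard_normal(d_in)

y = star_block(x, w1, w2)  # shape (8,)
```

Each output element is a product of two linear forms in `x`, so the block implicitly contains pairwise cross-terms of the input features without explicitly widening the layer.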
The key insight driving QServe is that the efficiency of LLM serving on GPUs is critically influenced by operations on low-throughput CUDA cores.
This underscores the potential of DocRes across a broader spectrum of document image restoration tasks.
We introduce WavCraft, a collective system that leverages large language models (LLMs) to connect diverse task-specific models for audio content creation and editing.
Deep features are a cornerstone of computer vision research, capturing image semantics and enabling the community to solve downstream tasks even in the zero- or few-shot regime.
We propose Pure and Lightning ID customization (PuLID), a novel tuning-free ID customization method for text-to-image generation.
We outperform encoder-only models by a large margin on word-level tasks and reach new unsupervised state-of-the-art performance on the Massive Text Embedding Benchmark (MTEB).
To this end, we design a two-stage framework that first draws a concept image and then performs a reference-informed 3D modeling stage.
We study how to apply large language models to write grounded and organized long-form articles from scratch, with breadth and depth comparable to Wikipedia pages.
A handful of visual foundation models (VFMs) have recently emerged as the backbones for numerous downstream tasks.