Trending Research

The Platonic Representation Hypothesis

minyoungg/platonic-rep • 13 May 2024

We argue that representations in AI models, particularly deep networks, are converging.

188

0.72 stars / hour

MarkLLM: An Open-Source Toolkit for LLM Watermarking

thu-bpm/markllm • 16 May 2024

However, the abundance of LLM watermarking algorithms, their intricate mechanisms, and the complex evaluation procedures and perspectives pose challenges for researchers and the community to easily experiment with, understand, and assess the latest advancements.

0.72 stars / hour

MambaOut: Do We Really Need Mamba for Vision?

yuweihao/mambaout • 13 May 2024

For vision tasks, as image classification does not align with either characteristic, we hypothesize that Mamba is not necessary for this task; detection and segmentation tasks are also not autoregressive, yet they adhere to the long-sequence characteristic, so we believe it is still worthwhile to explore Mamba's potential for these tasks.

Image Classification, Instance Segmentation, +2 more

1,588

0.71 stars / hour

UFO: A UI-Focused Agent for Windows OS Interaction

microsoft/UFO • 8 Feb 2024

We introduce UFO, an innovative UI-Focused agent to fulfill user requests tailored to applications on Windows OS, harnessing the capabilities of GPT-Vision.

Navigate

4,998

0.69 stars / hour

LocoMuJoCo: A Comprehensive Imitation Learning Benchmark for Locomotion

robfiras/loco-mujoco • 4 Nov 2023

Imitation Learning (IL) holds great promise for enabling agile locomotion in embodied agents.

Benchmarking, Imitation Learning

387

0.68 stars / hour

VILA: On Pre-training for Visual Language Models

efficient-large-model/vila • 12 Dec 2023

Visual language models (VLMs) rapidly progressed with the recent success of large language models.

Ranked #24 on Visual Question Answering on MM-Vet

In-Context Learning, Language Modelling, +2 more

744

0.67 stars / hour

From Sora What We Can See: A Survey of Text-to-Video Generation

soraw-ai/awesome-text-to-video-generation • 17 May 2024

With impressive achievements made, artificial intelligence is on the path forward to artificial general intelligence.

Text-to-Video Generation, Video Generation

0.59 stars / hour

MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

kongds/mora • 20 May 2024

Low-rank adaptation is a popular parameter-efficient fine-tuning method for large language models.

Continual Pretraining, Mathematical Reasoning

0.54 stars / hour

EasySpider: A No-Code Visual System for Crawling the Web

NaiboWang/EasySpider • ACM The Web Conference 2023

As such, web-crawling is an essential tool for both computational and non-computational scientists to conduct research.

Data Integration, Marketing

24,708

0.53 stars / hour

WavCraft: Audio Editing and Generation with Large Language Models

jinhualiang/wavcraft • 14 Mar 2024

We introduce WavCraft, a collective system that leverages large language models (LLMs) to connect diverse task-specific models for audio content creation and editing.

In-Context Learning

477

0.51 stars / hour
