Trending Research

UFO: A UI-Focused Agent for Windows OS Interaction

microsoft/UFO • 8 Feb 2024

We introduce UFO, an innovative UI-Focused agent to fulfill user requests tailored to applications on Windows OS, harnessing the capabilities of GPT-Vision.

Navigate

4,089

0.40 stars / hour

RecAI: Leveraging Large Language Models for Next-Generation Recommender Systems

microsoft/recai • 11 Mar 2024

This paper introduces RecAI, a practical toolkit designed to augment or even revolutionize recommender systems with the advanced capabilities of Large Language Models (LLMs).

Recommendation Systems

319

0.39 stars / hour

FinLangNet: A Novel Deep Learning Framework for Credit Risk Prediction Using Linguistic Analogy in Financial Data

leiyu0210/finlangnet • 19 Apr 2024

Our research demonstrates that FinLangNet surpasses traditional statistical methods in predicting credit risk and that its integration with these methods enhances credit card fraud prediction models, achieving a significant improvement of over 1.5 points in the Kolmogorov-Smirnov metric.

0.39 stars / hour

LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images

openbmb/omnilmm • 18 Mar 2024

To address the challenges, we present LLaVA-UHD, a large multimodal model that can efficiently perceive images in any aspect ratio and high resolution.

1,312

0.38 stars / hour

MAexp: A Generic Platform for RL-based Multi-Agent Exploration

duangzhu/maexp • 19 Apr 2024

The sim-to-real gap poses a significant challenge in RL-based multi-agent exploration due to scene quantization and action discretization.

Multi-agent Reinforcement Learning • Quantization

0.37 stars / hour

Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM

Leeroo-AI/mergoo • 12 Mar 2024

We investigate efficient methods for training Large Language Models (LLMs) to possess capabilities in multiple specialized domains, such as coding, math reasoning and world knowledge.

Ranked #30 on Question Answering on TriviaQA

Arithmetic Reasoning • Code Generation • +6

233

0.36 stars / hour

Code Llama: Open Foundation Models for Code

facebookresearch/codellama • 24 Aug 2023

We release Code Llama, a family of large language models for code based on Llama 2 providing state-of-the-art performance among open models, infilling capabilities, support for large input contexts, and zero-shot instruction following ability for programming tasks.

Ranked #27 on Code Generation on MBPP

16k • Code Generation • +1

14,964

0.35 stars / hour

MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation

bytedance/MoMA • 8 Apr 2024

This approach effectively synergizes reference image and text prompt information to produce valuable image features, facilitating an image diffusion model.

Image-to-Image Translation • Language Modelling • +1

0.34 stars / hour

Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models

FoundationVision/Groma • 19 Apr 2024

We introduce Groma, a Multimodal Large Language Model (MLLM) with grounded and fine-grained visual perception ability.

Language Modelling • Large Language Model • +2

131

0.34 stars / hour

LOHO: Latent Optimization of Hairstyles via Orthogonalization

dukebw/LOHO • CVPR 2021

Therefore, we propose Latent Optimization of Hairstyles via Orthogonalization (LOHO), an optimization-based approach using GAN inversion to infill missing hair structure details in latent space during hairstyle transfer.

SSIM

196

0.33 stars / hour
