16k

54 papers with code • 0 benchmarks • 0 datasets

Most implemented papers

Efficiently Modeling Long Sequences with Structured State Spaces

hazyresearch/state-spaces ICLR 2022

A central goal of sequence modeling is designing a single principled model that can address sequence data across a range of modalities and tasks, particularly on long-range dependencies.

FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

hazyresearch/flash-attention 27 May 2022

We also extend FlashAttention to block-sparse attention, yielding an approximate attention algorithm that is faster than any existing approximate attention method.

Long Range Arena: A Benchmark for Efficient Transformers

google-research/long-range-arena 8 Nov 2020

In recent months, a wide spectrum of efficient, fast Transformers have been proposed to tackle this problem, more often than not claiming superior or comparable model quality to vanilla Transformer models.

Towards Scalable Multi-domain Conversational Agents: The Schema-Guided Dialogue Dataset

google-research-datasets/dstc8-schema-guided-dialogue 12 Sep 2019

In this work, we introduce the Schema-Guided Dialogue (SGD) dataset, containing over 16k multi-domain conversations spanning 16 domains.

Fighting the COVID-19 Infodemic: Modeling the Perspective of Journalists, Fact-Checkers, Social Media Platforms, Policy Makers, and the Society

firojalam/COVID-19-disinformation Findings (EMNLP) 2021

With the emergence of the COVID-19 pandemic, the political and the medical aspects of disinformation merged as the problem got elevated to a whole new level to become the first global infodemic.

An In-Depth Exploration of Person Re-Identification and Gait Recognition in Cloth-Changing Conditions

bnu-ivc/ccpg CVPR 2023

For the cloth-changing problem, video-based ReID is rarely studied due to the lack of a suitable cloth-changing benchmark, and gait recognition is often researched under controlled conditions.

Code Llama: Open Foundation Models for Code

facebookresearch/codellama 24 Aug 2023

We release Code Llama, a family of large language models for code based on Llama 2 providing state-of-the-art performance among open models, infilling capabilities, support for large input contexts, and zero-shot instruction following ability for programming tasks.

Long-form factuality in large language models

google-deepmind/long-form-factuality 27 Mar 2024

Empirically, we demonstrate that LLM agents can outperform crowdsourced human annotators: on a set of ~16k individual facts, SAFE agrees with crowdsourced human annotators 72% of the time, and on a random subset of 100 disagreement cases, SAFE wins 76% of the time.

Visual Semantic Role Labeling

s-gupta/v-coco 17 May 2015

In this paper we introduce the problem of Visual Semantic Role Labeling: given an image we want to detect people doing actions and localize the objects of interaction.

Deep Learning for Hate Speech Detection in Tweets

pinkeshbadjatiya/twitter-hatespeech 1 Jun 2017

Hate speech detection on Twitter is critical for applications like controversial event extraction, building AI chatterbots, content recommendation, and sentiment analysis.