16k
55 papers with code • 0 benchmarks • 0 datasets
Benchmarks
These leaderboards are used to track progress in 16k
Most implemented papers
Efficiently Modeling Long Sequences with Structured State Spaces
A central goal of sequence modeling is designing a single principled model that can address sequence data across a range of modalities and tasks, particularly on long-range dependencies.
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
We also extend FlashAttention to block-sparse attention, yielding an approximate attention algorithm that is faster than any existing approximate attention method.
Long Range Arena: A Benchmark for Efficient Transformers
In the recent months, a wide spectrum of efficient, fast Transformers have been proposed to tackle this problem, more often than not claiming superior or comparable model quality to vanilla Transformer models.
Towards Scalable Multi-domain Conversational Agents: The Schema-Guided Dialogue Dataset
In this work, we introduce the the Schema-Guided Dialogue (SGD) dataset, containing over 16k multi-domain conversations spanning 16 domains.
Fighting the COVID-19 Infodemic: Modeling the Perspective of Journalists, Fact-Checkers, Social Media Platforms, Policy Makers, and the Society
With the emergence of the COVID-19 pandemic, the political and the medical aspects of disinformation merged as the problem got elevated to a whole new level to become the first global infodemic.
An In-Depth Exploration of Person Re-Identification and Gait Recognition in Cloth-Changing Conditions
For the cloth-changing problem, video-based ReID is rarely studied due to the lack of a suitable cloth-changing benchmark, and gait recognition is often researched under controlled conditions.
Code Llama: Open Foundation Models for Code
We release Code Llama, a family of large language models for code based on Llama 2 providing state-of-the-art performance among open models, infilling capabilities, support for large input contexts, and zero-shot instruction following ability for programming tasks.
Long-form factuality in large language models
Empirically, we demonstrate that LLM agents can outperform crowdsourced human annotators - on a set of ~16k individual facts, SAFE agrees with crowdsourced human annotators 72% of the time, and on a random subset of 100 disagreement cases, SAFE wins 76% of the time.
Visual Semantic Role Labeling
In this paper we introduce the problem of Visual Semantic Role Labeling: given an image we want to detect people doing actions and localize the objects of interaction.
Deep Learning for Hate Speech Detection in Tweets
Hate speech detection on Twitter is critical for applications like controversial event extraction, building AI chatterbots, content recommendation, and sentiment analysis.