Sign Language Recognition
67 papers with code • 10 benchmarks • 19 datasets
Sign Language Recognition is a computer vision and natural language processing task in which sign language gestures are automatically recognized and translated into written or spoken language. The goal is to develop algorithms that understand and interpret sign language, enabling people who use it as their primary mode of communication to communicate more easily with non-signers.
(Image credit: Word-level Deep Sign Language Recognition from Video: A New Large-scale Dataset and Methods Comparison)
Most implemented papers
Learning to Estimate 3D Hand Pose from Single RGB Images
Low-cost consumer depth cameras and deep learning have enabled reasonable 3D hand pose estimation from single depth images.
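The snippet below is not the paper's network; as a rough illustration of RGB-only hand pose estimation, MediaPipe Hands recovers 21 landmarks per hand (normalized x/y plus a relative depth z) from a single image. The image path is a placeholder.

```python
# Minimal sketch: hand landmarks from a single RGB image with MediaPipe Hands.
# Illustrates RGB-only hand pose estimation in general, not the paper's model.
import cv2
import mediapipe as mp

image_bgr = cv2.imread("hand.jpg")                      # placeholder path
image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)  # MediaPipe expects RGB

with mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=2) as hands:
    result = hands.process(image_rgb)

if result.multi_hand_landmarks:
    for hand in result.multi_hand_landmarks:
        # 21 landmarks per detected hand, each with normalized x/y and relative z.
        for lm in hand.landmark:
            print(lm.x, lm.y, lm.z)
```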
BlazePose: On-device Real-time Body Pose tracking
We present BlazePose, a lightweight convolutional neural network architecture for human pose estimation that is tailored for real-time inference on mobile devices.
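MediaPipe Pose ships a BlazePose-based model (33 body landmarks), so a minimal real-time tracking loop looks roughly like the sketch below; the webcam index and drawing calls are illustrative.

```python
# Sketch: real-time body pose tracking with MediaPipe Pose (BlazePose model).
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose
mp_draw = mp.solutions.drawing_utils

cap = cv2.VideoCapture(0)  # default webcam
with mp_pose.Pose(model_complexity=1, min_detection_confidence=0.5) as pose:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        result = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if result.pose_landmarks:
            # Overlay the 33 BlazePose landmarks and their skeleton connections.
            mp_draw.draw_landmarks(frame, result.pose_landmarks,
                                   mp_pose.POSE_CONNECTIONS)
        cv2.imshow("BlazePose", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
cap.release()
cv2.destroyAllWindows()
```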
A Simple Multi-Modality Transfer Learning Baseline for Sign Language Translation
Concretely, we pretrain the sign-to-gloss visual network on the general domain of human actions and the within-domain of a sign-to-gloss dataset, and pretrain the gloss-to-text translation network on the general domain of a multilingual corpus and the within-domain of a gloss-to-text corpus.
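A rough sketch of how the two pretrained networks chain together at inference time; every module, dimension, and the greedy gloss decoding below are placeholders, not the authors' code.

```python
# Illustrative two-stage baseline: a visual network maps sign video to gloss
# logits, a translation network maps gloss tokens to text. All names, shapes,
# and the greedy decode are placeholders, not the authors' implementation.
import torch
import torch.nn as nn

class SignToGloss(nn.Module):
    """Video encoder + gloss classifier (pretrained on human actions and a
    sign-to-gloss dataset in the paper's setup)."""
    def __init__(self, feat_dim=512, n_gloss=1000):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(1024, feat_dim), nn.ReLU())
        self.gloss_head = nn.Linear(feat_dim, n_gloss)

    def forward(self, video_feats):            # (batch, time, 1024)
        return self.gloss_head(self.encoder(video_feats))   # per-frame gloss logits

class GlossToText(nn.Module):
    """Gloss-to-text translator (pretrained on a multilingual corpus and
    gloss-text pairs in the paper's setup); a vanilla Transformer stands in."""
    def __init__(self, n_gloss=1000, n_words=8000, d_model=256):
        super().__init__()
        self.gloss_emb = nn.Embedding(n_gloss, d_model)
        self.word_emb = nn.Embedding(n_words, d_model)
        self.seq2seq = nn.Transformer(d_model=d_model, batch_first=True)
        self.out = nn.Linear(d_model, n_words)

    def forward(self, gloss_ids, word_ids):
        memory_in = self.gloss_emb(gloss_ids)
        target_in = self.word_emb(word_ids)
        return self.out(self.seq2seq(memory_in, target_in))

# Chain the stages on dummy data.
visual, translator = SignToGloss(), GlossToText()
gloss_logits = visual(torch.randn(2, 64, 1024))          # 2 clips, 64 frames
gloss_ids = gloss_logits.argmax(-1)                      # greedy gloss decode
word_logits = translator(gloss_ids, torch.zeros(2, 10, dtype=torch.long))
```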
Skeleton Aware Multi-modal Sign Language Recognition
Sign language is commonly used by deaf or speech-impaired people to communicate, but it requires significant effort to master.
Continuous Sign Language Recognition with Correlation Network
Visualizations demonstrate the effects of CorrNet on emphasizing human body trajectories across adjacent frames.
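As a toy illustration of the underlying idea, correlating each frame's features with those of the previous frame highlights the regions that move between frames; this is not the CorrNet implementation.

```python
# Toy illustration of adjacent-frame correlation for highlighting motion,
# the intuition behind emphasizing body trajectories; not the CorrNet code.
import torch
import torch.nn.functional as F

feats = torch.randn(1, 64, 16, 28, 28)    # (batch, channels, time, H, W)
cur = feats[:, :, 1:]                      # frames t = 1..T-1
prev = feats[:, :, :-1]                    # frames t = 0..T-2

# Cosine similarity over the channel dimension: how well each spatial location
# matches the same location in the previous frame.
corr = F.cosine_similarity(cur, prev, dim=1)     # (batch, T-1, H, W)

# Low similarity (large change) marks candidate trajectory regions.
motion_saliency = 1.0 - corr
print(motion_saliency.shape)
```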
SubUNets: End-To-End Hand Shape and Continuous Sign Language Recognition
We propose a novel deep learning approach to solve simultaneous alignment and recognition problems (referred to as "Sequence-to-sequence" learning).
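CTC loss is a standard way to train this kind of alignment-free sequence recognition, where frame-level features are paired with unaligned label sequences; the sketch below uses placeholder shapes and is not the SubUNets code.

```python
# Minimal sketch of alignment-free sequence training with CTC loss.
# Shapes, vocabulary size, and the BLSTM are placeholders, not SubUNets.
import torch
import torch.nn as nn

T, B, C = 100, 4, 60                   # frames, batch size, classes (blank = 0)
frame_feats = torch.randn(T, B, 512)   # per-frame visual features

rnn = nn.LSTM(512, 256, bidirectional=True)
classifier = nn.Linear(512, C)
ctc = nn.CTCLoss(blank=0)

hidden, _ = rnn(frame_feats)
log_probs = classifier(hidden).log_softmax(-1)     # (T, B, C)

targets = torch.randint(1, C, (B, 12))             # unaligned label sequences
input_lengths = torch.full((B,), T, dtype=torch.long)
target_lengths = torch.full((B,), 12, dtype=torch.long)

# CTC marginalizes over all alignments between frames and labels.
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()
```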
Fingerspelling recognition in the wild with iterative visual attention
In this paper we focus on recognition of fingerspelling sequences in American Sign Language (ASL) videos collected in the wild, mainly from YouTube and Deaf social media.
Word-level Deep Sign Language Recognition from Video: A New Large-scale Dataset and Methods Comparison
Based on this new large-scale dataset, we are able to experiment with several deep learning methods for word-level sign recognition and evaluate their performances in large scale scenarios.
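For a sense of what a word-level baseline looks like, the sketch below classifies short clips with torchvision's r3d_18, standing in for the paper's 3D-CNN baselines; the 2,000-class head mirrors the largest WLASL split, but everything else is illustrative.

```python
# Sketch of a word-level sign classifier: a 3D CNN over short clips with the
# final layer resized to the sign vocabulary. Illustrative stand-in only.
import torch
import torch.nn as nn
from torchvision.models.video import r3d_18

num_signs = 2000
model = r3d_18()                                  # Kinetics weights can be loaded if desired
model.fc = nn.Linear(model.fc.in_features, num_signs)

clip = torch.randn(2, 3, 32, 112, 112)            # (batch, channels, frames, H, W)
logits = model(clip)                              # (2, 2000) sign-word scores
pred = logits.argmax(-1)
```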
TSPNet: Hierarchical Feature Learning via Temporal Semantic Pyramid for Sign Language Translation
Sign language translation (SLT) aims to interpret sign video sequences into text-based natural language sentences.
Context Matters: Self-Attention for Sign Language Recognition
For that reason, we apply attention to synchronize and help capture entangled dependencies between the different sign language components.
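Below is a minimal sketch of self-attention over a sequence of per-frame sign features (e.g., pooled hand and body cues), assuming a generic Transformer encoder rather than the paper's exact architecture.

```python
# Minimal sketch: self-attention over per-frame sign features; each frame
# attends to every other frame, so distant but related parts of a sign can
# inform one another. Illustrative only, not the paper's model.
import torch
import torch.nn as nn

frames = torch.randn(4, 120, 256)      # (batch, time, feature dim)

encoder_layer = nn.TransformerEncoderLayer(d_model=256, nhead=8, batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)

contextual = encoder(frames)           # (4, 120, 256) context-aware features

classifier = nn.Linear(256, 100)       # 100 sign classes, placeholder
logits = classifier(contextual.mean(dim=1))
```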