Multimodal Sentiment Analysis
73 papers with code • 5 benchmarks • 7 datasets
Multimodal sentiment analysis is the task of performing sentiment analysis with multiple data sources, e.g. a camera feed of someone's face together with their recorded speech.
(Image credit: ICON: Interactive Conversational Memory Network for Multimodal Emotion Detection)
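For a concrete sense of the setup, here is a minimal late-fusion sketch over pre-extracted text, audio, and visual features. The module names, feature dimensions, and three-class output are illustrative assumptions, not taken from any specific paper below.

```python
# Minimal late-fusion sketch (assumption: features are pre-extracted per utterance;
# dimensions and module names are illustrative, not from any specific paper).
import torch
import torch.nn as nn

class LateFusionSentiment(nn.Module):
    def __init__(self, text_dim=300, audio_dim=74, visual_dim=35, num_classes=3):
        super().__init__()
        # One small encoder per modality.
        self.text_enc = nn.Sequential(nn.Linear(text_dim, 64), nn.ReLU())
        self.audio_enc = nn.Sequential(nn.Linear(audio_dim, 64), nn.ReLU())
        self.visual_enc = nn.Sequential(nn.Linear(visual_dim, 64), nn.ReLU())
        # Fuse by concatenation, then classify sentiment.
        self.classifier = nn.Linear(64 * 3, num_classes)

    def forward(self, text, audio, visual):
        fused = torch.cat(
            [self.text_enc(text), self.audio_enc(audio), self.visual_enc(visual)], dim=-1
        )
        return self.classifier(fused)

# Example: a batch of 8 utterances with pre-extracted features.
logits = LateFusionSentiment()(torch.randn(8, 300), torch.randn(8, 74), torch.randn(8, 35))
print(logits.shape)  # torch.Size([8, 3])
```

Most of the papers listed below replace this simple concatenation with richer fusion mechanisms (tensor fusion, gating, crossmodal attention).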
Libraries
Use these libraries to find Multimodal Sentiment Analysis models and implementations.
Most implemented papers
Multimodal Speech Emotion Recognition Using Audio and Text
Speech emotion recognition is a challenging task, and extensive reliance has been placed on models that use audio features in building well-performing classifiers.
Words Can Shift: Dynamically Adjusting Word Representations Using Nonverbal Behaviors
Humans convey their intentions through the usage of both verbal and nonverbal behaviors during face-to-face communication.
Multimodal Transformer for Unaligned Multimodal Language Sequences
Human language is often multimodal, comprising a mixture of natural language, facial gestures, and acoustic behaviors.
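The crossmodal attention used for unaligned sequences can be sketched with a standard attention layer in which one modality provides the queries and another the keys and values; the dimensions and sequence lengths below are assumptions for illustration, not the paper's configuration.

```python
# Sketch of crossmodal attention in the spirit of the Multimodal Transformer (MulT):
# one modality queries another, so the sequences need not be word-aligned.
import torch
import torch.nn as nn

d_model = 40
text_len, audio_len, batch = 50, 375, 8   # unaligned sequence lengths

cross_attn = nn.MultiheadAttention(embed_dim=d_model, num_heads=5, batch_first=True)

text = torch.randn(batch, text_len, d_model)    # queries: text time steps
audio = torch.randn(batch, audio_len, d_model)  # keys/values: audio time steps

# Each text position attends over the whole (longer, unaligned) audio sequence.
audio_to_text, _ = cross_attn(query=text, key=audio, value=audio)
print(audio_to_text.shape)  # torch.Size([8, 50, 40])
```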
Efficient Low-rank Multimodal Fusion with Modality-Specific Factors
Previous research in this field has exploited the expressiveness of tensors for multimodal representation.
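A hedged sketch of the general low-rank fusion idea: each modality gets its own low-rank factors, and the fused vector is a sum over rank of their element-wise products. The dimensions, rank, and bias-augmentation trick below follow the common tensor-fusion formulation and are assumptions, not the authors' code.

```python
# Low-rank multimodal fusion sketch: modality-specific factors are multiplied
# element-wise and summed over the rank dimension.
import torch
import torch.nn as nn

class LowRankFusion(nn.Module):
    def __init__(self, dims=(64, 32, 16), out_dim=32, rank=4):
        super().__init__()
        # One factor tensor per modality: (rank, in_dim + 1, out_dim).
        self.factors = nn.ParameterList(
            [nn.Parameter(torch.randn(rank, d + 1, out_dim) * 0.1) for d in dims]
        )

    def forward(self, *modalities):
        fused = None
        for h, factor in zip(modalities, self.factors):
            # Append a constant 1 so lower-order interactions are retained.
            h1 = torch.cat([h, torch.ones_like(h[:, :1])], dim=-1)
            proj = torch.einsum("bd,rdo->rbo", h1, factor)   # per-rank projection
            fused = proj if fused is None else fused * proj  # multiply across modalities
        return fused.sum(dim=0)  # sum over rank -> (batch, out_dim)

fusion = LowRankFusion()
z = fusion(torch.randn(8, 64), torch.randn(8, 32), torch.randn(8, 16))
print(z.shape)  # torch.Size([8, 32])
```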
Complementary Fusion of Multi-Features and Multi-Modalities in Sentiment Analysis
In this paper, based on audio and text, we consider the task of multimodal sentiment analysis and propose a novel fusion strategy that combines multi-feature fusion and multi-modality fusion to improve the accuracy of audio-text sentiment analysis.
M-SENA: An Integrated Platform for Multimodal Sentiment Analysis
The platform features a fully modular video sentiment analysis framework consisting of data management, feature extraction, model training, and result analysis modules.
Context-Dependent Sentiment Analysis in User-Generated Videos
Multimodal sentiment analysis is a developing area of research, which involves the identification of sentiments in videos.
Multimodal Sentiment Analysis with Word-Level Fusion and Reinforcement Learning
In this paper, we propose the Gated Multimodal Embedding LSTM with Temporal Attention (GME-LSTM(A)) model, which is composed of two modules.
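The gating idea can be illustrated with a small module that learns, per word, how much of the noisy audio/visual signal to pass through before fusion; this is a rough sketch under assumed names and dimensions, not the GME-LSTM(A) implementation.

```python
# Word-level gated multimodal embedding sketch: a learned scalar gate attenuates
# the nonverbal (audio/visual) features before they are fused with the word embedding.
import torch
import torch.nn as nn

class GatedMultimodalEmbedding(nn.Module):
    def __init__(self, text_dim=300, av_dim=80, out_dim=128):
        super().__init__()
        self.gate = nn.Linear(text_dim + av_dim, 1)       # one gate per word
        self.proj = nn.Linear(text_dim + av_dim, out_dim)

    def forward(self, word_emb, av_feat):
        joint = torch.cat([word_emb, av_feat], dim=-1)
        g = torch.sigmoid(self.gate(joint))               # near 0 = drop noisy nonverbal signal
        gated = torch.cat([word_emb, g * av_feat], dim=-1)
        return self.proj(gated)

emb = GatedMultimodalEmbedding()
out = emb(torch.randn(8, 20, 300), torch.randn(8, 20, 80))  # (batch, words, dim)
print(out.shape)  # torch.Size([8, 20, 128])
```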
Multi-attention Recurrent Network for Human Communication Comprehension
AI must understand each modality and the interactions between them that shape human communication.
Multimodal Sentiment Analysis To Explore the Structure of Emotions
We propose a novel approach to multimodal sentiment analysis using deep neural networks combining visual analysis and natural language processing.