Text Classification
1104 papers with code • 93 benchmarks • 148 datasets
Text Classification is the task of assigning a sentence or document an appropriate category. The categories depend on the chosen dataset and can range from broad topics to fine-grained labels such as sentiment or intent.
Text Classification problems include emotion classification, news classification, and citation intent classification, among others. Benchmark datasets for evaluating text classification capabilities include GLUE and AG News, among others.
In recent years, pretrained language models such as XLNet and RoBERTa have achieved some of the largest performance gains on text classification problems.
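To make the task concrete, here is a minimal from-scratch sketch of a text classifier. It uses a multinomial Naive Bayes model with add-one smoothing rather than the pretrained transformers mentioned above, and the tiny training set and labels are invented for illustration; a real system would use a library or a pretrained model.

```python
from collections import Counter
import math

# Toy multinomial Naive Bayes text classifier (illustrative only).
# The documents and labels below are made up for this example.

def train(docs):
    """docs: list of (text, label) pairs.
    Returns per-class word counts and per-class document counts."""
    word_counts = {}
    class_counts = Counter()
    for text, label in docs:
        class_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        counts.update(text.lower().split())
    return word_counts, class_counts

def predict(text, word_counts, class_counts):
    """Return the label maximizing log P(label) + sum log P(word | label)."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, counts in word_counts.items():
        # log prior plus log likelihood with add-one (Laplace) smoothing
        score = math.log(class_counts[label] / total_docs)
        denom = sum(counts.values()) + len(vocab)
        for w in text.lower().split():
            score += math.log((counts[w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

train_docs = [
    ("the team won the match", "sports"),
    ("great goal in the final game", "sports"),
    ("stocks fell sharply today", "finance"),
    ("the market rallied on earnings", "finance"),
]
wc, cc = train(train_docs)
print(predict("the final match was great", wc, cc))  # -> sports
```

This bag-of-words baseline ignores word order, which is precisely the weakness that the sequence models and pretrained transformers listed below address.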
(Image credit: Text Classification Algorithms: A Survey)
Libraries
Use these libraries to find Text Classification models and implementations.
Subtasks
- Topic Models
- Document Classification
- Sentence Classification
- Emotion Classification
- Multi-Label Text Classification
- Few-Shot Text Classification
- Text Categorization
- Semi-Supervised Text Classification
- Coherence Evaluation
- Toxic Comment Classification
- Citation Intent Classification
- Cross-Domain Text Classification
- Unsupervised Text Classification
- Satire Detection
- Hierarchical Text Classification of Blurbs (GermEval 2019)
- Variable Detection
Most implemented papers
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers.
Semi-supervised Sequence Learning
In our experiments, we find that long short-term memory recurrent networks, after being pretrained with the two approaches, are more stable and generalize better.
Universal Language Model Fine-tuning for Text Classification
Inductive transfer learning has greatly impacted computer vision, but existing approaches in NLP still require task-specific modifications and training from scratch.
Bag of Tricks for Efficient Text Classification
This paper explores a simple and efficient baseline for text classification.
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging.
FastText.zip: Compressing text classification models
We consider the problem of producing compact architectures for text classification, such that the full model fits in a limited amount of memory.
Character-level Convolutional Networks for Text Classification
This article offers an empirical exploration on the use of character-level convolutional networks (ConvNets) for text classification.
Distributed Representations of Sentences and Documents
Its construction gives our algorithm the potential to overcome the weaknesses of bag-of-words models.
Revisiting Semi-Supervised Learning with Graph Embeddings
We present a semi-supervised learning framework based on graph embeddings.
Very Deep Convolutional Networks for Text Classification
The dominant approaches for many NLP tasks are recurrent neural networks, in particular LSTMs, and convolutional neural networks.