Text Classification
Text classification is a core task in natural language processing: assigning a piece of text to one of a set of predefined categories. A classifier analyzes the content of the text and identifies features such as topic, sentiment, or intent, which in turn supports efficient organization and retrieval of information. In recent years, deep learning models such as XLNet and RoBERTa have significantly improved text classification performance. Benchmarks such as GLUE and AG News are widely used to evaluate these models.
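To make the task definition concrete, here is a minimal sketch of a text classifier: a multinomial Naive Bayes model over word counts with add-one smoothing, written with only the Python standard library. It is not any of the benchmarked systems listed on this page. The class name `NaiveBayesClassifier`, the whitespace tokenizer, and the four toy training sentences are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


class NaiveBayesClassifier:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts, labels):
        # Number of training documents per label (used for the prior).
        self.label_counts = Counter(labels)
        # Per-label word frequencies.
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        # Vocabulary across all labels, needed for smoothing.
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Start from the log prior, then add log likelihoods per word.
            score = math.log(n_docs / total_docs)
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                score += math.log(
                    (counts[word] + 1) / (total_words + len(self.vocab))
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data, invented for this example only.
train_texts = [
    "the team won the match",
    "a late goal decided the game",
    "the new phone has a faster chip",
    "the laptop ships with more memory",
]
train_labels = ["sports", "sports", "tech", "tech"]

clf = NaiveBayesClassifier().fit(train_texts, train_labels)
prediction = clf.predict("the phone has a new chip")
print(prediction)  # prints "tech"
```

The predefined categories here are simply the distinct labels seen during training. The models named on this page replace the word-count features with learned representations, but the input and output of the task stay the same: text in, one category out.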
MTEB
ST5-XXL
AG News
DBpedia
XLNet
R8
RoBERTaGCN
TREC-6
Automatic Label Error Correction
20NEWS
RoBERTaGCN
UK Key Stage Readability
MR
Ohsumed
SGCN
Yahoo! Answers
BERT-ITPT-FiT
NewsDiscourse
R52
GraphStar
Yelp-5
HAHNN (CNN)
DODF Data
ULMFiT (pre-trained vocab, no gradual unfreezing)
Lot-insts
Character-BERT+RS
MVICTOR (type)
OneStopEnglish (Readability Assessment)
RoBERTa-RF-T1 hybrid
SVICTOR (type)
WeeBit (Readability Assessment)
BART-RF-T1 hybrid
Yelp-2
Amazon-2
arXiv-10
Protoformer
HateXplain
RCV1
NLP-Cap
ThreatGram 101 - Extreme Telegram Data
GPT-2
Amazon-5
BLURB
BioLinkBERT (large)
IMDb Movie Reviews
Logistic Regression
Overruling
Custom Legal-BERT
Sogou News
BERT-ITPT-FiT
Terms of Service
Twitter
An Amharic News Text classification Dataset
Naive Bayes using Tf-idf features
GLUE SST2
MuLD (Character Type)
Searchsnippets
Social media attributions of YouTube comments
SST-2
This is not a Dataset
TREC-50
20 Newsgroups
RoBERTaGCN
Adverse Drug Events (ADE) Corpus
AffCon 2020 Emotion Detection
Arxiv HEP-TH citation graph
BigBird
BANKING77
Facebook Media
FMC-MWO2KG
Flair
GLUE MRPC
GLUE RTE
Hyperpartisan News Detection
BigBird
Hyperpartisan
NICE-2
NICE-45
Patents
BigBird
RusAge: Corpus for Age-Based Text Classification
LSVC + linguistic features + publishing attributes
SILICONE Benchmark
STOPS-2
ERNIE 2.0
STOPS-41
TRAC2-Benghali. Task 2.
BERT
TRAC2-English. Task2.
TREC-10
BERT
Twitter Sentiment Analysis
Logistic Regression
Twitter-US
WNUT-2020 Task 2
NutCracker
ade_corpus_v2 (Ade_corpus_v2_classification)
amazon_reviews_multi
book-text-classifier
catalonia_independence
clinc_oos
emotion
financial_phrasebank
GLUE
GLUE COLA
GLUE QQP
GLUE STSB
hate_speech18
IMDb
KLUE
MNIST
New_York_Times_Topics
NSFW-Safe-Dataset
SemEval 2014 Task 4 (Restaurants)
SST2
tecla