Action Recognition
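Action recognition is a computer vision task that aims to identify and classify human actions from videos or images. The goal is to assign the action performed in a video or image to one of a set of predefined action categories, enabling accurate action detection and understanding. The task is valuable for applications such as video surveillance, human-computer interaction, and sports analysis. However, because large-scale video datasets are difficult to build, most existing action recognition benchmarks are small, typically containing only about 10k videos.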
Something-Something V2: MSNet-R50En (8+16 ensemble, ImageNet pretrained)
UCF101: ResNet50
HMDB-51: VideoMAE V2-g
Something-Something V1: InternVideo
AVA v2.2: VideoMAE (K700 pretrain+finetune, ViT-L, 16x4)
EPIC-KITCHENS-100: Avion (ViT-L)
NTU RGB+D: PoseC3D (RGB + Pose)
NTU RGB+D 120: PoseC3D (RGB + Pose)
Diving-48
ActivityNet: Text4Vis (w/ ViT-L)
AVA v2.1
H2O (2 Hands and Objects): HandFormer-B/21x8
THUMOS’14: BMN
Sports-1M: ip-CSN-152 (RGB)
HACS: UniFormerV2-L
Charades-Ego: LaViLa (Finetuned, TimeSformer-L)
Volleyball: PoseC3D (Pose Only)
BAR
HAA500
UAV-Human: PMI Sampler
Animal Kingdom
Jester (Gesture Recognition): DirecFormer
RareAct
Real Life Violence Situations Dataset: DeVTr
ICVL-4
IRD
miniSports
UCF-101: R3D-18
Drone-Action
Mimetics: JMRN
Okutama-Action
Penn Action
SL-Animals: SEW-Resnet18 (3sets)
ActionNet-VE
Charades
EgoGesture
EPIC-KITCHENS-55
HMDB51: MSQNet
UTD-MHAD
VIRAT Ground 2.0
DVS128 Gesture
Hockey
IndustReal
KTH: CNN-GRU
MECCANO: SlowFast
MTL-AQA: C3D-AVG
N-UCLA: DVANet
NEC Drone
RoCoG-v2
Skeleton-Mimetics
THUMOS14
UAV Human: FAR
UCF 101: R2+1D-BERT
UCFSports
Win-Fail Action Understanding: 2DCNN+TRN