<aside>
☝🏼 Here are some of the leading research efforts relevant to the problems Graphling is tackling, including pose estimation, object recognition, and related tasks.
</aside>
📝Relevant Papers
💽Relevant Datasets/Github Repos/Models
🤼‍♀️BJJ Specific Datasets
- https://vicos.si/resources/jiujitsu/
    - Dataset for training jiu-jitsu position classification methods: 120,279 labeled images of two jiu-jitsu athletes sparring in different combat positions. Athlete poses were detected automatically, then manually verified, with the best detections selected; even so, the correctness of the poses is not guaranteed. The combat positions were labeled manually. The dataset covers 10 positions, resulting in 18 classes.
- MVGFormer (Multiple View Geometry Transformers for 3D Human Pose Estimation)
![Untitled](<https://prod-files-secure.s3.us-west-2.amazonaws.com/d44d561a-392b-400e-b470-929c7ad30108/8c10d45f-8cf7-4ced-81ae-d87677d2e354/Untitled.png>)
[GitHub - XunshanMan/MVGFormer: This is the official implementation of the work presented at CVPR 2024, titled Multiple View Geometry Transformers for 3D Human Pose Estimation (MVGFormer).](<https://github.com/XunshanMan/MVGFormer>)
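MVGFormer builds on multi-view geometry: 2D joint detections from several calibrated cameras are lifted to 3D. The core geometric operation, triangulating a 3D point from its 2D projections, can be sketched with the classic Direct Linear Transform (this is an illustrative sketch with toy cameras, not the model's actual code):

```python
import numpy as np

def triangulate_dlt(proj_mats, points_2d):
    """Triangulate one 3D point from its 2D projections in N calibrated
    views using the Direct Linear Transform (least squares via SVD)."""
    A = []
    for P, (x, y) in zip(proj_mats, points_2d):
        A.append(x * P[2] - P[0])
        A.append(y * P[2] - P[1])
    _, _, vt = np.linalg.svd(np.asarray(A))
    X = vt[-1]
    return X[:3] / X[3]  # dehomogenize

def project(P, X):
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Two toy cameras: identity view and a second camera shifted along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, -0.2, 4.0])

pts = [project(P1, X_true), project(P2, X_true)]
X_est = triangulate_dlt([P1, P2], pts)
print(np.round(X_est, 3))  # recovers [0.5, -0.2, 4.0]
```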
- ViTPose (Simple Vision Transformer Baselines for Human Pose Estimation)
    
    [GitHub - ViTAE-Transformer/ViTPose: The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation" and [TPAMI'23] "ViTPose++: Vision Transformer for Generic Body Pose Estimation"](<https://github.com/ViTAE-Transformer/ViTPose>)
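ViTPose is a top-down, heatmap-based estimator: for each keypoint the network outputs a heatmap over the person crop, and coordinates are recovered by a per-channel argmax rescaled to crop resolution. A minimal numpy sketch of that decoding step (not the repo's actual decoder, which also applies sub-pixel refinement):

```python
import numpy as np

def decode_heatmaps(heatmaps, crop_size):
    """Decode (K, H, W) keypoint heatmaps to (K, 3) [x, y, score] in crop pixels."""
    K, H, W = heatmaps.shape
    crop_h, crop_w = crop_size
    out = np.zeros((K, 3))
    for k in range(K):
        idx = np.argmax(heatmaps[k])        # flat index of the channel's peak
        y, x = divmod(idx, W)
        out[k] = [x * crop_w / W, y * crop_h / H, heatmaps[k, y, x]]
    return out

# Toy heatmap: one keypoint with a peak at (x=12, y=5) on a 48x64 grid.
hm = np.zeros((1, 64, 48))
hm[0, 5, 12] = 0.9
kpts = decode_heatmaps(hm, crop_size=(256, 192))  # typical 256x192 input crop
print(kpts)  # x = 12*192/48 = 48, y = 5*256/64 = 20, score = 0.9
```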
- COCO (https://paperswithcode.com/dataset/coco)
- Large-scale object detection, segmentation, and captioning dataset. It contains over 330,000 images, including more than 200,000 labeled with key points for human pose estimation. The dataset is widely used for training and benchmarking pose estimation models due to its diverse and challenging set of images, capturing a variety of scenes, objects, and human activities.
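COCO stores each person's keypoints as a flat list of 17 [x, y, v] triplets, where v is a visibility flag (0 = not labeled, 1 = labeled but occluded, 2 = visible). Unpacking one annotation looks like this (toy values; real annotations come from the person_keypoints_*.json files, e.g. via pycocotools):

```python
import numpy as np

# A COCO-style person annotation with toy values.
ann = {
    "num_keypoints": 2,
    # 17 triplets of [x, y, visibility]; all but two are unlabeled here.
    "keypoints": [0, 0, 0] * 15 + [120, 80, 2] + [130, 95, 1],
}

kpts = np.asarray(ann["keypoints"], dtype=float).reshape(17, 3)
labeled = kpts[kpts[:, 2] > 0]          # v > 0: keypoint was annotated
fully_visible = kpts[kpts[:, 2] == 2]   # v == 2: annotated and visible

print(len(labeled), len(fully_visible))  # 2 1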
- AIC (https://paperswithcode.com/dataset/aic)
- Dataset from the AI Challenger competition, specifically focused on human keypoint detection. It includes a vast number of images with annotated keypoints for different human poses. The dataset aims to advance the research and development in human pose estimation by providing a comprehensive set of images in various environments and conditions.
- CrowdPose (https://paperswithcode.com/dataset/crowdpose)
- Benchmark dataset designed to address the challenges of human pose estimation in crowded scenes. It consists of images with multiple people in close proximity, providing annotations for keypoints that are often occluded or overlapped. This dataset is particularly useful for developing and evaluating pose estimation models that need to perform well in densely populated scenarios.
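CrowdPose (like COCO) scores predictions with Object Keypoint Similarity (OKS), a keypoint analogue of IoU: per-keypoint squared distances pass through a Gaussian scaled by object area and a per-keypoint constant, then are averaged over labeled keypoints. A minimal sketch (using a single illustrative falloff constant k; the official evaluation uses per-joint sigmas):

```python
import numpy as np

def oks(pred, gt, vis, area, k=0.1):
    """Object Keypoint Similarity between predicted and ground-truth keypoints.

    pred, gt: (K, 2) keypoint coordinates; vis: (K,) visibility flags
    (only v > 0 keypoints count); area: object segment area.
    """
    d2 = np.sum((pred - gt) ** 2, axis=1)
    e = np.exp(-d2 / (2 * area * k ** 2))
    mask = vis > 0
    return float(e[mask].mean())

gt = np.array([[100.0, 100.0], [150.0, 120.0]])
vis = np.array([2, 2])
print(oks(gt, gt, vis, area=5000.0))         # perfect match -> 1.0
print(oks(gt + 50.0, gt, vis, area=5000.0))  # large error -> near 0
```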
- MPII (http://human-pose.mpi-inf.mpg.de/#)
- Comprehensive dataset for human pose estimation, featuring around 25,000 images with over 40,000 annotated individuals. The annotations include detailed information about body joints and activities. MPII is known for its high-quality annotations and is widely used for training and testing pose estimation algorithms.
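MPII's standard metric is PCKh@0.5: a predicted joint counts as correct if it lies within 0.5 × the ground-truth head segment length of the true joint. A minimal sketch with toy joints:

```python
import numpy as np

def pckh(pred, gt, head_len, thresh=0.5):
    """Fraction of joints within thresh * head segment length of ground truth."""
    dists = np.linalg.norm(pred - gt, axis=1)
    return float(np.mean(dists <= thresh * head_len))

gt = np.array([[10.0, 10.0], [50.0, 60.0], [90.0, 40.0]])
# Small errors on the first two joints; the last one is far off.
pred = gt + np.array([[2.0, 0.0], [0.0, 3.0], [40.0, 0.0]])
print(pckh(pred, gt, head_len=20.0))  # 2 of 3 joints within 10 px -> 0.666...
```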
- OCHuman (https://github.com/liruilong940607/OCHumanApi)
- Dataset focused on heavily occluded human poses. It contains images with annotations for human keypoints where individuals are often partially visible or overlapping with others. This dataset is designed to push the boundaries of pose estimation models to accurately predict keypoints even in challenging conditions of occlusion and crowding.
- Human3.6M (https://paperswithcode.com/dataset/human3-6m)
    - One of the largest motion capture datasets, consisting of 3.6 million human poses and corresponding images captured by a high-speed motion capture system, with 4 high-resolution progressive-scan cameras acquiring video at 50 Hz. The dataset covers activities performed by 11 professional actors in 17 scenarios (discussion, smoking, taking photos, talking on the phone, etc.) and provides accurate 3D joint positions alongside high-resolution videos.
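Evaluation on Human3.6M typically reports Mean Per-Joint Position Error (MPJPE): the average Euclidean distance, usually in millimetres, between predicted and ground-truth 3D joints, commonly after aligning both skeletons at a root joint. A minimal sketch:

```python
import numpy as np

def mpjpe(pred, gt, root=0):
    """Mean per-joint position error after aligning both skeletons at a root joint.

    pred, gt: (J, 3) arrays of 3D joint positions (e.g. in millimetres).
    """
    pred_a = pred - pred[root]
    gt_a = gt - gt[root]
    return float(np.mean(np.linalg.norm(pred_a - gt_a, axis=1)))

gt = np.array([[0.0, 0.0, 0.0], [0.0, 300.0, 0.0], [0.0, 600.0, 100.0]])
pred = gt + np.array([[0.0, 0.0, 0.0], [30.0, 0.0, 0.0], [0.0, 40.0, 0.0]])
print(mpjpe(pred, gt))  # (0 + 30 + 40) / 3 = 23.33... mm
```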