7-8 Pose Estimation

Learning Objectives

Pose estimation is a computer vision task in which a model works out the pose of a person or object, that is, where its joints are and how they connect and move, rather than only where it is in the image.

For example, in human pose estimation, the model identifies the following in an image or video:

1. Keypoints

Named landmarks on the body, such as:

- Head, eyes, shoulders

- Elbows, wrists

- Hips, knees, ankles

2. Skeleton

The keypoints are connected according to the structure of the human body to form a skeleton.
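The relationship between keypoints and skeleton can be sketched in plain Python. The keypoint names below follow the 17-point COCO layout, which the pre-trained pose model used in this section is assumed to follow. The `SKELETON` edge list and the `skeleton_segments` helper are hypothetical, simplified for illustration, and not taken from any library.

```python
# The 17 COCO keypoint names, in index order (assumed layout of the pose model).
KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

# Illustrative skeleton: pairs of keypoints joined by a "bone".
# This is a simplified edge list chosen for the example, not an official one.
SKELETON = [
    ("left_shoulder", "right_shoulder"),
    ("left_shoulder", "left_elbow"), ("left_elbow", "left_wrist"),
    ("right_shoulder", "right_elbow"), ("right_elbow", "right_wrist"),
    ("left_shoulder", "left_hip"), ("right_shoulder", "right_hip"),
    ("left_hip", "right_hip"),
    ("left_hip", "left_knee"), ("left_knee", "left_ankle"),
    ("right_hip", "right_knee"), ("right_knee", "right_ankle"),
]


def skeleton_segments(points):
    """Turn 17 (x, y) points into one line segment per bone in SKELETON."""
    if len(points) != len(KEYPOINT_NAMES):
        raise ValueError(f"expected {len(KEYPOINT_NAMES)} points, got {len(points)}")
    index = {name: i for i, name in enumerate(KEYPOINT_NAMES)}
    return [
        (tuple(points[index[a]]), tuple(points[index[b]]))
        for a, b in SKELETON
    ]


# Synthetic points standing in for one detected person.
points = [(i * 10, i * 5) for i in range(17)]
segments = skeleton_segments(points)
print(len(segments), "segments; first:", segments[0])
```

Drawing each segment as a line on the image would give the familiar stick-figure overlay.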

We will use Python and a pre-trained Ultralytics YOLO11 pose model to perform pose estimation.


Pose Estimation
Perform pose estimation on an image and print the results.

 

from ultralytics import YOLO

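# Load the pre-trained YOLO11 nano pose model (weights are downloaded on first use)
model = YOLO("yolo11n-pose.pt")

# Run inference on a sample image; predict() returns one result per input image
results = model.predict("https://ultralytics.com/images/bus.jpg")
for result in results:
    # Keypoint coordinates in pixels, shape: (number of people, 17, 2)
    print(f"xy: {result.keypoints.xy}")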

Each result prints a tensor of shape (number of detected people, 17, 2) containing the pixel (x, y) coordinates of every keypoint.
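To make the keypoint data easier to read, here is a minimal sketch that pulls a few named points out of it. It assumes the model follows the 17-keypoint COCO ordering (nose at index 0, wrists at indices 9 and 10). `summarize_people` and the synthetic `fake_xy` are hypothetical names used only for illustration; with real output you could probably pass something like `result.keypoints.xy.tolist()` instead.

```python
# Indices into the 17-keypoint COCO layout (assumed to match the pose weights).
NOSE, LEFT_WRIST, RIGHT_WRIST = 0, 9, 10


def summarize_people(xy):
    """Pick a few named keypoints for each person.

    xy: nested list shaped (people, 17, 2) of (x, y) pixel coordinates.
    """
    summary = []
    for person in xy:
        if len(person) != 17:
            raise ValueError(f"expected 17 keypoints, got {len(person)}")
        summary.append({
            "nose": tuple(person[NOSE]),
            "left_wrist": tuple(person[LEFT_WRIST]),
            "right_wrist": tuple(person[RIGHT_WRIST]),
        })
    return summary


# Synthetic stand-in for one detected person (not real model output).
fake_xy = [[[float(i), float(i * 2)] for i in range(17)]]

for number, person in enumerate(summarize_people(fake_xy), start=1):
    print(f"person {number}: {person}")
```

Using named index constants keeps later code readable compared with indexing the raw array by number.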

Perform pose estimation using a webcam and display the rendered results.

import cv2
from ultralytics import YOLO

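# Load the pre-trained YOLO11 nano pose model
model = YOLO("yolo11n-pose.pt")

# 0 selects the default webcam; a video file path also works here
video_path = 0
cap = cv2.VideoCapture(video_path)

while cap.isOpened():
    # Grab the next frame; success is False when no frame is available
    success, frame = cap.read()
    if success:
        # Run pose estimation on the frame and draw the keypoints and skeleton
        results = model.predict(frame)
        annotated_frame = results[0].plot()
        cv2.imshow("YOLO Inference", annotated_frame)

        # Press "q" to stop
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    else:
        break

# Release the camera and close the display window
cap.release()
cv2.destroyAllWindows()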

 

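A window titled "YOLO Inference" opens and shows the webcam feed with the keypoints and skeleton drawn on each frame. Press q to close it.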

Copyright © 2026 YUAN High-Tech Development Co., Ltd.
All rights reserved.