7-8 Pose Estimation

Learning Objectives

Pose estimation is a computer vision task in which a model locates the joints of a person or object and infers its pose and motion structure, rather than only detecting where it is.

For example, in human pose estimation, the model identifies in an image or video:

1. Keypoints

Typical keypoints include:

- Head, eyes, shoulders
- Elbows, wrists
- Hips, knees, ankles

2. Skeleton

The keypoints are connected according to the structure of the human body to form a skeleton.
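To make the idea of a skeleton concrete, here is a minimal sketch of how keypoints could be joined into line segments. The limb list `SKELETON`, the helper `skeleton_segments`, and the sample coordinates are all assumptions made for illustration; they are not the definition used by any particular model.

```python
# Illustrative limb connections, written as pairs of keypoint names.
# This is an assumed subset for demonstration, not an official skeleton.
SKELETON = [
    ("left_shoulder", "right_shoulder"),
    ("left_shoulder", "left_elbow"),
    ("left_elbow", "left_wrist"),
    ("right_shoulder", "right_elbow"),
    ("right_elbow", "right_wrist"),
    ("left_shoulder", "left_hip"),
    ("right_shoulder", "right_hip"),
    ("left_hip", "right_hip"),
    ("left_hip", "left_knee"),
    ("left_knee", "left_ankle"),
    ("right_hip", "right_knee"),
    ("right_knee", "right_ankle"),
]


def skeleton_segments(keypoints):
    """Return the line segments that can be drawn from the given keypoints.

    `keypoints` maps a keypoint name to an (x, y) pixel position. A limb is
    skipped when either of its two endpoints is missing.
    """
    segments = []
    for start, end in SKELETON:
        if start in keypoints and end in keypoints:
            segments.append((keypoints[start], keypoints[end]))
    return segments


# Hypothetical detections for one arm; the coordinates are made up.
arm = {
    "left_shoulder": (120.0, 80.0),
    "left_elbow": (140.0, 130.0),
    "left_wrist": (150.0, 180.0),
}
segments = skeleton_segments(arm)
print(segments)
```

With only one arm supplied, the helper returns two segments (shoulder to elbow, elbow to wrist) and skips every limb that has a missing endpoint.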

We will use Python and a pre-trained model to perform pose estimation.

Pose Estimation
Perform pose estimation on an image and print the results.

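from ultralytics import YOLO

# Load the pre-trained YOLO11 nano pose model (weights are downloaded on first use)
model = YOLO("yolo11n-pose.pt")

# Run inference on a sample image; predict() returns one result per input image
results = model.predict("https://ultralytics.com/images/bus.jpg")

for result in results:
    # Keypoint pixel coordinates, shape: (number of objects, 17, 2)
    print(f"xy: {result.keypoints.xy}")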

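The script prints the keypoint coordinates for the image: one (x, y) pixel position for each of the 17 keypoints of every person detected.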
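The printed coordinates are easier to work with once each row has a name. The sketch below assumes the 17 keypoints follow the COCO ordering, which is worth confirming against the documentation of the model you use. The helper `name_keypoints` and the synthetic input `fake_xy` are hypothetical; in a real script the input could come from something like `result.keypoints.xy.tolist()`.

```python
# Keypoint order assumed to follow the 17-point COCO convention;
# confirm this against the documentation of the model you are using.
KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def name_keypoints(people_xy):
    """Convert nested [person][keypoint][x, y] lists into one dict per person."""
    named = []
    for person in people_xy:
        if len(person) != len(KEYPOINT_NAMES):
            raise ValueError(
                f"expected {len(KEYPOINT_NAMES)} keypoints, got {len(person)}"
            )
        named.append(
            {
                name: (float(x), float(y))
                for name, (x, y) in zip(KEYPOINT_NAMES, person)
            }
        )
    return named


# Synthetic stand-in for the model output: one person, 17 made-up points.
fake_xy = [[[float(i), float(i * 2)] for i in range(17)]]
people = name_keypoints(fake_xy)
print(people[0]["left_wrist"])  # (9.0, 18.0)
```

Looking up a joint by name avoids scattering index numbers such as 9 or 10 through later code.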

Perform pose estimation using a webcam and display the rendered results.

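import cv2
from ultralytics import YOLO

model = YOLO("yolo11n-pose.pt")
video_path = 0  # 0 selects the default webcam

cap = cv2.VideoCapture(video_path)
while cap.isOpened():
    success, frame = cap.read()
    if not success:
        # Stop when no frame could be read from the camera
        break

    # Run pose estimation on the current frame
    results = model.predict(frame)

    # Render the detected keypoints and skeleton as an annotated image
    annotated_frame = results[0].plot()

    cv2.imshow("YOLO Inference", annotated_frame)
    # Exit the loop when the "q" key is pressed
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

# Release the camera and close the display window
cap.release()
cv2.destroyAllWindows()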

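A window titled "YOLO Inference" opens and shows the live webcam feed with the keypoints and skeleton drawn on each frame. Press q to close it.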