Installing MediaPipe on a Raspberry Pi and Implementing Gesture Recognition
Introduction
MediaPipe is an open-source framework developed by Google that provides ready-to-use machine-learning solutions for live and streaming media, including hand tracking, object detection, pose estimation, and face detection. It is optimized for real-time processing, even on modest hardware such as the Raspberry Pi.
In this tutorial, we will cover:
- Installing MediaPipe on a Raspberry Pi.
- Running a basic gesture recognition example using MediaPipe’s Hand Tracking solution.
Prerequisites
Before installing MediaPipe, ensure that you have:
- A Raspberry Pi 4 (or newer) with Raspberry Pi OS (64-bit).
- A USB camera or Raspberry Pi Camera Module.
- Python 3.7+ installed.
- An active internet connection.
Step 1: Update and Upgrade Raspberry Pi
Before installing any new software, update your Raspberry Pi’s package list and upgrade existing packages:
```shell
sudo apt update && sudo apt upgrade -y
```
Step 2: Install Dependencies
MediaPipe depends on OpenCV and NumPy, among other packages. Install the system libraries and Python packages with:

```shell
sudo apt install python3-pip libatlas-base-dev
pip3 install numpy opencv-python mediapipe
```

Note that on recent Raspberry Pi OS releases (Bookworm and later), pip refuses to install into the system Python; create and activate a virtual environment with `python3 -m venv` first. If OpenCV does not import correctly, install the contrib build instead:

```shell
pip3 install opencv-contrib-python
```
Step 3: Verify MediaPipe Installation
Run the following in a Python shell to confirm that MediaPipe is installed correctly:

```python
import mediapipe as mp
print(mp.__version__)
```

If a version number prints with no errors, MediaPipe is installed successfully.
Step 4: Implement Gesture Recognition
We will now use MediaPipe’s Hand Tracking module to recognize gestures.
1. Import Required Libraries
Create a new Python script, e.g. `gesture_recognition.py`, and import the necessary modules:

```python
import cv2
import mediapipe as mp
import time  # optional: useful for measuring frame rate

mp_hands = mp.solutions.hands
mp_draw = mp.solutions.drawing_utils
```
2. Initialize the Camera and Process Frames

```python
cap = cv2.VideoCapture(0)

with mp_hands.Hands(min_detection_confidence=0.7,
                    min_tracking_confidence=0.7) as hands:
    while cap.isOpened():
        success, frame = cap.read()
        if not success:
            break

        # Convert the frame to RGB (MediaPipe expects RGB input)
        frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

        # Process the frame
        results = hands.process(frame_rgb)

        # Draw hand landmarks on the original BGR frame
        if results.multi_hand_landmarks:
            for hand_landmarks in results.multi_hand_landmarks:
                mp_draw.draw_landmarks(frame, hand_landmarks,
                                       mp_hands.HAND_CONNECTIONS)

        # Show the output; press 'q' to quit
        cv2.imshow("Gesture Recognition", frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

cap.release()
cv2.destroyAllWindows()
```
3. Explanation of the Code
- The script captures frames from the camera.
- Each frame is converted from BGR to RGB, the color order MediaPipe expects.
- The RGB frame is passed through the MediaPipe Hands module.
- Detected hand landmarks are drawn onto the video feed.
- Pressing 'q' exits the loop and releases the camera.
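The `time` module imported earlier can be used to measure how fast the pipeline runs on your Pi, which helps when tuning the confidence thresholds. A minimal sketch (the `FPSMeter` helper below is our own illustration, not part of MediaPipe):

```python
import time

class FPSMeter:
    """Smoothed frames-per-second counter based on wall-clock timestamps."""

    def __init__(self, smoothing=0.9):
        self.smoothing = smoothing  # weight given to the previous estimate
        self.prev = None            # timestamp of the last frame
        self.fps = 0.0

    def tick(self):
        """Call once per frame; returns the current FPS estimate."""
        now = time.time()
        if self.prev is not None:
            instant = 1.0 / max(now - self.prev, 1e-6)
            # Exponential moving average keeps the readout stable
            self.fps = instant if self.fps == 0.0 else (
                self.smoothing * self.fps + (1.0 - self.smoothing) * instant)
        self.prev = now
        return self.fps
```

In the capture loop, create `meter = FPSMeter()` once before the loop, call `fps = meter.tick()` after each processed frame, and overlay the value with `cv2.putText` if you want it on screen.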
Step 5: Recognizing Specific Gestures
To recognize specific gestures (e.g., thumbs-up or open palm), compare the positions of individual landmarks. Landmark coordinates are normalized to the range [0, 1], and y increases downward in image coordinates, so a smaller y means higher in the frame:

```python
if results.multi_hand_landmarks:
    for hand_landmarks in results.multi_hand_landmarks:
        index_finger_tip = hand_landmarks.landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP]
        thumb_tip = hand_landmarks.landmark[mp_hands.HandLandmark.THUMB_TIP]
        wrist = hand_landmarks.landmark[mp_hands.HandLandmark.WRIST]

        # Example: crude thumbs-up check -- the thumb tip is above the wrist
        if thumb_tip.y < wrist.y:
            print("Thumbs Up Detected!")
```
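Single-frame comparisons like the one above can be packaged into a small, testable helper. The sketch below is our own illustration (the `Landmark` stand-in class and the 0.1 threshold are assumptions, not MediaPipe API); it works on any objects exposing normalized `x`/`y` attributes, so it can be tested without a camera:

```python
from dataclasses import dataclass

@dataclass
class Landmark:
    """Stand-in for a MediaPipe landmark: normalized coords, y grows downward."""
    x: float
    y: float

# Landmark indices as defined by MediaPipe Hands (21 landmarks per hand)
WRIST, THUMB_TIP = 0, 4
INDEX_PIP, INDEX_TIP = 6, 8

def classify_gesture(landmarks):
    """Return 'thumbs_up' or 'unknown' for a list of 21 hand landmarks."""
    wrist = landmarks[WRIST]
    thumb = landmarks[THUMB_TIP]
    index_tip = landmarks[INDEX_TIP]
    index_pip = landmarks[INDEX_PIP]

    # Thumbs-up: thumb tip is well above the wrist AND the index finger is
    # curled (tip below its middle joint, i.e. a larger y value).
    if thumb.y < wrist.y - 0.1 and index_tip.y > index_pip.y:
        return "thumbs_up"
    return "unknown"
```

In the camera loop you would call `classify_gesture(hand_landmarks.landmark)`, since MediaPipe's landmark objects expose the same `x`/`y` attributes.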
Step 6: Running the Script
Run the Python script using:

```shell
python3 gesture_recognition.py
```
Conclusion
You have successfully installed MediaPipe on a Raspberry Pi and implemented a basic gesture recognition system using the Hand Tracking module. You can extend this by:
- Recognizing more complex gestures.
- Using the Face Mesh or Pose Estimation modules.
- Controlling devices (like LEDs or motors) based on gestures.
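If you do drive hardware from gestures, act only when a gesture has been held for several consecutive frames; otherwise a single noisy detection can toggle an LED or motor. A minimal debouncing sketch (our own helper, not part of MediaPipe; the default of 5 frames is an arbitrary choice):

```python
class GestureDebouncer:
    """Reports a gesture only after it is seen for `hold_frames` frames in a row."""

    def __init__(self, hold_frames=5):
        self.hold_frames = hold_frames
        self.current = None  # gesture seen in the most recent frame
        self.count = 0       # consecutive frames it has been seen

    def update(self, gesture):
        """Feed one per-frame gesture label; returns the gesture once it
        becomes stable, and None otherwise."""
        if gesture == self.current:
            self.count += 1
        else:
            self.current = gesture
            self.count = 1
        if self.count == self.hold_frames:
            return gesture  # fires exactly once per stable run
        return None
```

For example, `if debouncer.update(gesture) == "thumbs_up": toggle_led()` would only fire after the gesture has been detected in five consecutive frames.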
Happy coding with IoT and AI!