PyTorch Mobile & ONNX Runtime: AI Models for Edge Devices
Introduction
The demand for running AI models on edge devices, including mobile phones, IoT devices, and embedded systems, has led to the development of optimized frameworks like PyTorch Mobile and ONNX Runtime. These frameworks enable efficient inference on resource-constrained devices, bringing machine learning capabilities closer to real-world applications.
This article explores PyTorch Mobile and ONNX Runtime, their architectures, benefits, and deployment strategies for mobile and edge AI.
PyTorch Mobile: Bringing PyTorch to Mobile Devices
PyTorch Mobile is an extension of PyTorch designed to run deep learning models on Android and iOS devices with optimized performance.
Key Features of PyTorch Mobile
- Optimized Model Execution: Reduces model size and improves latency.
- Supports Quantization: Uses INT8 representation to reduce memory footprint.
- Cross-Platform Support: Works on Android (Java/Kotlin) and iOS (Swift/Objective-C).
- Native Integration: Easily integrates with existing mobile applications.
- ONNX Export Compatibility: PyTorch models can be exported to ONNX for use in other runtimes.
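To make the INT8 point concrete, here is a minimal sketch of affine quantization in plain Python. It illustrates the arithmetic only and is not PyTorch's quantization API; the function names quantize_int8 and dequantize_int8 are made up for this example.

```python
from array import array


def quantize_int8(values):
    """Map floats to INT8 with a scale and zero point (affine quantization)."""
    lo = min(min(values), 0.0)  # the representable range must include zero
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 if all values are zero
    zero_point = round(-128 - lo / scale)
    quantized = [
        max(-128, min(127, round(v / scale) + zero_point)) for v in values
    ]
    return quantized, scale, zero_point


def dequantize_int8(quantized, scale, zero_point):
    """Recover approximate floats from INT8 values."""
    return [(x - zero_point) * scale for x in quantized]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
q, scale, zp = quantize_int8(weights)
restored = dequantize_int8(q, scale, zp)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

# Storage cost per representation: 4 bytes per float32, 1 byte per int8
fp32_bytes = array("f", weights).itemsize * len(weights)
int8_bytes = array("b", q).itemsize * len(q)
```

Each weight shrinks from 4 bytes to 1, and the price is a rounding error that stays within one quantization step (the scale).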
PyTorch Mobile Workflow
- Train and Convert Model: Train the model in PyTorch and optimize it for mobile.
- Serialize the Model: Convert the model to TorchScript format.
- Deploy on Mobile: Integrate with Android/iOS applications for real-time inference.
Setting Up PyTorch Mobile
1. Convert and Optimize a PyTorch Model
Convert a trained PyTorch model to TorchScript format:
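import torch
import torchvision.models as models
from torch.utils.mobile_optimizer import optimize_for_mobile
# Load a pre-trained model (on torchvision < 0.13, use pretrained=True instead)
model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT)
model.eval()  # inference mode: fixes dropout and batch-norm behavior
# Convert to TorchScript and apply mobile-specific graph optimizations
scripted_model = torch.jit.script(model)
optimized_model = optimize_for_mobile(scripted_model)
# Full TorchScript file, loadable with Module.load on Android
torch.jit.save(optimized_model, "mobilenet_v2.pt")
# Lite-interpreter file, needed by the LibTorch-Lite runtime on iOS
optimized_model._save_for_lite_interpreter("mobilenet_v2.ptl")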
2. Deploy on Android
Add the PyTorch Mobile dependencies to build.gradle:
dependencies {
implementation 'org.pytorch:pytorch_android:1.10.0'
implementation 'org.pytorch:pytorch_android_torchvision:1.10.0'
}
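Load the model and run it in Java. Here assetFilePath is a helper from the PyTorch Android demo apps that copies a bundled asset to a readable file path (it is not part of the library), and inputTensor stands for a preprocessed 1x3x224x224 float tensor:
Module module = Module.load(assetFilePath(this, "mobilenet_v2.pt"));
Tensor output = module.forward(IValue.from(inputTensor)).toTensor();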
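Before the model can be run, the image has to be converted into the layout and value range the model was trained with. On Android this is typically handled by TensorImageUtils.bitmapToFloat32Tensor together with the TORCHVISION_NORM_MEAN_RGB and TORCHVISION_NORM_STD_RGB constants. The sketch below shows the same arithmetic in plain Python, to match the export script; the function name to_chw_normalized and the tiny 2x2 image are illustrative assumptions.

```python
# ImageNet channel statistics used by torchvision's pre-trained models
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def to_chw_normalized(pixels):
    """Convert an H x W grid of (r, g, b) ints in 0..255 to 3 x H x W floats.

    Each channel is scaled to [0, 1], then normalized with its mean and std.
    """
    channels = []
    for c in range(3):
        plane = [
            [(px[c] / 255.0 - MEAN[c]) / STD[c] for px in row]
            for row in pixels
        ]
        channels.append(plane)
    return channels


# A hypothetical 2x2 RGB image standing in for a real 224x224 input
image = [
    [(255, 0, 128), (0, 0, 0)],
    [(124, 116, 104), (255, 255, 255)],
]
tensor = to_chw_normalized(image)
```

The result is channel-first (all red values, then green, then blue), which matches the 3 x 224 x 224 per-image layout of the input used during export.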
3. Deploy on iOS
Add PyTorch Mobile to the Podfile:
pod 'LibTorch-Lite'
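Load the model from Swift. LibTorch-Lite exposes a C++ API, so Swift code reaches it through an Objective-C++ wrapper. The snippet assumes a wrapper class named TorchModule, as in the official iOS demo apps, and a model saved with _save_for_lite_interpreter (conventionally with a .ptl extension):
let path = Bundle.main.path(forResource: "mobilenet_v2", ofType: "ptl")
let module = path.flatMap { TorchModule(fileAtPath: $0) }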
ONNX Runtime: Cross-Platform AI Inference Optimization
ONNX Runtime (ONNX RT) is an optimized AI inference engine that enables running models on CPUs, GPUs, and mobile accelerators with high performance.
Key Features of ONNX Runtime
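- Cross-Hardware Compatibility: Runs on x86 and ARM CPUs and on GPUs and accelerators from vendors such as NVIDIA, AMD, and Intel.
- Optimized Execution: Applies graph optimizations (such as constant folding and node fusion) and provides quantization tooling; pruning, if used, is applied in the training framework before export.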
- Multi-Platform Deployment: Supports Windows, Linux, macOS, iOS, and Android.
- Interoperability: Runs models exported from PyTorch, TensorFlow, Keras, and scikit-learn through their ONNX converters (for example torch.onnx, tf2onnx, and skl2onnx).
- CUDA & DirectML Support: Enables GPU acceleration for high-speed inference.
ONNX Runtime Workflow
- Convert Model to ONNX: Export from PyTorch, TensorFlow, or other ML frameworks.
- Optimize Model: Apply ONNX Runtime optimizations.
- Deploy on Edge/Mobile Devices: Run inference efficiently using ONNX RT.
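One of the graph optimizations applied in the "Optimize Model" step is constant folding: any node whose inputs are all known ahead of time is computed once and replaced by its result. The toy sketch below shows the idea on a two-node graph. It is not ONNX Runtime's implementation, and the graph representation (tuples of op, inputs, output) and the fold_constants function are assumptions made for illustration.

```python
import operator

# Toy operator table; a real runtime supports far more ops
OPS = {"Add": operator.add, "Mul": operator.mul}


def fold_constants(nodes, constants):
    """Precompute binary nodes whose inputs are all constants.

    nodes: list of (op, (input_a, input_b), output) in topological order.
    constants: dict mapping tensor names to known values.
    Returns the nodes that still need to run and the enlarged constant table.
    """
    constants = dict(constants)
    remaining = []
    for op, inputs, output in nodes:
        if all(name in constants for name in inputs):
            a, b = (constants[name] for name in inputs)
            constants[output] = OPS[op](a, b)  # evaluated once, ahead of time
        else:
            remaining.append((op, inputs, output))
    return remaining, constants


# y = x + (two * three); only x is unknown until inference time
nodes = [
    ("Mul", ("two", "three"), "six"),
    ("Add", ("x", "six"), "y"),
]
remaining, consts = fold_constants(nodes, {"two": 2.0, "three": 3.0})
```

After folding, only the Add node is left to execute at inference time, which is the kind of saving that matters on constrained hardware.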
Setting Up ONNX Runtime
1. Convert a PyTorch Model to ONNX
Export a trained PyTorch model:
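import torch
import torchvision.models as models
# Load a pre-trained model (on torchvision < 0.13, use pretrained=True instead)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()  # export in inference mode
# Example input that fixes the expected shape: batch of 1, 3 channels, 224x224
dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy_input, "resnet18.onnx", input_names=["input"], output_names=["output"])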
2. Run Inference with ONNX Runtime
Install ONNX Runtime:
pip install onnxruntime
Run the model using ONNX Runtime:
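import onnxruntime as ort
import numpy as np
# Load the exported model on the CPU execution provider
session = ort.InferenceSession("resnet18.onnx", providers=["CPUExecutionProvider"])
# Random data stands in for a preprocessed image of shape (1, 3, 224, 224)
input_data = np.random.rand(1, 3, 224, 224).astype(np.float32)
# Passing None as the first argument returns every model output as a list
output = session.run(None, {session.get_inputs()[0].name: input_data})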
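For this classifier, the first element of the returned list holds one row of 1000 unnormalized scores (logits), one per ImageNet class. A common next step is to turn them into probabilities and pick the top predictions. The sketch below does this in plain Python on a short list of made-up logits; the helper names softmax and top_k are illustrative, and in practice the same steps would usually be written with NumPy.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, k=3):
    """Return the k most likely (class_index, probability) pairs."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]


# Four hypothetical logits standing in for the model's 1000 class scores
logits = [1.0, 3.0, 0.5, 2.0]
probs = softmax(logits)
best = top_k(probs, k=2)
```

Here best lists class 1 first and class 3 second, since they carry the two largest logits.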
3. Deploy on Mobile (Android & iOS)
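- Android: Use the ONNX Runtime Android package (onnxruntime-android), optionally with the NNAPI execution provider.
- iOS: Use the ONNX Runtime iOS package (for example the onnxruntime-objc pod), optionally with the Core ML execution provider.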
Use Cases of PyTorch Mobile & ONNX Runtime
1. Computer Vision Applications
- Real-Time Image Analysis: Object detection, face recognition, image classification, AR filters.
- Edge AI in Security Cameras: Intruder detection and anomaly recognition.
2. Healthcare & Wearable Devices
- Remote Patient Monitoring: AI-driven health analytics on smartwatches.
- Medical Image Processing: On-device diagnosis for X-rays and MRIs.
3. Industrial IoT & Smart Manufacturing
- Defect Detection in Factories: On-device AI for quality control.
- Predictive Maintenance: AI models deployed on factory sensors for failure prediction.
4. Autonomous Systems & Robotics
- Drones & Self-Driving Cars: Real-time AI inference for navigation.
- Smart Assistants & AI Bots: AI models for natural language processing (NLP) on embedded systems.
PyTorch Mobile vs. ONNX Runtime
Feature | PyTorch Mobile | ONNX Runtime |
---|---|---|
Model Format | TorchScript | ONNX |
Supported Platforms | Android, iOS | Windows, Linux, macOS, Android, iOS |
Hardware Acceleration | CPU, GPU | CPU, GPU, specialized accelerators |
Ease of Use | Best for PyTorch-based models | Works with multiple ML frameworks |
Deployment Type | Mobile-first | Edge, cloud, and mobile |
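A feature table only goes so far; for a specific model, the choice usually comes down to latency measured on the target device. The sketch below is one hedged way to time any inference call, using only the standard library. The benchmark function, its parameters, and the placeholder workload are assumptions for illustration, not part of either runtime.

```python
import statistics
import time


def benchmark(run_once, warmup=5, runs=50):
    """Time a zero-argument callable and report latency in milliseconds."""
    for _ in range(warmup):
        run_once()  # warm-up runs are excluded from the measurements
    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        run_once()
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    timings_ms.sort()
    p95_index = min(len(timings_ms) - 1, int(0.95 * len(timings_ms)))
    return {
        "median_ms": statistics.median(timings_ms),
        "p95_ms": timings_ms[p95_index],
        "runs": len(timings_ms),
    }


# Placeholder workload; swap in a real inference call when measuring a model
stats = benchmark(lambda: sum(i * i for i in range(10_000)), warmup=2, runs=20)
```

With ONNX Runtime, run_once could be a lambda wrapping session.run; with PyTorch, a lambda wrapping the scripted module's forward call. Reporting the median and a high percentile rather than the mean keeps occasional slow runs from distorting the comparison.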
Conclusion
PyTorch Mobile and ONNX Runtime are leading solutions for deploying AI models on edge and mobile devices. PyTorch Mobile is tailored for native mobile applications, while ONNX Runtime offers cross-platform flexibility and high-performance inference on various hardware.
By leveraging these frameworks, developers can bring AI to mobile applications, edge IoT devices, and real-time embedded systems, enabling faster, more efficient, and cost-effective machine learning solutions.