Cloud GPU Services for AI/ML Workloads

When a local GPU is not available, cloud GPU services offer on-demand access to powerful hardware for training and inference workloads.

Services

Modal — Serverless GPU compute with per-second billing. Good for inference endpoints and batch jobs. modal.com/pricing
RunPod — On-demand and spot GPU instances with persistent storage. Supports custom Docker images. runpod.io

Choosing a Service

Use Modal for event-driven inference (billed only while a call is running, so no idle costs).
Use RunPod for long-running training jobs or when you need a persistent GPU environment.
For AWS-native workloads, consider EC2 p3/g4dn instances or Amazon SageMaker.
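
The choice between per-second serverless billing and an always-on instance mostly comes down to how busy the GPU is. Below is a rough sketch of that break-even arithmetic. The function names and every price used with them are hypothetical placeholders, not quotes from Modal, RunPod, or AWS.

```python
def serverless_cost(calls: int, seconds_per_call: float, rate_per_second: float) -> float:
    """Cost when billed only for the seconds each call actually runs."""
    return calls * seconds_per_call * rate_per_second


def always_on_cost(hours: float, rate_per_hour: float) -> float:
    """Cost of keeping an instance up for the whole period, busy or idle."""
    return hours * rate_per_hour


def break_even_utilization(rate_per_second: float, rate_per_hour: float) -> float:
    """Fraction of wall-clock time the GPU must be busy before always-on is cheaper."""
    return rate_per_hour / (rate_per_second * 3600)


if __name__ == "__main__":
    # Hypothetical prices, for illustration only.
    per_second = 0.001   # serverless rate in $/s
    per_hour = 1.80      # always-on rate in $/h
    print(serverless_cost(calls=1000, seconds_per_call=2.0, rate_per_second=per_second))
    print(always_on_cost(hours=720, rate_per_hour=per_hour))
    print(break_even_utilization(per_second, per_hour))
```

If expected utilization is well below the break-even fraction, per-second billing is likely cheaper. Well above it, a persistent instance probably wins. Cold-start time and storage fees are left out of this sketch.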

Notes

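For large training runs, compare spot (interruptible) pricing with on-demand pricing. The savings can be significant, but the provider can reclaim a spot instance at short notice, so save checkpoints regularly.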
Ensure the CUDA version in your Docker image or environment is supported by the NVIDIA driver on the chosen GPU instance.
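
The CUDA check can be sketched as a simple version comparison. This assumes the common rule of thumb that a driver runs images built for any CUDA version up to the one it reports (for example in the `nvidia-smi` header). It ignores NVIDIA's forward-compatibility packages, and the helper names are hypothetical.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int]:
    """Turn a string such as '12.4' or '12.4.1' into (major, minor)."""
    parts = version.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def image_fits_driver(image_cuda: str, driver_max_cuda: str) -> bool:
    """True if the image's CUDA version is not newer than the driver's reported maximum."""
    return parse_version(image_cuda) <= parse_version(driver_max_cuda)


if __name__ == "__main__":
    # Example values only; read the real ones from your image tag and the host.
    print(image_fits_driver("11.8", "12.4"))
    print(image_fits_driver("12.6", "12.4"))
```

Running a check like this before launching a long job is cheaper than finding the mismatch after the instance has already started billing.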