Get started
If you’re new to Runpod, start here to learn the essentials and deploy your first GPU Pod.
Quickstart
Create an account, deploy your first GPU Pod, and use it to execute code.
Concepts
Learn about the key concepts and terminology for the Runpod platform.
Create an API key
Create API keys to manage your access to Runpod resources.
Choose a workflow
Explore various methods for accessing and managing Runpod resources.
Serverless
Serverless provides pay-per-second computing with automatic scaling for production AI/ML apps. You only pay for actual compute time when your code runs, with no idle costs, making Serverless ideal for variable workloads and cost-efficient production deployments. A minimal worker sketch follows the links below.
Introduction
Learn how Serverless works and how to deploy pre-configured endpoints.
Pricing
Learn how Serverless billing works and how to optimize your costs.
vLLM quickstart
Deploy a large language model for text or image generation in minutes using vLLM.
Build your first worker
Build a custom worker and deploy it as a Serverless endpoint.
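To make the worker model concrete, here is a minimal sketch of a Serverless handler built with the runpod Python SDK (assumes `pip install runpod`; the handler logic and input fields are illustrative, not a prescribed schema):

```python
# Minimal Serverless worker sketch. The handler runs once per job,
# and you are billed only for the seconds it spends executing.
import runpod


def handler(job):
    # Job input arrives as a dict under the "input" key.
    name = job["input"].get("name", "world")
    # The return value becomes the job's output.
    return {"greeting": f"Hello, {name}!"}


# Start the worker loop; Runpod scales worker instances automatically.
runpod.serverless.start({"handler": handler})
```

Once deployed as an endpoint, jobs are submitted over HTTP and the handler's return value comes back as the job output; the Build your first worker guide walks through the full workflow.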
Pods
Pods give you dedicated GPU or CPU instances for containerized AI/ML workloads. Pods are billed by the minute and stay available as long as you keep them running, making them perfect for development, training, and workloads that need continuous access. A deployment sketch follows the links below.
Introduction
Understand the components of a Pod and options for configuration.
Pricing
Learn about Pod pricing options and how to optimize your costs.
Choose a Pod
Learn how to choose the right Pod for your workload.
Generate images with ComfyUI
Learn how to deploy a Pod with ComfyUI pre-installed and start generating images.
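As a sketch of programmatic deployment, the runpod Python SDK can create and stop Pods. The image name and GPU type below are placeholders, and the response field is an assumption; pick real values from the console or the Choose a Pod guide:

```python
# Sketch: deploy a Pod with the runpod Python SDK, then stop it.
import os
import runpod

# Authenticate with an API key (see "Create an API key" above).
runpod.api_key = os.environ["RUNPOD_API_KEY"]

# Placeholder image and GPU type; substitute real values.
pod = runpod.create_pod(
    name="dev-pod",
    image_name="runpod/pytorch:2.1.0-py3.10-cuda11.8.0-devel-ubuntu22.04",
    gpu_type_id="NVIDIA GeForce RTX 4090",
)
print(pod["id"])  # assumed response field; Pods bill by the minute

# Stop the Pod when you're done to stop billing.
runpod.stop_pod(pod["id"])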
Public Endpoints
Public Endpoints provide instant API access to pre-deployed AI models for image, video, and text generation with zero setup. You only pay for what you generate, making it easy to integrate AI into your applications without managing infrastructure. A request sketch follows the links below.
Public Endpoints
Test and deploy production-ready AI models using Public Endpoints.
Model reference
Review the list of available models on Public Endpoints.
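To illustrate the pay-per-result model, here is a hedged sketch of a synchronous request to a Public Endpoint using Python's requests library; the endpoint slug and input schema are placeholders, so check the model reference for real IDs and fields:

```python
# Sketch: synchronous request to a Public Endpoint.
import os
import requests

ENDPOINT_ID = "flux-1-schnell"  # placeholder slug; see the model reference
url = f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync"

response = requests.post(
    url,
    headers={"Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}"},
    json={"input": {"prompt": "a watercolor fox"}},  # schema varies by model
    timeout=300,
)
response.raise_for_status()
print(response.json())  # job status plus the generated output
```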
Instant Clusters
Instant Clusters deliver fully managed multi-node compute clusters for large-scale distributed workloads. With high-speed networking between nodes, you can run multi-node training, fine-tune large language models, and handle other tasks that require multiple GPUs working in parallel. A distributed PyTorch sketch follows the guides below.
Instant Clusters
Learn how Instant Clusters work and see deployment options.
Slurm Clusters
Deploy Slurm Clusters with zero configuration.
PyTorch Clusters
Run distributed PyTorch workloads on Instant Clusters.
Axolotl Clusters
Fine-tune LLMs using Axolotl on Instant Clusters.
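To show the shape of a multi-node job, here is a minimal distributed PyTorch sketch. It assumes each node launches the script with torchrun (or an equivalent launcher), which sets RANK, WORLD_SIZE, MASTER_ADDR, and MASTER_PORT in the environment:

```python
# Sketch: cluster-wide all-reduce with PyTorch's distributed package.
import torch
import torch.distributed as dist


def main():
    # NCCL is the standard backend for GPU-to-GPU communication.
    dist.init_process_group(backend="nccl")
    rank = dist.get_rank()
    torch.cuda.set_device(rank % torch.cuda.device_count())

    # Each rank contributes its rank; all-reduce sums across every GPU.
    t = torch.ones(1, device="cuda") * rank
    dist.all_reduce(t, op=dist.ReduceOp.SUM)
    print(f"rank {rank}: sum of all ranks = {t.item()}")

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

On a two-node cluster this might be launched per node with something like `torchrun --nnodes=2 --nproc_per_node=8 --rdzv_backend=c10d --rdzv_endpoint=<primary-node-ip>:29500 allreduce.py`; the PyTorch Clusters guide covers the Runpod-specific setup.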