Introduction
CosmicAC provides managed compute for Machine Learning (ML) workloads.
Infrastructure setup delays execution and diverts attention from model development. CosmicAC abstracts this setup away, running and scaling jobs on demand without manual server configuration.
Job Types
CosmicAC supports several job types for different ML workflows.
GPU Container
Access high-performance GPU containers for training, experimentation, and development.
GPU containers let you:
- Run on-demand GPU compute without managing infrastructure
- Access GPU hardware directly through secure device plugins
- Work in Virtual Machine (VM)-level isolated environments for secure, dedicated compute
- Maintain full control over your environment: install packages, run scripts, and configure as needed

Quick reference for common GPU container CLI commands:

```shell
npx cosmicac jobs init
npx cosmicac jobs create
npx cosmicac jobs list
npx cosmicac jobs shell <jobId> <containerId>
```
Managed Inference
Run inference on open-source models like Qwen through a managed API.
Managed Inference lets you:
- Access open-source models without deploying or managing serving infrastructure
- Interact with your model directly from the dashboard or from the CLI

Quick reference for common Managed Inference CLI commands:

```shell
npx cosmicac inference init
npx cosmicac inference list-models
npx cosmicac inference chat --message "Explain quantum computing."
```
Continued Pretraining
Coming soon: extend base models on your own data for domain-specific tasks.
Continued Pretraining lets you:
- Train on your own datasets
- Save checkpoints at intervals during training
Why CosmicAC?
Minimal setup: Submit jobs via the CLI or web interface. CosmicAC provisions GPU resources and schedules your workload automatically, with no manual server requests or environment configuration.
Secure, isolated environments: Each workload runs inside a KubeVirt virtual machine, providing VM-level isolation while maintaining direct GPU access.
Fast provisioning: Start workloads in minutes, not days. CosmicAC replaces manual SLURM-based workflows with automated provisioning and scheduling.
Built-in inference serving: Deploy models instantly via the Managed Inference API. CosmicAC handles API key authentication, load balancing, and service discovery.
Real-time notifications: Receive email and push notifications when costs exceed thresholds or errors occur.
Who is CosmicAC for?
| Role | Use Case |
|---|---|
| ML engineers | Train models, run experiments |
| Data scientists | Deploy inference pipelines |
| Software engineers | Integrate inference API into applications |
| DevOps teams | Manage GPU infrastructure at scale |
Core Architecture
CosmicAC uses Kubernetes for orchestration and KubeVirt for secure workload isolation. Kubernetes schedules containers, allocates GPU resources, and manages the job lifecycle. KubeVirt runs each workload in an isolated virtual machine without requiring privileged containers, applying standard Kubernetes security controls (RBAC, Security-Enhanced Linux, and network policies) while exposing GPU devices through secure device plugins.
Kubernetes Implementation
CosmicAC uses Kubernetes as its core orchestration layer, replacing manual SLURM-based workflows with automated provisioning and scheduling.
| Before (SLURM) | After (Kubernetes) |
|---|---|
| Request servers manually | Submit jobs via CosmicAC |
| Configure SLURM | Provision infrastructure automatically |
| Set up the environment | Schedule containers automatically |
| Wait days for setup | Start workloads in minutes |
See System Components for detailed documentation of the architecture.
Next steps
Get Started
- Install and configure the CosmicAC CLI
- Create your first GPU container job
- Create an API key and connect to a managed inference service
Learn More
- Learn about GPU container jobs and direct GPU access
- Extend base models on your own data for domain-specific tasks
- Create a GPU container
Manage Inference
- Configure your API key and send inference requests
- Run open-source models without managing serving infrastructure