We’re looking for a Senior DevOps Engineer who can build and maintain the infrastructure that powers our AI-driven automation tools. You’ll be responsible for ensuring our systems scale reliably, deploy smoothly, and perform consistently for enterprise customers who depend on us.
What You’ll Do
- Design and implement robust CI/CD pipelines that can handle rapid iteration cycles
- Manage and optimize cloud infrastructure across AWS, GCP, and Azure environments
- Build monitoring and alerting systems that catch issues before customers notice them
- Implement security best practices and ensure SOC 2 compliance
- Collaborate with AI engineers to optimize model deployment and inference infrastructure
- Create infrastructure as code using Terraform and similar tools
- Lead incident response and post-mortem processes to continuously improve reliability
- Mentor junior engineers and establish DevOps best practices across the organization
What We’re Looking For
- 5+ years of experience in DevOps, SRE, or infrastructure engineering roles
- Deep expertise with containers and container orchestration (Docker, Kubernetes)
- Strong experience with infrastructure as code (Terraform, CloudFormation, Pulumi)
- Proficiency in at least one scripting or programming language (Python, Bash, Go)
- Experience with monitoring tools (Prometheus, Grafana, Datadog, New Relic)
- Track record of building and scaling production systems
- Understanding of security best practices and compliance requirements
- Experience with ML/AI infrastructure is a plus but not required
What Makes This Role Unique
You’ll be working at the intersection of traditional DevOps and cutting-edge AI systems. Our infrastructure needs to support everything from rapid prototyping with forward-deployed engineers to rock-solid production deployments for Fortune 500 companies. You’ll have significant autonomy to shape our infrastructure strategy and build systems that directly impact our ability to deliver value to customers.
Our Stack
- Cloud: AWS (primary), GCP, Azure
- Orchestration: Kubernetes, ECS
- CI/CD: GitHub Actions, Argo CD
- IaC: Terraform, AWS CDK
- Monitoring: Datadog, Prometheus, Grafana
- Languages: Python, TypeScript, Go
Ready to Apply?
Reach out via our contact page. In your message, describe the most challenging infrastructure project you’ve led and how you hardened it.