Canada•United States
Remote
Senior
Full Time
4 days ago
remote-firststaff-levelinfrastructureplatform-engineeringGoTerraformGitOpsEKSAI-assisted operationson-call
Requirements
- •8+ years of professional, hands-on, full-time software engineering experience in backend, infrastructure, or platform engineering
- •Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience
- •Strong software engineering skills in Go or a similar language including design, testing, debugging, review, and long-term maintainability
- •Track record designing, shipping, and operating cloud services or infrastructure platforms in production
- •Deep expertise in at least one of Kubernetes, networking, cloud platforms, reliability engineering, or developer platforms, plus solid Linux, networking, and production-ops fundamentals
- •Experience setting technical direction and leading work that needs cross-team alignment
- •Clear written and verbal communication in a remote environment including RFCs, design docs, and incident writeups
What You'll Do
- •Take ambiguous infrastructure problems and turn them into proposals the org can rally around, then drive them through RFCs and architecture reviews across teams
- •Design self-service capabilities and platform APIs primarily in Go for onboarding, provisioning, deployment, observability defaults, and day-2 operations, with contracts and docs teams actually use
- •Set delivery standards using Terraform, GitOps with Argo CD, progressive rollout, and good testing, including building the continuous-deployment flow
- •Evolve the multi-tenant EKS foundations toward better reliability, security, scale, and cost including Envoy Gateway ingress, traffic routing, and multi-region, cross-account connectivity
- •Improve SLOs, alerting, and incident follow-up on Grafana Cloud to make production safer and less dependent on heroics
- •Shape AI-assisted operations workflows such as alert enrichment, incident context-gathering, runbook-assisted diagnosis and remediation recommendations, and onboarding assistants
- •Participate in on-call rotation and improve on-call health with better alerts, stronger runbooks, less toil, and blameless postmortems
Nice to Have
- •EKS and ingress/CNI/service-mesh experience
- •Observability with OpenTelemetry, Prometheus, Grafana
- •CI/CD and progressive delivery experience including GitHub Actions, Argo CD, canaries
- •Experience leading migrations or adoption programs across teams
Benefits
- •Freedom and flexibility to fit work around your life
- •Designated quarterly Whaleness Days plus end of year Whaleness break
- •Home office setup
- •16 weeks of paid parental leave after 6 months of employment
- •Technology stipend equivalent to $100 USD net/month
- •PTO plan that encourages taking time to do things you enjoy
- •Training stipend for conferences, courses and classes
- •Equity in the company
- •Docker swag
- •Medical benefits, retirement and holidays vary by country
- •Remote-first culture with offices in Seattle and Paris
