    SRE - Infra

    PostHog
    Remote
    Mid Level
    Full Time
    about 1 month ago
    SRE, Site Reliability Engineering, Kubernetes, AWS, Terraform, Remote, Infrastructure, Automation

    Requirements

    • Deep hands-on experience with Kubernetes in production (EKS preferred)
    • Experience debugging node pressure, networking issues, and deployment failures at scale
    • Strong experience operating production infrastructure on AWS, including organizational boundaries, IAM, and networking
    • Experience automating infrastructure at scale using Terraform or Terragrunt, including module design and state management
    • Solid understanding of Linux systems, including disk, memory, networking, and failure modes
    • Experience supporting stateful systems such as databases, queues, and storage systems
    • Ability to debug and reason about performance and reliability issues in production
    • Comfortable owning systems end-to-end, including on-call responsibilities

    What You'll Do

    • Own and operate production systems end-to-end
    • Turn a fast-growing, stateful system into a predictable, well-automated platform
    • Handle provisioning, scaling, rebalancing, and recovery of infrastructure
    • Reduce operational stress by designing safe automation for traffic-heavy workloads
    • Build tooling and patterns to scale systems without scaling human effort
    • Operate EKS clusters with Karpenter autoscaling, Cilium networking, and ArgoCD-driven GitOps deployments
    • Manage and evolve a multi-account AWS organization, including provisioning, networking, access control, and cross-account connectivity
    • Maintain the Terraform/Terragrunt IaC platform, including modules and automated pipelines
    • Improve operational tooling around deploys, schema changes, backups, restores, and incident response
    • Reduce operational load by identifying and eliminating repeat pain points through code and automation
    • Optimize cloud spend
    • Participate in on-call and incident response, with a focus on reducing incidents

    Nice to Have

    • Experience with GitOps workflows (ArgoCD) and CI/CD pipelines (GitHub Actions)
    • Experience building AI agent-enabled base-level infrastructure services
    • Familiarity with multi-region infrastructure and consistency/availability tradeoffs

    Benefits

    • Remote work
    • Autonomy in choosing work and projects
    • Transparency in company operations and strategy
    • Product-led company with strong product-market fit
    • Well-funded, with over $100M raised
    • Meeting-free days to prioritize building time
    • Ambitious and optimistic company culture
    • Supportive and professional team environment

    About PostHog

    PostHog is an open-source platform that helps teams build successful products by providing tools to evaluate feature impact and customer value.

    San Francisco, California, United States
    101-250 employees
    Developer Tools