    Senior Site Reliability Engineer, Infrastructure Foundations

    Wikimedia
    Remote
    Senior
    Full Time
    29 days ago
    💰$113,082 - $175,725
    Senior, Site Reliability Engineer, SRE, DevOps, Remote, Open Source, Infrastructure, Linux, Python, Puppet

    Requirements

    • 6+ years of experience in an SRE/Operations/DevOps role as part of a team
    • Experience with shell and scripting languages used in an SRE context (Python, Go, Bash, Ruby; primarily Python)
    • Experience with configuration management tools (Puppet, Ansible; primarily Puppet)
    • Experience designing and managing infrastructure security for large fleets of diverse services
    • Experience with technical response during security incidents
    • Experience with package management on Linux systems (Debian)
    • Strong Linux system-level troubleshooting skills
    • History of automating tasks and processes, identifying process gaps, and finding automation opportunities
    • Strong English language skills (verbal and written)
    • Ability to work independently and as part of a globally distributed team across multiple time zones
    • Experience leading and participating in incident response and post-incident reviews, including root cause analysis and preventive measures

    What You'll Do

    • Performing day-to-day operational/DevOps tasks on Wikimedia’s public-facing infrastructure (deployment, maintenance, configuration, troubleshooting)
    • Implementing and utilizing configuration management and deployment tools (Puppet, Kubernetes)
    • Leading continuous improvement by automating the installation, configuration and maintenance of services on the platform
    • Working closely with product teams to bring scalable functionality to users by assisting with the architectural design of new services and helping them operate at scale
    • Participating in a 24/7 on-call rotation including incident response, diagnosis and follow-up on system outages or alerts
    • Collaborating with a global, cross-functional team in an asynchronous communication environment
    • Mentoring peers in areas of technical and operational strength
    • Traveling 1-2 times a year for in-person events and team meetings

    Nice to Have

    • Experience setting and implementing fleet-wide security policies
    • Experience with software supply chain security
    • Awareness of the current open source infrastructure security landscape
    • Experience working with software security teams
    • Experience with credential management systems
    • Experience implementing immutable logging and auditing
    • Experience with monitoring, metrics and logging infrastructure (Prometheus, Grafana, etc.)
    • Developing/contributing to Free and Open Source software or being part of an open-source community
    • Experience with LAMP stack technologies (PHP/HHVM, memcached/Redis) and MediaWiki
    • Experience with defining cross-team SLOs and their implementation

    Benefits

    • Remote-first role
    • Competitive and equitable salary based on skills, experience and location
    • Inclusive and equitable workplace
    • Opportunity to work on a global top-10 website and open source software

    About Wikimedia

    The Wikimedia Foundation encourages the development and distribution of free educational content through projects such as Wikipedia.

    San Francisco, CA, US
    500 - 1000
    Media & Entertainment