    Senior Site Reliability Engineer

    Wikimedia
    Remote
    Senior
    Full Time
    10 days ago
    💰$113,082 - $175,725
    remote, site_reliability_engineer, SRE, DevOps, open_source, senior

    Requirements

    • 6+ years experience in an SRE/Operations/DevOps role as part of a team
    • Experience with shell and scripting languages used in an SRE context (Python, Go, Bash, Ruby)
    • Experience with configuration management tools (Puppet, Ansible)
    • Experience with distributed caching systems including their algorithms and performance optimization
    • Experience with package management on Linux systems (Debian)
    • Strong Linux system-level troubleshooting skills
    • History of automating tasks and processes, identifying process gaps, and finding automation opportunities
    • Strong English language skills (verbal and written)
    • Ability to work independently and as part of a globally distributed team across multiple time zones
    • Experience leading and participating in incident response and post-incident reviews, including root cause analysis and preventive measures

    What You'll Do

    • Performing day-to-day operational/DevOps tasks on Wikimedia’s public-facing infrastructure (deployment, maintenance, configuration, troubleshooting)
    • Implementing and utilizing configuration management and deployment tools (Puppet, Kubernetes)
    • Leading continuous improvement by automating the installation, configuration and maintenance of services on the platform
    • Working closely with product teams to assist in architectural design of new services and making them operate at scale
    • Participating in a 24/7 on-call rotation including incident response, diagnosis and follow-up on system outages or alerts
    • Collaborating with a global, cross-functional team in an asynchronous communication environment
    • Mentoring peers in areas of technical and operational strength

    Nice to Have

    • Experience with Linux kernel tuning
    • Experience with monitoring, metrics and logging infrastructure (Prometheus, Grafana)
    • Developing/contributing to Free and Open Source software or being part of an open-source community
    • Experience with LAMP stack technologies (PHP/HHVM, memcached/Redis); MediaWiki experience is a plus
    • Experience with defining cross-team SLOs and their implementation
    • Experience operating an on-premises filesystem or object store at scale (OpenStack Swift or Ceph)
    • Experience with advanced distributed storage and database systems (Cassandra, MariaDB)

    Benefits

    • Remote-first role
    • Competitive and equitable salary based on skills, experience and location
    • Opportunity to work on a global top-10 website
    • Work in accordance with Wikimedia Foundation values
    • Travel 1-2 times a year for in-person events and team meetings

    About Wikimedia

    The Wikimedia Foundation encourages the development and distribution of free educational content through projects such as Wikipedia.

    San Francisco, CA, US
    500 - 1000
    Media & Entertainment