Remote
Senior
Full Time
10 days ago
💰$113,082 - $175,725
remote, site_reliability_engineer, SRE, DevOps, open_source, senior
Requirements
- 6+ years experience in an SRE/Operations/DevOps role as part of a team
- Experience with shell and scripting languages used in an SRE context (Python, Go, Bash, Ruby)
- Experience with configuration management tools (Puppet, Ansible)
- Experience with distributed caching systems including their algorithms and performance optimization
- Experience with package management on Linux systems (Debian)
- Strong Linux system-level troubleshooting skills
- History of automating tasks and processes, identifying process gaps, and finding automation opportunities
- Strong English language skills (verbal and written)
- Ability to work independently and as part of a globally distributed team across multiple time zones
- Experience leading and participating in incident response and post-incident review rituals with root cause analysis and preventive measures
What You'll Do
- Performing day-to-day operational/DevOps tasks on Wikimedia’s public-facing infrastructure (deployment, maintenance, configuration, troubleshooting)
- Implementing and utilizing configuration management and deployment tools (Puppet, Kubernetes)
- Leading continuous improvement by automating the installation, configuration and maintenance of services on the platform
- Working closely with product teams to assist in architectural design of new services and making them operate at scale
- Participating in a 24/7 on-call rotation including incident response, diagnosis and follow-up on system outages or alerts
- Collaborating with a global, cross-functional team in an asynchronous communication environment
- Mentoring peers in areas of technical and operational strength
Nice to Have
- Experience with Linux kernel tuning
- Experience with monitoring, metrics and logging infrastructure (Prometheus, Grafana)
- Developing/contributing to Free and Open Source software or being part of an open-source community
- Experience with LAMP stack technologies (PHP/HHVM, memcached/Redis); MediaWiki experience is a plus
- Experience with defining cross-team SLOs and their implementation
- Experience operating an on-premises filesystem or object store at scale (OpenStack Swift or Ceph)
- Experience with advanced distributed storage and database systems (Cassandra, MariaDB)
Benefits
- Remote-first role
- Competitive and equitable salary based on skills, experience and location
- Opportunity to work on a global top-10 website
- Work in accordance with Wikimedia Foundation values
- Travel 1-2 times a year for in-person events and team meetings
