Remote
Senior
Full Time
29 days ago
💰$113,082 - $175,725
Senior, Site Reliability Engineer, SRE, DevOps, Remote, Open Source, Infrastructure, Linux, Python, Puppet
Requirements
- 6+ years of experience in an SRE/Operations/DevOps role as part of a team
- Experience with shell and scripting languages used in an SRE context (Python, Go, Bash, Ruby; primarily Python)
- Experience with configuration management tools (Puppet, Ansible; primarily Puppet)
- Experience designing and managing infrastructure security for large fleets of diverse services
- Experience with technical response during security incidents
- Experience with package management on Linux systems (Debian)
- Strong Linux system-level troubleshooting skills
- History of automating tasks and processes, identifying process gaps, and finding automation opportunities
- Strong English language skills (verbal and written)
- Ability to work independently and as part of a globally distributed team across multiple time zones
- Experience leading and participating in incident response and post-incident review rituals with root cause analysis and preventive measures
What You'll Do
- Performing day-to-day operational/DevOps tasks on Wikimedia’s public-facing infrastructure (deployment, maintenance, configuration, troubleshooting)
- Implementing and utilizing configuration management and deployment tools (Puppet, Kubernetes)
- Leading continuous improvement by automating the installation, configuration and maintenance of services on the platform
- Working closely with product teams to help bring scalable functionality to users by assisting in architectural design of new services and making them operate at scale
- Participating in a 24/7 on-call rotation including incident response, diagnosis and follow-up on system outages or alerts
- Collaborating with a global, cross-functional team in an asynchronous communication environment
- Mentoring peers in areas of technical and operational strength
- Traveling 1-2 times a year for in-person events and team meetings
Nice to Have
- Experience setting and implementing fleet-wide security policies
- Experience with software supply chain security
- Awareness of the current open source infrastructure security landscape
- Experience working with software security teams
- Experience with credential management systems
- Experience implementing immutable logging and auditing
- Experience with monitoring, metrics and logging infrastructure (Prometheus, Grafana, etc.)
- Developing/contributing to Free and Open Source software or being part of an open-source community
- Experience with LAMP stack technologies (PHP/HHVM, memcached/Redis) and MediaWiki
- Experience with defining cross-team SLOs and their implementation
Benefits
- Remote-first role
- Competitive and equitable salary based on skills, experience and location
- Inclusive and equitable workplace
- Opportunity to work on a global top-10 website and open source software
