Magic Leap is an eclectic group of visionaries, rocket scientists, wizards, and gurus from the fields of film, robotics, visualization, software, computing, and user experience. We are growing quickly, and this is the time to get on board and play a role in shaping the way people will be interacting with the world tomorrow.
As a Lead Site Reliability Engineer, you will work as part of our Ecosystem team to ensure the safe, swift, and reliable delivery of services to our customers. This role combines software and systems engineering to deliver highly scalable, distributed, fault-tolerant systems. You enjoy creating solutions to operations problems. You have a holistic knowledge of our systems and services, can re-engineer processes when they need it, and can clearly communicate the necessary changes. You understand how various development teams operate and are able to reduce the effort it takes them to deliver new services. You have the grace to stay calm when production services are down and the courage to ask the right people for help to bring them back up. You enjoy collaborating with people from other teams and disciplines to make plans a reality.
Leads and mentors the other SRE team members and helps provide technical direction on complex engineering projects
Works with management to help break up engineering initiatives into smaller tasks for the SRE team
Develops solutions to increase service stability through automation and process re-engineering
Builds and supports tools and systems that other software engineers use to deploy their software into production
Participates in rotating on-call duties as part of a global 24x7x365 team
Updates job knowledge by studying state-of-the-art tools and techniques; participating in educational opportunities; reading professional publications; maintaining personal networks; participating in professional organizations
Helps development teams operationalize their efforts to enable self-ownership of production services
Supports and develops colleagues by providing advice and coaching
10+ years’ experience working in a software engineering or development role
Sound fundamentals in UNIX-based systems, including proficiency with UNIX tools such as SSH, grep, sed, awk, and find
A solid understanding of networking and core Internet protocols (e.g. TCP/IP, DNS, TLS, SMTP, HTTP)
Strong programming skills in a modern language (e.g. Go, Java, Node.js, Ruby)
Ability to script in a shell language (Bash or POSIX Shell)
Experience with public cloud providers (AWS, Google Cloud Platform, etc.)
Experience working with containers (Docker, Kubernetes, ECS, etc.)
Comfort with frequent, incremental code testing and deployment