At Netlify, we’re building a platform to empower digital designers and developers to build better, more elaborate web projects than ever before. We’re aiming to change the landscape of modern web development. Netlify currently serves more than 1,000,000 developers worldwide.
Netlify is a diverse group of incredible talent from all over the world. We’re ~44% women or non-binary, and our team represents more than a fourth as many nationalities as it has members.
We recently raised $63M in Series C funding to bring forward the next generation of tooling for a more accessible web. Among our investors are Andreessen Horowitz, Kleiner Perkins, and EQT Ventures, as well as the founders of GitHub, Slack, Figma and Yelp. This latest round brings Netlify’s total funding to $107M to date.
About the Opportunity
The mission of our SRE team is to use a software engineering approach to architect, design, monitor and scale Netlify’s infrastructure for the next million users. We’re building out the foundation of reliability and observability to help provide global availability for our users. Our team is dedicated to ensuring application resiliency and delivering the compute and network platform at scale. We are a remote-first, globally distributed team and are biased towards asynchronous planning and communication, meaning fewer meetings and more execution. We take documentation seriously and place our values of transparency, empowerment, and commitment at the forefront of everything we do. Beyond just hiring smart, empathetic team members, we foster a culture where there are no dumb questions and our team can get access to the resources that they need to continue to learn. As a remote-first company, diversity drives our identity. Whether you’re looking to launch a new career or grow an existing one, Netlify is the type of company where you can balance great work with great life.
As the manager of our SRE team, your focus will be on leading a team of senior engineers to help build and grow our next-generation platform. Its critical infrastructure serves globally distributed storage and compute demands and will be scaling to handle our growth while meeting our expectations of high availability. Your team will be navigating the challenges and complexity of heterogeneous tools and environments, including multiple cloud platforms. You’ll be a hands-on manager designing, developing and delivering solutions that enhance the scalability, availability, and efficiency of our products. Our tech stack includes (but is not limited to) Kubernetes, AWS, GCP, Kafka, and Golang-based microservices. With our team, you’ll be looking for patterns and ways to increase efficiency, eliminate downtime, optimize costs, and maintain performance at scale.
What you’ll bring:
- Experience leading engineering teams responsible for 24x7 high volume, highly available systems
- Proficiency with one or more cloud providers and the ability to plan the growth of our infrastructure
- Experience with configuration management and continuous integration/delivery tools
- Experience setting strategic vision, owning and resolving issues that impact design, product success, or address future concepts, products, or technologies
- Experience working across teams to meet their goals for service reliability, availability, and efficiency
- Solid understanding of logging platforms and application performance metrics
- Passion for mentoring, nurturing, and growing a team of SREs
- Rich knowledge of databases and Unix environments
- Security and compliance experience (SOC, PCI, GDPR)
- History of managing globally distributed teams
- Located in a North/South American time zone (UTC -4 to -7)
Within 1 month, you’ll:
- Begin the journey of understanding the complexities around our business, customer, and engineering needs. We believe strongly that it’s essential for you to take the time to become familiar with our space & how we operate!
- Have one-on-ones and pairing sessions with some of the people that you’ll be working closely with, including members of the Platform, Product, and leadership teams.
- Identify opportunities for improvement and define a roadmap for closing any gaps
- Learn from the team during weekly syncs
- Present first-time observations about the team, tools, processes, and growth opportunities
Within 3 months, you’ll:
- Establish strong async communication rhythms with your peers and leaders, practicing transparency and visibility in your progress against areas of focus
- Gain a more robust understanding of the needs of the platform and become more comfortable with diagnosing issues
- Develop a long-term roadmap for the team, product reliability and cloud infrastructure
- Drive multiple cross-team projects scaling our cloud infrastructure and building product observability and reliability
Within 6 months, you’ll:
- Elevate the work of the team and become a subject matter expert in the reliability roadmap for the product
- Introduce new frameworks and tools to help optimize and elevate the work of the team
- Demonstrate ability to organize around multiple ongoing streams of work
- Fortify relationships with cross-functional players in your squad
- Work across teams to manage SLOs
- Participate in helping us grow the team by conducting interviews and partnering with leadership to strategize future hiring needs
- Ensure a sustainable team pace for the long haul, working on growth planning as needed
Within 12 months, you’ll:
- Have shaped how we view reliability here at Netlify and contributed to us becoming the leader in reliability
- Have defined a set of best practices on production readiness and implemented a new path-to-production process
- Have attended a conference with our training budget to help expand your knowledge base
Of everything we've ever built at Netlify, we are most proud of our team.
We believe that empowered, engaged colleagues do their best work. We’ll be giving you the tools you need to succeed and looking to you for suggestions to improve not just your daily job, but every aspect of building a company. Whether you work from our main office in San Francisco or you are a remote employee, we’ll be working together a lot: pairing, collaborating, debating, and learning. We want you to succeed! About 60% of the company is remote across the globe; the rest are in our HQ in San Francisco.
To learn a bit more about our team and who we are, make sure to visit our about page.
Not sure you meet 100% of our qualifications? Please apply anyway!
When applying, please include a resume or a short listing of your job history & skills (a link to a LinkedIn profile would be fine). A cover letter explaining why you would enjoy working in this role and why you’d like to work at Netlify would be great, though it is not required & will not impact your application. When we receive your application, we’ll get back to you about the next steps.
Netlify is an Equal Opportunity Employer. We are devoted to building a team of people with diverse backgrounds and lifestyles. We believe that the unique contributions of all Netlifolks are the driver of our success. We are all responsible for bringing on people from all walks of life. Driving equality empowers our team, enables us to innovate, and helps us maintain a more inclusive environment. We don’t discriminate against employees or applicants based on gender identity or expression, sexual orientation, religion, age, race, military/veteran status, citizenship, pregnancy status, or any other differences. If we can do anything to provide a better interview, e.g. accommodate a disability, then please let us know.