Pendo provides a product cloud that aims to improve the software experience for thousands of customers and millions of their end users every day through analytics and data insights. Because we process more than 10 billion events per day, Pendo services must operate resiliently and efficiently at large scale.
Our platform is built on Google App Engine and Google Kubernetes Engine (GKE) and utilizes several Google technologies such as BigQuery, Memorystore, Cloud Datastore, PubSub, and Cloud Functions. We are looking for talented engineers to help keep our production environment running smoothly as we continue to grow.
As a Site Reliability Engineer, you will:
- Write high-quality infrastructure-as-code that automates the provisioning, deployment, scaling, and monitoring of Pendo’s infrastructure to ensure that it is reliable and performant
- Write maintainable code for product functionality with a primary emphasis on operations, scale, resiliency, and monitoring
- Work with other engineers to ensure that new services are well designed and properly monitored, with well-defined SLIs and achievable SLOs
- Debug production issues, learn to mitigate them quickly, and find ways to prevent them
- Maintain runbooks for manual tasks and replace those runbooks with automation whenever possible
- Proactively track our capacity, quotas, and other performance limits to plan for growth
- Participate in a 24x7 on-call rotation to handle product availability issues as well as urgent customer support escalations
Projects you might work on:
- Implement customer DNS configuration and functionality in the Pendo product
- Write custom scaling metrics for GKE
- Code infrastructure automation via Terraform for new PubSub topics and subscriptions
- Contribute to open source Terraform providers when new resources have not yet been implemented
- Develop migration plans for moving existing Google App Engine services to GKE
Qualifications (what you have):
- At least 3 years of experience as a DevOps Engineer and/or Site Reliability Engineer, preferably for a cloud product
- Ability to debug and optimize code and to automate routine tasks
- Experience designing, analyzing, and troubleshooting distributed systems
- Experience working with cloud infrastructure using tools such as Ansible or Terraform
- Strong programming skills in a language such as Go or Python, and a willingness to learn new languages as needed
- Ability to think and talk about systems in terms of possible failure modes, bottlenecks, etc.
- An urge to document all the things so the team doesn't need to learn the same thing twice
- Good number sense for discussing performance analysis, cost analysis, and operational metrics
- A tendency to always leave things better than you found them
Pink, Perks, and Such:
Pendo was founded in 2013 by former product managers, who combined their heads and hearts to build something they wanted but never had as product managers -- a simple way to understand and attack what truly drives product success. Our mission is to improve society's experience with software.
Come join one of the fastest-growing startups, supported by best-in-class institutions like Battery Ventures, Salesforce Ventures, Spark Capital and Meritech. You will gain experience in a diverse and exciting set of technologies and clients and have a real impact on Pendo's future. Our culture is passionate, dynamic, and fun.
- Company Equity
- Health benefits 100% covered for your entire family, plus 100% dental and vision coverage for employees
- Open vacation policy
- Free weekly lunches and a fully stocked kitchen with drinks, goodies, and balanced snacks
- Frequent company and team-building events
- Free parking or monthly stipend for other modes of transportation (biking, walking, do you skate?)
- Lots of company swag...hope you like pink!
We are an equal opportunity employer and believe having diverse teams in which everyone brings their whole self to Pendo is key to our success. We welcome people of different backgrounds, experiences, abilities and perspectives.