Braze (formerly Appboy) is a customer engagement platform that delivers messaging experiences across push, email, apps, and more. Braze is built specifically for today’s mobile-first world and tomorrow’s ambient computing future. Braze is set apart as the platform that allows for real-time and continuous data streaming, replacing decades-old databases that aren’t built for today’s on-demand, always-connected customer. With data, technology, and teams working together in unison, the Braze platform makes marketing more authentic, brands more human, and customers more satisfied with every experience.
Each month, tens of billions of messages associated with over 1.5 billion active users are managed through our technology. Braze is a venture-backed company with hundreds of employees in offices located in New York City, San Francisco, London, and Singapore. Most recently, we’ve been named a Leader in the Forrester Wave™: Mobile Engagement Automation, Q3 2017 evaluation. We’ve also been recognized by the Forbes Cloud 100 at #85, ranked #225 on Inc.'s 500 Fastest Growing Private Companies, named a “Top 10 Upstart” by Business Insider, and ranked #21 on the Deloitte Technology Fast 500 list. Learn more at Braze.com.
WHAT YOU'LL DO
The Site Reliability Engineering (SRE) team at Braze provides guidance, expertise, mentorship, and education to the entire Engineering organization on how to build, test, monitor, and deploy massively scalable applications. As a member of this team, you will develop a deep, fundamental understanding of how the applications you are responsible for interact with the underlying infrastructure, and of how to translate that understanding into more efficient, scalable application code. You will also define the standards for "production" by working with Engineering teams to establish and implement testing frameworks for both the applications and the infrastructure they run on. These standards will be critical for defining the applications' Service Level Objectives (internal and external) and for meeting those objectives. To be successful on this team, you will need to move seamlessly between system administration and writing the code that impacts those systems, with the goal of providing reliability and uptime at massive scale.
The primary responsibilities of an SRE are to:
• Define and enable standards for configuration, monitoring, reliability, and performance
• Evolve services and educate engineers to create a culture of reliability and velocity
• Support and improve services from inception through development and production by planning for scale and reliability
• Solve live performance and reliability issues and prevent their recurrence
• Scale services sustainably through the development of internal tools and automation
• Pair with other SRE / DevOps engineers to plan for future capacity and infrastructure needs
• Practice sustainable incident response and blameless postmortems
WHO YOU ARE
• Comfortable working in a highly collaborative environment
• Strong communication skills
• Interest in designing, optimizing and troubleshooting large-scale services
• Conviction and curiosity, with a knack for troubleshooting hard problems
• Ability to learn rapidly in high-stress situations and implement changes based on what you learn
• Strong familiarity with containers and container orchestration (Kubernetes, ECS, etc.)
• Experience using automation (Chef, Puppet, etc.) to make services more sustainable
• Experience in developing, debugging and optimizing code (Java, Python, Go, Perl or Ruby)
WHAT WE OFFER
Tech startup vibe including free daily lunches, snacks, and group events. Inclusive and diverse culture. Complete support from your teammates across all departments and a real “get it done” attitude for our customers. An opportunity to join a market-leading company and have an impact.
- Excellent medical insurance and life insurance coverage for you and your dependents
- Matching 401K
- Tuition reimbursement program
- Daily lunches, snacks, and beverages
- Collaborative, transparent, collegial, and fun-loving office culture
- Flexible time off policy to balance your work and life in the way that suits you best
In addition, this position is exempt under the provisions of the Fair Labor Standards Act.