Senior Software Engineer - Site Reliability Engineering
Braze is a customer engagement platform that delivers messaging experiences across push, email, apps, and more. Braze is built specifically for today’s mobile-first world and tomorrow’s ambient computing future. Braze is set apart as the platform that allows for real-time and continuous data streaming, replacing decades-old databases that aren’t built for today’s on-demand, always-connected customer. With data, technology, and teams working together in unison, the Braze platform makes marketing more authentic, brands more human, and customers more satisfied with every experience.
Each month, tens of billions of messages associated with over 2 billion active users are managed through our technology. Braze is a venture-backed company with hundreds of employees in offices located in New York City, San Francisco, London, and Singapore. We’ve been ranked #85 on the Forbes Cloud 100, #225 on Inc.'s 500 Fastest-Growing Private Companies, and #21 on the Deloitte Technology Fast 500, and The New York Times has recognized us as part of ‘The Next Wave of ‘Unicorn’ Start-Ups’. Learn more at Braze.com.
WHAT YOU'LL DO
Braze is at an inflection point in our maturity, where a key focus of our engineering work is on Scalability, Observability, and Reusability. The mission of the SRE team is to increase confidence in changes to the Braze production environment with a focus on performance and uptime for each service at Braze.
The Site Reliability Engineering (SRE) team at Braze provides guidance, expertise, mentorship, and education to the entire Engineering organization on how to build, test, monitor, and deploy massively scalable applications. The SRE team is the center of excellence for modern operational best practices such as incident management, postmortems, and technical debt management, and its members are the culture champions for the development of clean, reusable, and scalable code. SREs are aligned with product engineering teams, know how their teams' services function, and can work directly in the codebase.
The primary responsibilities of a Senior SRE are to:
- Lead and mentor junior engineers in SRE best practices, software engineering, and agile project leadership.
- Solve live performance and reliability issues and prevent their recurrence.
- Write and review code, educating engineers and building a culture of reliability.
- Practice sustainable incident response and blameless postmortems.
- Define and enable standards for monitoring, reliability, and performance.
- Bridge the gap between our infrastructure and platform engineering teams.
- Support and improve services by planning for scale and reliability.
WHO YOU ARE
- Motivated, proactive achiever, who believes in our FERRIC values.
- Experience leading projects end to end and mentoring junior engineers.
- Experience working in an SRE or DevOps culture, with great communication and organizational skills and a proven ability to partner with other engineering teams.
- Experience implementing and overseeing observability:
  - We use Jira, Git, Jenkins, ELK, Papertrail, Datadog, and Wavefront.
  - However, it’s the mindset that matters, not the specific tools.
- Experience in developing, debugging, and optimizing code at enterprise scale:
  - We are a Ruby/Java/Golang shop but are language-agnostic!
  - We use Redis/Sidekiq for queueing, and Mongo/Postgres for data storage.
  - We use Git/Jenkins/Buildkite to build and deploy.
- Conviction and curiosity, with a knack for troubleshooting hard problems.
- Interest in designing, optimizing, and troubleshooting large-scale services.
WHAT WE OFFER
We have a tech startup vibe with the maturity of a public company, so there are lots of opportunities to add value quickly. We have an inclusive, supportive, and diverse culture. You’ll have complete support from your teammates across all departments and a real “get it done” attitude for our customers. As we get ready to be a public company, we have an organization-wide goal to improve reliability, so SRE is truly valued by the business, with a direct connection to business outcomes and sales goals.
- You’ll get to work on exciting and modern technology, at a truly market-leading scale. We send billions of messages a day, using industry-leading and cloud-native technologies.
- Excellent benefits
- Excellent medical and life insurance coverage for you and your dependents.
- 401k match, tuition reimbursement, and speaking grants.
- Free snacks/beverages, and daily catered lunches.
- Flexible time off, including unlimited vacation and remote work, to balance life and work.
- A collaborative, transparent, fun-loving office culture that’s a ‘best place to work’.
In addition, this position is exempt under the provisions of the Fair Labor Standards Act.