Roostify is transforming the mortgage industry with an innovative and integrated platform that’s streamlining the entire digital lending experience. We believe that home lending should be a fair, fast, and transparent experience. Our software is used by banks across the country to improve lending experiences every day. We are a team of innovative thinkers on a mission to reinvent the lending experience so people can accelerate their future.
We are looking for a Senior Leader to head Site Reliability Engineering for the Roostify platform and suite of services. In this role, you will join the Client Services Delivery Team, whose mission is to deliver world-class services to our clients. As the SRE Lead, you will help us continue to raise the bar for excellence in Production Operations.
Join us as a leader, with the strategic goal of establishing and independently running the Site Reliability function.
The role is responsible for the reliability and availability of all Production environments, including their health, ongoing monitoring, and proactive and preventive health assessments.
Transform Operations & influence Engineering practices to achieve the strategic goal of new code deployed in Production frequently via Continuous Delivery (CD) pipelines
The role encompasses handling complex and varied product platforms and multiple Cloud deployment platforms.
Design the SRE function with the goal of providing 24x7x365 coverage
Build and evolve an Operations Model that can handle complexities spanning various cloud-based deployment models and technology partner integrations.
Create & Support a delivery ecosystem that thrives on demonstrating value to stakeholders by adopting highly iterative & Continuous delivery models
Work with the product management team to define Service Level Agreements (SLAs) and Service Level Objectives (SLOs), and implement Service Level Indicators (SLIs) for core capabilities
Collaborate with product and engineering to drive and improve the whole lifecycle of operational readiness - from inception to design, through deployment, operations, and proactive refinement
Influence Architectural and Product decisions with a bias towards Scale, Observability, Monitoring & Stability, and Security
Drive incident management process and support a blameless post-mortem culture
Own and drive high profile customer escalations
Drive and implement lean-ops culture by applying self-service, self-healing, and automation.
Advocate for SRE Principles and collaborate with all Engineering teams to create a DevOps mindset
Responsible for Capacity forecast, Budget & Cost optimization
Define and deliver KPIs and metrics for Operations & Quality to stakeholders: Deployment Frequency, MTTR, Lead Time, etc.
Adopt and evolve internal processes based on industry best practices in SRE
Grow team members through career development, coaching, and mentoring for junior engineers; foster leadership principles and behaviors to groom the next generation of leaders.
SKILLS & EXPERIENCE
Excellent academic background with a Bachelor’s Degree in Engineering
10+ years of experience in Software Engineering and/or Infrastructure Operations, including 4+ years in an SRE role
Ability to work with distributed, multicultural, and diverse teams
Expertise in deploying and supporting Micro-Service based applications, Containerization and Cloud Technologies
Experience with CI/CD tooling: Concourse, Jenkins, Azure DevOps, etc.
Proven experience troubleshooting complex and large cloud environments
Experience with designing, deploying, and maintaining monitoring solutions such as NewRelic, DataDog, Splunk, Prometheus, etc.
Experience developing, running, and/or consuming cloud technologies such as AWS, Azure, Google Cloud Platform, and related tooling: Terraform, configuration management, etc.
Experience with customer escalations and/or operations war rooms
Strong understanding of modern monitoring and logging technologies
Strong analytical skills with a data-driven approach to solving problems
The ability to partner and influence product, engineering, and operations teams is a must
Strong organizational planning and development, business judgment, influential skills, and technical leadership
Experience with Agile methodologies – Scrum, Kanban, etc.
BENEFITS & PERKS
At Roostify we know that people do their best when they feel their best; we care about our people and want them to thrive. Here are some of the benefits we’re proud to offer:
Competitive Salary & Equity Packages
Health, Dental, and Vision Plans
Flexible Vacation Time
Roostify is an Equal Opportunity Employer
At Roostify, People First is one of our values. We strive to provide the best experiences to our employees and candidates. We consider applicants without regard to race, color, national origin, sex, age, religion, sexual orientation, gender identity, veteran status, marital status, physical or mental disability, or other protected classes under all local, state, and federal laws and ordinances. Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.
While Roostify HQ is located in San Francisco, CA, we are open to remote work within the USA for this role.