Virta is the first company with a clinically proven treatment to safely and sustainably reverse type 2 diabetes and other chronic metabolic diseases without the use of medications or surgery. Our innovations in nutritional biochemistry, data science, and digital tools, combined with our clinical expertise, are shifting the diabetes treatment paradigm from management to reversal. Our mission: to reverse type 2 diabetes in 100 million people by 2025.
The Site Reliability Team consists of hybrid systems and software engineers who take ownership of managing large-scale infrastructure while improving its reliability and automation. SREs are integrated with the rest of the software engineering team, and we're looking for engineers who want to be a part of developing, maintaining, and scaling infrastructure software.
Our approach to site reliability is that all engineering teams should share the operational responsibility of the systems they build, and the SRE team is responsible for building the platform that allows teams to do that.
Qualifications
- BS degree in Computer Science or related technical field involving coding (e.g., physics or mathematics), or equivalent technical experience
- Experience with algorithms, data structures, complexity analysis and software development
- Ability to diagnose technical problems, debug code, and automate routine tasks
- Analytical approach coupled with solid communication skills and a sense of ownership
- Interest in designing, analyzing and troubleshooting distributed systems
- Experience negotiating SLIs, SLOs, and SLAs with product owners
- Incident response and management experience
Responsibilities
- You will engage in and improve the whole lifecycle of software services—from inception and design, through deployment, operation and refinement.
- You will design the systems and processes that Virta engineers use to manage and deploy their software into production.
- You will support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews.
- You will scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
- You will practice sustainable incident response and blameless postmortems.
- You are expected to bring your voice to the table and partner fully with the rest of engineering to find solutions that meet the needs of the business.
- You will minimize the risk of reliability-related failures affecting durability, availability, performance, and correctness.
90-Day Plan
Within your first 90 days at Virta, we expect you will do the following:
- Review the architecture and set the technical direction for our logging, monitoring, and alerting infrastructure.
- Define the company’s culture and best practices around incident management, blameless postmortems, and capacity planning.
- Set up an SLO framework and work with the rest of the product and engineering team to define SLOs for our critical services.
- Define our technical strategy for disaster recovery and high availability.
- Define goals and set KPI targets for the SRE team.
Virta’s company values drive our culture, so you’ll do well if:
- You put people first and take care of yourself, your peers, and our patients equally
- You have a strong sense of ownership and take initiative while empowering others to do the same
- You prioritize positive impact over busy work
- You have no ego and understand that everyone has something to bring to the table regardless of experience
- You appreciate transparency and promote trust and empowerment through open access to information
- You are evidence-based and prioritize data and science over seniority or dogma
- You take risks and rapidly iterate