Prefect is the new standard in dataflow automation. Our remote-first company is singularly focused on this vision, and every team member directly contributes to its advancement. Every role solves a problem, and everyone can see exactly how their work helps achieve our mission.
To that end, we've carefully created a positive, high-performance culture - the operating system of our company - that empowers our team to do the best work of their careers and achieve their personal and professional aspirations.
We are looking for folks who want to join a remote-first team to build an equally amazing company and product. In deciding whether to apply for a role at Prefect, consider whether your values align with our values and standards and check out our top-of-the-line benefits and perks.
As a Senior Platform Engineer, you will ensure the observability and reliability of our externally facing production infrastructure. You will collaborate closely with the product development and platform engineering teams to continuously improve system capacity and performance, building infrastructure and eliminating manual work through automation. We maintain a collaborative and blameless environment, focused on learning from incidents and improving the resilience of our production systems.
Prefect’s culture of intellectual curiosity, problem solving, and collaboration will ensure that you have opportunities for professional growth in a supportive and blameless environment.
You will report to our Chief of Staff, Kingsley Blatter.
Expectations (you will):
- Proactively identify opportunities to improve the user experience, both for customers and internal stakeholders, such as other members of the platform engineering team, through projects covering performance engineering, adopting observability tools, and improving automation
- Establish strategies for maintaining a high quality of service, including identifying appropriate service-level indicators and defining suitable error budgets
- Develop a deep understanding of the boundary conditions of system behavior, designing strategies to prevent and mitigate failures
- This role may occasionally include participation in an on-call rotation, which is designed with an emphasis on sustainability
- Embody ownership of the reliability and resilience of our critical systems: contribute and advocate for your ideas, collaborating with stakeholders across the organization to influence practices that improve the operational experience
Qualifications (you have):
- Experience operating and optimizing large-scale distributed systems in production, including tools and techniques like observability and self-healing, to ensure that we can operate our systems reliably and in a sustainable way
- Expertise managing production services in cloud environments, such as Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform (GCP)
Bonus Points (we'd like you to have):
- Experience operating production systems in a high-growth startup environment
- Familiarity with declarative techniques for managing production infrastructure safely with modern Infrastructure as Code tools, such as Kubernetes and Terraform
- A thirst for excellence and passion for the craft of SRE, balanced with pragmatism
- Diverse experiences operating large-scale infrastructure in production
Sarah is a real live person (👋🏻) and is looking forward to learning more about you through your application.
Prefect is an equal opportunity employer and actively encourages applications from people of all backgrounds. All applicants will be considered for employment without attention to race, religion, color, sex (including pregnancy, sexual orientation and gender identity/expression), national origin, disability or any other status protected under applicable federal, state, or local laws.