The Chan Zuckerberg Initiative was founded by Priscilla Chan and Mark Zuckerberg in 2015 to help solve some of society’s toughest challenges — from eradicating disease and improving education to addressing the needs of our local communities. Our mission is to build a more inclusive, just, and healthy future for everyone.
CZI supports the science and technology that will make it possible to cure, prevent, or manage all diseases by the end of this century. While this may seem like an audacious goal, in the last 100 years, biomedical science has made tremendous strides in understanding biological systems, advancing human health, and treating disease.
We build open source software tools and leverage artificial intelligence to accelerate the pace of biomedicine. We fund scientific research worldwide to advance the frontiers of knowledge. And we launched a family of institutes to do research that can’t be done in conventional environments. Each aspect is essential to our approach to building for the long term.
CZI’s work in science includes grantmaking programs, open-source software development, and close collaboration with its partner institutes at the Chan Zuckerberg Biohub Network. The CZ Biohub Network includes the San Francisco, Chicago, and New York Biohubs as well as the Chan Zuckerberg Imaging Institute. CZI also collaborates with institutional partners like the Kempner Institute for the Study of Natural & Artificial Intelligence at Harvard University. Join us in accelerating science.
The AI/ML team is funding and building one of the largest computing systems dedicated to nonprofit life science research in the world. This new effort will provide the scientific community with access to predictive models of healthy and diseased cells, which will lead to groundbreaking new discoveries that could help researchers cure, prevent, or manage all diseases by the end of this century.
The AI/ML and Data Engineering Infrastructure organization builds shared tools and platforms for use across all of the Chan Zuckerberg Initiative, partnering with and supporting the work of a wide range of Research Scientists, Data Scientists, and AI Research Scientists, as well as a broad range of Engineers focusing on Education and Science domain problems. Members of the shared infrastructure engineering team have an impact on all of CZI's initiatives by enabling the technology solutions used by other engineering teams at CZI to scale. A person in this role will build these technology solutions and help cultivate a culture of shared best practices and knowledge around core engineering.
What You'll Do
- Provide technical leadership in designing and building efficient, stable, performant, scalable and secure AI/ML and Data engineering solutions.
- Take a hands-on approach to the architecture and build of our AI/ML compute infrastructure tooling, both as a Principal Engineer and System Architect. This includes design and coding across the platform for everything from streaming data systems based on Apache Kafka, to the complex multi-modal Data Hub metadata management system we are building, to the containerized infrastructure we are building using Kubernetes in support of our various heterogeneous and distributed AI/ML environments.
- Provide insight and guidance for overall systems integration and architecture approaches for our containerized AI/ML platform components.
- Drive the technical design for how we extend our cloud-based AI/ML platform to encompass our current cloud-based Databricks Ray on Spark, Weaviate vector databases, and containerized Ray on Kubernetes, with a workflow that extends to running pre-training, AI training, fine-tuning, and inference on hosted cloud GPU compute services.
- Provide technical leadership for the AI/ML and Data Engineering team in delivering and integrating our AI/ML platform with the various research teams across CZI, CZIF, and our CZ Biohub Network partners, and evangelize and educate our partners on AI lifecycle best practices for making use of our advanced AI platform tooling and dataset curation systems as we collaborate with them.
- Guide cross-functional integration and our approach to making optimal use of shared infrastructure, empowering our AI/ML efforts with a world-class GPU compute cluster and other compute environments such as our AWS-based services.
- Dive into deep-stack, complex coding challenges that you and the team will undertake in areas such as scaling data engineering applications in support of LLM pre-training and optimizing LLM code and workflows in support of AI tuning, training, and inference.
- Provide expertise across multiple engineering disciplines - Data Infrastructure, AI Compute Platform Infrastructure, Security Engineering, Data Governance, System/Data Compliance programs (SOC2 for example), and Scalable Containerization Platforms.
What You'll Bring
- BS or MS degree in Computer Science or a related technical discipline or equivalent experience
- 8+ years of relevant coding experience
- 8+ years of systems Architecture and Design experience, with a broad range of experience across Data, AI/ML, Core Infrastructure, and Security Engineering
- Proficiency with Amazon Web Services (AWS), Google Cloud Platform (GCP), or Microsoft Azure, and experience with On-Prem and Colocation Service hosting environments
- Demonstrated ability with a scripting language such as Python, PHP, or Ruby
- Proven coding ability with a systems language such as C, C++, C#, Go, Rust, Java, or Scala
- AI/ML Platform Operations experience in an environment with demanding data and systems platform challenges, including large-scale Kafka and Spark deployments (or their counterparts such as Pulsar, Flink, and/or Ray) as well as workflow scheduling tools such as Apache Airflow, Dagster, or Apache Beam
- Scaling containerized applications on Kubernetes, including expertise with creating custom containers using secure AMIs and continuous deployment systems that integrate with Kubernetes.
- Working knowledge of Nvidia CUDA and AI/ML custom libraries.
- Knowledge of Linux systems optimization and administration
- Deep understanding of Data Engineering, Data Governance, Data Infrastructure, and AI/ML execution platforms.
- HPC and Slurm experience is a strong nice-to-have
The Redwood City, CA base pay range for this role is $241,000 - $362,000. New hires are typically hired into the lower portion of the range, enabling employee growth in the range over time. Actual placement in range is based on job-related skills and experience, as evaluated throughout the interview process. Pay ranges outside Redwood City are adjusted based on cost of labor in each respective geographical market. Your recruiter can share more about the specific pay range for your location during the hiring process.
Benefits for the Whole You
We’re thankful to have an incredible team behind our work. To honor their commitment, we offer a wide range of benefits to support the people who make all we do possible.
- CZI provides a generous 100% match on employee 401(k) contributions to support planning for the future.
- Annual funding for employees that can be used most meaningfully for them and their families, such as housing, student loan repayment, childcare, commuter costs, or other life needs.
- CZI Life of Service Gifts are awarded to employees to “live the mission” and support the causes closest to them.
- Paid time off to volunteer at an organization of your choice.
- Funding for select family-forming benefits.
- Relocation support for employees who need assistance moving to the Bay Area.
- And more!
Commitment to Diversity
We believe that the strongest teams and best thinking are defined by the diversity of voices at the table. We are committed to fair treatment and equal access to opportunity for all CZI team members and to maintaining a workplace where everyone feels welcomed, respected, supported, and valued. Learn about our diversity, equity, and inclusion efforts.
If you’re interested in a role but your previous experience doesn’t perfectly align with each qualification in the job description, we still encourage you to apply as you may be the perfect fit for this or another role.