Machine Learning Scientist
Who We Are
Flagship Pioneering conceives, resources, and builds companies across both human health and sustainability. Flagship has created over 100 scientific ventures resulting in >$200 billion in aggregate value, 500+ issued patents, and >50 clinical trials, spanning Moderna Therapeutics, Generate Biomedicines, Indigo Ag, Tessera Therapeutics, and others. We harness science and entrepreneurialism to envision alternative futures, beginning with seemingly unreasonable propositions and navigating to transformational outcomes through an iterative, evolutionary methodology. We call this process “pioneering”.
We are looking for extraordinary computational scientists, engineers, and entrepreneurs to work alongside individuals within the Flagship Ecosystem focused on solving the most impactful challenges in AI across both human health and sustainability. We collaborate, encourage failure, trust one another, and celebrate successful solutions to hard problems. We respect diversity of opinion because we value the freedom to explore hunches.
We believe deep integration of data-driven machine learning with experimental approaches will be a core driver of the next generation of defining companies in health. We aim to upend the traditional approach to molecular discovery towards one characterized by intentionality, programmability, and speed by developing methods for molecular design and generation that can reliably generalize across functions and applications. Modalities with potential across these applications span scientific areas across biology, chemistry, physics, and beyond; we believe that immense impact potential in human health can come from diverse scientific and machine learning backgrounds, and we are open to all profiles with computational excellence.
To this end, we are seeking creative, motivated Machine Learning Scientists to develop and apply our core technologies for ML-enabled molecular generation and ML-enabled Bayesian optimization, active learning, and experiment design in biology. You will join companies and explorations at the early stages of our company creation process to develop innovative methods for molecular generation and modeling, leveraging both in-house and external data to train and evaluate models while also deploying new algorithms into production and integrating deeply into experimental platforms. The successful candidate will work closely with experimental scientists to rapidly advance various scientific programs.
Responsibilities:
- Develop novel machine learning models and algorithms for data-driven molecular design and hone them through deployment on experimental platforms.
- Advance and evaluate the state of the art for machine learning models connecting molecular forms, features, structures, and function, spanning applications such as design of sequence-based molecules, structure prediction, complex prediction, and function learning.
- Develop, advance, and evaluate state-of-the-art machine learning methods for building surrogate models and acquisition functions, spanning Gaussian processes and acquisition functions such as maximum probability of improvement and expected improvement, as well as approaches based on deep learning, variational models, and other areas of continual innovation in the field.
- Use our integrated data platform to devise models able to leverage measured labels “in-the-loop”.
- Work with experimental groups to tailor modeling efforts toward high-impact applications.
- Develop production-quality code in a team setting and plan for deploying and training models at scale.
- Present progress from scientific work in regular research meetings and prepare reports and slide decks for broader internal and external communication.
Requirements:
- PhD in computer science, statistics, or a related field with demonstrated experience applying computational methods to scientific applications
- 3+ years of experience developing machine learning methods to solve scientific problems, with a particular interest in applications to molecular generation, active learning, Bayesian optimization, and/or experimental design, as well as adjacent fields such as biology, chemistry, immunology, or genomics
- Experience developing, debugging, and applying models using modern deep learning frameworks
- Foundational knowledge of Bayesian optimization and experimental planning methods, including methods for uncertainty quantification and probabilistic modeling such as Gaussian processes, variational methods, MCMC techniques, and conformal prediction
- Proficiency in Python and machine learning frameworks such as TensorFlow, PyTorch, and/or JAX
- Energetic self-starter with the ability to work effectively in an entrepreneurial environment
- Excellent analytical skills and ability to synthesize & communicate complex information rapidly and effectively
- Unmatched sense of urgency
- A deep passion for using novel machine learning techniques to unlock new impact potential across health
- No ego
Nice to have:
- Foundational knowledge of probabilistic machine learning and optimization methods
- Practical experience developing deep generative models (e.g., autoregressive models, VAEs, flows, GANs, EBMs, Transformers)
- Publications in major ML conferences or scientific journals that apply ML to problems in the sciences, including but not limited to molecular biology, chemistry, physics, structural biology, genetics, or other key questions that center around molecular prediction, design, and/or generation
- Demonstrated experience developing software in a team setting
- Experience optimizing code for performance
Flagship Pioneering and our ecosystem companies are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, or veteran status.