Who We Are 
Generate Biomedicines, Inc. is a Flagship-backed, privately held biotechnology company on a mission to reimagine drug discovery as a process of dynamic, data-driven generation. We pursue this audacious vision because we believe in the unique and revolutionary power of generative biology to radically transform the lives of billions, with an outsized opportunity for patients in need. Generate will be successful by constantly turning innovative ideas into methods, technologies, and products that solve some of the most difficult challenges in developing medicines. We are seeking collaborative, relentless problem solvers who share our passion for impact to join us!

Generate was founded by Flagship Pioneering. Flagship Pioneering conceives, creates, resources, and develops first-in-category life sciences companies to transform human health and sustainability. Since its launch in 2000, the firm has applied a unique hypothesis-driven innovation process to originate and foster more than 100 scientific ventures, resulting in over $30 billion in aggregate value. The current Flagship ecosystem comprises 37 transformative companies, including: Moderna Therapeutics (NASDAQ: MRNA), Rubius Therapeutics (NASDAQ: RUBY), Indigo Agriculture, and Sana Biotechnology. 

Position Summary
We are seeking a creative and motivated Data Engineer to help build the protein generation platform required to achieve our ambitious goals. This person will work across the stack to develop, test, deploy, and maintain flows that fuel our protein generation efforts. The successful candidate will work closely with ML scientists, computational biologists, and informatics/IT engineers to implement a scalable platform that rapidly advances our scientific programs.

Key responsibilities:

  • Design, develop, and refine infrastructure for Generate Biomedicines' design platform, enabling rapid prototyping, development, scaling, and productionization of data analysis and modeling workflows.
  • Support and work with cross-functional teams in a dynamic environment to bring continuous improvement to engineering processes and tools.
  • Automate build processes, environment setups, testing scripts, and deployments for large-scale parallel compute.
  • Identify, design, and maintain production CI/CD and ETL pipelines to improve data reliability, efficiency, and quality.

Qualifications:

  • 2+ years of experience working in a DevOps or data engineer role using cloud-based infrastructure such as AWS, GCP, or Microsoft Azure
  • Proficiency in Python and strong object-oriented design skills coupled with a solid understanding of data structures and algorithms
  • Demonstrated experience managing scalable compute infrastructure (EKS/Kubernetes + Docker)
  • Demonstrated experience with CI/CD tools, such as GitHub
  • Familiarity with workflow orchestration tools such as Airflow, Luigi, or Prefect
  • A self-starter attitude and willingness to dive into complicated data engineering challenges
  • Ability to work in a fast-paced environment and strong technical communication skills
Flagship Pioneering and our ecosystem companies are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status.
Recruitment & Staffing Agencies: Flagship Pioneering and its affiliated Flagship Lab companies (collectively, “FSP”) do not accept unsolicited resumes from any source other than candidates. The submission of unsolicited resumes by recruitment or staffing agencies to FSP or its employees is strictly prohibited unless contacted directly by Flagship Pioneering’s internal Talent Acquisition team. Any resume submitted by an agency in the absence of a signed agreement will automatically become the property of FSP, and FSP will not owe any referral or other fees with respect thereto.
