Volastra Therapeutics, Inc. is a biotechnology company based in New York City, formed from the laboratory of Dr. Lewis Cantley, along with Drs. Olivier Elemento and Samuel Bakhoum.  Volastra is dedicated to the discovery and development of treatments for patients with cancer.  The Company’s therapies will target novel pathways and apply insights gained from its scientific founders into chromosomal instability (CIN) and its clear association with the formation, progression, and maintenance of disease.  Leveraging these insights, we will identify and quickly validate novel therapies to shift the treatment paradigm in the toughest-to-treat cancers.

Volastra has raised $45M in funding from top US and European venture firms, including Vida, Polaris, Droia, and Arch.  Current funding will be used to advance a lead program into the clinic, advance another program to IND-enabling studies, and strengthen the platform to identify a broader pipeline of candidates. The company has a world-class scientific advisory board, including two Nobel Laureates, as well as a highly experienced senior leadership team.  This team includes the CEO, CSO, and senior leaders with expertise in chemistry, biochemistry, biology, immunology, and operations.

The company operates out of 15,000 square feet of office and state-of-the-art laboratory space in West Harlem, New York City, within easy traveling distance of Columbia, Cornell, Memorial Sloan Kettering, and most other areas of the city and New Jersey.

Please visit for more information


The Research Associate/Senior Research Associate, Data Engineering will work closely with the Head of Data Science and the entire scientific team to advance the company’s integrative, data-centric approach to understanding chromosomal instability in metastatic cancers.  If you enjoy the prospect of improving the lives of cancer patients using advanced computational methods, this may be the role for you.

In your first 6 months, you will have the opportunity to materially impact the Volastra pipeline with data engineering support for target identification, patient selection, and research prioritization.  We rely on both proprietary and external data from genomics, pre-clinical and clinical microscopy, chemoinformatics, proteomics, and more. We use cutting-edge data storage and access tools to operate in such a content-rich but high-variance data ecosystem.

You’ll design unique data storage and access architectures, enabling our data science, biology, and chemistry teams to arrive at the best decisions faster.  You’ll be given the chance to explore novel techniques, including multi-modality data fusion, all while improving the performance of our laboratory data generation systems.  We enjoy working with ever larger and more complex data sets to improve human health, and hope you do too!

You will also:

  • Work collaboratively with the scientific team including biologists, computational biologists, data scientists, and chemists to answer complex questions in a data-driven manner.
  • Contribute to building a culture that embraces technical excellence and integrity with a sense of collaboration internally and externally.
  • Clearly communicate key findings to members of the scientific and leadership teams.

You have:

  • Experience identifying data bottlenecks, and a track record of architecting and implementing solutions to them
  • Developed end-to-end batch and real-time data pipelines, and launched them into production
  • Expertise with cloud computing, preferably AWS (EC2, S3, Redshift, DynamoDB, Neptune, etc.)
  • Developed new tools to help end-users consume necessary data more quickly
  • Experience maintaining and troubleshooting existing data pipelines
  • Familiarity with novel data store solutions beyond SQL, such as key-value stores (Bigtable, DynamoDB) and graph databases (Neptune, Neo4j)
  • 3+ years of previous work experience in data engineering roles, preferably related to biological, chemical, or medical data types
  • A passion for correct, reproducible, reusable, and well-documented code, preferably in Python, R, or similar language(s)
  • Some experience in analyzing multimodal datasets
  • Some experience with big data tools, e.g. Hadoop, Spark
  • (Preferred) A graduate degree in a quantitative discipline (statistics, computer science, mathematics, engineering, etc.)
