About the role...

We are looking for a Data Scientist to join our Science organization and be a part of our growing machine learning effort. In this role, you will be a core developer of our deep learning toolkit, focused primarily on protein sequence engineering and discovery using models from natural language processing. You will be part of an interdisciplinary team of molecular geneticists, data scientists, and software engineers working to engineer protein targets for introduction into crop genomes. Preference will be given to those who can work in Cambridge, MA, with consideration given to candidates who will work remotely. If you have a desire to contribute to a world-changing mission, Inari is the place for you!

As a Data Scientist, you will…

  • Be an early contributor to the young but rapidly growing field of applying deep learning to biological sequences.
  • Adapt leading NLP model architectures to a new and data-rich domain, and develop new methods and benchmarks for validation.
  • Design and deploy generative models on public and in-house generated data sets to enable machine-assisted novel protein design. 
  • Keep up to date with NLP and deep learning research in order to proactively identify, assess, and internalize promising methods and tools.
  • Work in multidisciplinary teams to advance the application of deep learning in novel areas, troubleshoot problems, and develop new solutions
  • Develop robust integrations with strategic third-party tools, platforms, and models
  • Work closely with our software engineering team to enable novel functionalities and scaling of approaches
  • Maintain detailed and organized records of your work, the project data you generate, and other information as needed
  • Participate in scientific discussions and present research outcomes to peers and management 

You bring… 

  • A Ph.D. or M.S. in computer science, engineering, statistics, mathematics, computational biology, data science, or related field
  • 2+ years of data science experience including working with neural networks applied to images, language, speech or biological sequence data
  • Extensive experience writing code and analyzing data in Python
  • A basic understanding of common bioinformatics tools and file formats or a strong desire to learn about them
  • Experience with machine learning libraries like TensorFlow and/or PyTorch
  • Desire to work in a mission-driven organization focused on sustainability and how we grow food.
  • Interest in learning new technology or domains. We are an organization that spans many disciplines. 
  • A strong awareness of current deep learning literature and a willingness to test novel applications of these methods on biological data.
  • Ability to rapidly summarize data, communicate results, and act quickly and efficiently 
  • Ability to teach concepts or explain your work to a wide variety of audiences 
  • Ability to work in a fast-paced, cross-functional environment and handle ambiguity gracefully
  • Strong track record of developing creative solutions to complex problems
  • Strategic thinking, willingness to be bold and take risks
  • An efficient and well-organized approach to delivering high-quality results on time
  • A collaborative approach, open to giving and receiving ideas, perspectives and feedback
  • Strong communication skills, both written and oral

Bonus qualifications...

  • Previously worked with genomic data
  • Experience with container technologies: Docker, Kubernetes, Kubeflow
  • Experience working with large-scale data tooling: Beam, Spark, Hadoop
  • Experience with AWS tools: EC2, S3, SageMaker
  • Knowledge of and enthusiasm for biophysics, biochemistry, and biotechnology  
