About the role...
We are looking for a Data Scientist to join our Science organization and be part of our growing machine learning effort. In this role, you will be a core developer of our deep learning toolkit, which focuses primarily on engineering gene regulatory sequences and protein sequences. You will work within an interdisciplinary team of molecular geneticists, data scientists, and software engineers to prioritize targets for modification in crop genomes. The role is preferably based in Ghent, Belgium. If you have a desire to contribute to a world-changing mission, Inari is the place for you!
As a Data Scientist, you will…
- Contribute to the novel and rapidly growing field of applying deep learning to biological sequences
- Adapt industry-leading NLP model architectures to a new and data-rich domain, and develop new methods and benchmarks for validation
- Use relevant public and proprietary databases to develop ML models that predict the activity of regulatory sequences and design new synthetic variants
- Keep up to date with NLP and deep learning research in order to proactively identify, assess, and internalize promising methods and tools
- Work with colleagues to troubleshoot and develop effective solutions when problems occur
- Develop robust integrations with strategic third-party tools, platforms and models
- Proactively identify gaps and find solutions to improve the accuracy and efficiency of our data analyses
- Maintain detailed and organized records of your work, the project data you generate, and other information as needed
- Participate in scientific discussions and present research outcomes to peers and management
You bring…
- A BS or MS in computer science, engineering, statistics, mathematics, computational biology or data science
- 2+ years of data science experience including working with neural networks applied to images, language, speech or biological sequence data
- Extensive experience writing code and analysing data in Python
- A basic understanding of common bioinformatics tools and file formats, or a strong desire to learn about them
- Experience with machine learning libraries like TensorFlow and/or PyTorch
- Desire to work in a mission-driven organization focused on sustainability and how we grow food
- Interest in learning new technologies or domains, as our organization spans many disciplines
- A strong awareness of current deep learning literature and a willingness to test novel applications of these methods to biological data
- Ability to rapidly summarize data, communicate results, and act quickly and efficiently
- Ability to teach concepts or explain your work to a wide variety of audiences
- Ability to work in a fast-paced, cross-functional environment and handle ambiguity gracefully
- Strong track record of developing creative solutions to complex problems
- Strategic thinking, willingness to be bold and take risks
- An efficient, well-organized approach to delivering high-quality results on time
- A collaborative approach, open to giving and receiving ideas, perspectives and feedback
- Strong communication skills, both written and oral
- Previous experience working with agricultural and genomic data
- Experience with container technologies: Docker, Kubernetes, Kubeflow
- Experience with large-scale data tooling: Beam, Spark, Hadoop
- Experience with AWS tools: EC2, S3, SageMaker
- Knowledge of and enthusiasm for biophysics, biochemistry, and biotechnology