About the job

This role requires you to develop machine learning infrastructure components with a data-centric mindset to ease iteration, evaluation, and deployment of machine learning models. You will partner with machine learning and backend engineers to deliver scalable, performant, highly available data ingestion and processing pipelines that drive business impact. Because you are constructing the foundation on which our global data infrastructure will be built, you need close attention to detail and a forward-thinking outlook, balanced with scrappiness for present needs. You thrive in a fast-paced, iterative, yet heavily test-driven development environment, with full ownership to design features from scratch and the accountability that comes with it.

Responsibilities:

  • Develop scalable and reliable machine learning infrastructure components to ease iteration, evaluation, and deployment of machine learning models
  • Design data ingestion and processing ETL pipelines for scale and validate their reliability
  • Design and evolve data models according to business and engineering needs
  • Collaborate closely with our machine learning and core backend teams to drive maximum impact across the organization
  • Help ensure compliance with privacy-by-design principles (e.g., design data de-identification pipelines so our systems preserve customer privacy protections)
  • Follow and promote software engineering and data engineering best practices across the organization; stay current with state-of-the-art developments in open-source data engineering frameworks and MLOps
  • Shape the direction of data engineering at Relyance and build a cohesive team culture of ownership, growth, transparency, and customer focus

You are a good fit if you: 

  • Have a track record of delivering scalable and reliable machine learning infrastructure components and data ingestion and processing pipelines 
  • Are comfortable with Python
  • Are a strong believer in data-centric (as opposed to model-centric) MLOps
  • Have strong software engineering skills, and set an example by writing clear, concise, and maintainable code guided by sound design principles and testing practices
  • Have experience designing and evolving data models and ETL pipelines with orchestration and data processing frameworks such as Airflow or Apache Beam
  • Are proficient with public cloud concepts and delivering working solutions on public cloud infrastructure, preferably GCP (BigQuery, Bigtable, Pub/Sub)
  • Have a systematic and goal-directed approach to project management; are comfortable dealing with ambiguity and ruthlessly prioritizing and managing your time with a sense of urgency
  • Thrive in a self-directed environment with full ownership to design features from scratch and the accountability that comes with it
  • Are deeply curious, proactive about continuous improvement, and excited about learning at breakneck speed in a fast-growth environment; are eager to candidly and directly give and receive feedback to improve together as a team
  • Are customer- and mission-driven, motivated by bringing as much value as possible to users and shaping an industry from the ground up
  • Are the ultimate team player: collaborate effectively with others, consistently make time to help your teammates, and are ego-less in the search for the best ideas
