About the job

Lead the development of ML infrastructure components with a data-centric mindset to ease the iteration, evaluation, and deployment of machine learning models. The role requires you to think critically and design from first principles. You will partner with machine learning and platform engineers to deliver scalable, performant, highly available data ingestion and processing pipelines that drive business impact. Because you are constructing the foundation on which our global data infrastructure will be built, you need to pay close attention to detail and balance a forward-thinking outlook with scrappiness for present needs. You are very comfortable learning new technologies and systems. You thrive in a fast-paced, iterative, but heavily test-driven development environment, with full ownership to design features from scratch to impact the business and the accountability that comes with it.

Responsibilities:

  • Hire, manage, and lead Relyance AI’s data engineering and MLOps team, and design and architect its systems
  • Develop scalable and reliable machine learning infrastructure components to ease iteration, evaluation, and deployment of machine learning models
  • Help ensure compliance with privacy-by-design principles (e.g., design data de-identification pipelines so that our systems preserve customer privacy protections)
  • Design data ingress, egress, and processing ETL pipelines for scale and validate their reliability. Design and evolve data models according to business and engineering needs
  • Design and develop data pipelines for web scraping and for data hygiene and quality checks
  • Collaborate closely with our machine learning and core platform teams to drive maximum impact across the organization
  • Follow and promote software engineering and data engineering best practices across the organization; keep up to date with state-of-the-art developments in open-source data engineering frameworks and MLOps
  • Shape the direction of data engineering at Relyance and build a cohesive team culture of ownership, growth, transparency, and customer focus

You are a good fit if you: 

  • Have an exceptional track record of delivering scalable and reliable machine learning infrastructure components and data ingestion and processing pipelines (7+ years)
  • Are a strong believer in data-centric — as opposed to model-centric — MLOps
  • Have strong software engineering skills, and set examples by writing clear, concise, and maintainable code considering design principles and applying sound testing practices
  • Have experience in designing and evolving data models and ETL pipelines with job orchestration tools like Airflow, Prefect, or Apache Beam
  • Are proficient with public cloud concepts and have delivered working solutions on public cloud infrastructure, preferably GCP (BigQuery, Bigtable, Pub/Sub)
  • Have a systematic and goal-directed approach to project management; are comfortable dealing with ambiguity and ruthlessly prioritizing and managing your time with a sense of urgency
  • Thrive in a self-directed environment with full ownership to design features from scratch to impact the business and the accountability that comes with it
  • Are deeply curious, proactive about continuous improvement, and excited about learning at breakneck speed in a fast-growth environment; are eager to candidly and directly give and receive feedback to improve together as a team
  • Are customer- and mission-driven, motivated by bringing as much value as possible to users and shaping an industry from the ground up
  • Are the ultimate team player: collaborate effectively with others, consistently make time to help your teammates, and are ego-less in the search for the best ideas
