Whip Media is transforming the global content licensing ecosystem with a market-leading enterprise software platform that centrally connects data, processes, and teams throughout the digital distribution journey. Powered by predictive insights and proprietary data, we enable the world’s top entertainment organizations to efficiently distribute, control, and monetize their TV and movie content to drive revenue and direct-to-consumer growth.
We are looking for a Principal Data Engineer to join our growing data engineering & machine learning team, which is focused on building out our suite of enterprise products. This means leading projects that set up data ingestion from external data partners, turn machine learning prototypes into production-quality scoring APIs, and transform our raw data into enterprise data sets via ETL and APIs.
Our ideal candidate is someone who drives high-quality engineering practices and minimizes technical debt and risk while developing complex and emerging technologies. You love to play with the latest and greatest technologies, yet you still deliver reliable and efficient systems.
What will you do?
Lead engineering projects (architecting and building) centered on our enterprise data platform, including:
Software advancing our master data management systems and entity resolution, including interfacing with machine learning teams and building human-in-the-loop (HIL) interfaces (30% of time)
Feature generation pipelines and scoring APIs that carry machine learning projects from research to production (30% of time)
Data ingestion frameworks that reliably expose data from external and internal sources to data scientists and enterprise products (30% of time)
Work with the data quality team to measure and alert on potential data quality issues
Leverage Spark, Kubernetes, Kafka, and other data technologies to architect and build core components of our enterprise data solutions.
Champion quality engineering and sound engineering practices within the data engineering team (10% of time)
What do you need?
6+ years in formal engineering environments
4+ years working on large-scale data projects; experience productionizing ML systems preferred
2+ years of experience with Spark and/or the Hadoop ecosystem
Knowledge of streaming platforms like Kafka
3+ years working in Python
Deep understanding of technical debt in data systems
Familiarity with data governance and data warehousing techniques preferred
BS or MS degree in Computer Science, Math, Statistics or a related technical field