The Data Engineering Team Lead is responsible for leading the edX data engineering vision and practice. Empowered to shape edX’s data engineering strategy, you will lead our data engineering team to architect and build out data pipelines and analytics infrastructure. You will collaborate closely with our data science and analytics team to prototype and deploy data-driven solutions to business problems. The role combines leadership with hands-on work. This is a critical role for edX, and the person hired will have a major impact on our success.
Help edX find the right mix of talent, technology, and process that allows us to most effectively leverage our data to provide high quality education to everyone, everywhere.
During the first few months you will:
Establish strong working relationships with stakeholders as you build the vision for data engineering at edX
Provide hands-on leadership to a team of 2-4 software engineers to design and start building our next-generation data pipelines and analytics infrastructure. This is a management role in which you’ll also write code.
Lead the team in finding and implementing material process improvements
Use your excellent communication skills to collaborate across organizational boundaries, including providing guidance and support to engineers, data scientists, and other business stakeholders
Demonstrate a “you build it, you run it” mindset of ownership
Hold regular 1:1s with your direct reports in which you’ll provide coaching, mentoring, and empathy. You’ll help them define and follow through on their career growth and professional development needs.
Have regular, meaningful conversations with your own manager about your growth and career development
Have some fun!
Requirements
A passion for data and several years of hands-on experience leading data engineering projects, such as data architecture and data warehouse implementations
Strong focus on practical business outcomes
Experience leading a team of engineers
Experience building data pipelines and ETLs using distributed processing and streaming data tools, such as Hadoop, Spark, Storm, or Kafka
Working knowledge of SQL
Experience designing and implementing multi-terabyte data warehouses in MPP database systems, such as Vertica, Redshift, or BigQuery
Skills in Python or a similar language (we use Python)
Nice to have
Experience deploying machine learning algorithms in production
Experience with clickstream data or learning analytics (xAPI, Caliper)
Experience using business intelligence tools such as Tableau or Power BI to enable consumption of data across an organization