Who are we?

Careem is the leading technology platform for the greater Middle East. A pioneer of the region’s ride-hailing economy, Careem is expanding services across its platform to include payments, delivery and mass transportation. Careem’s mission is to simplify and improve the lives of people and build a lasting institution that inspires. Established in July 2012, Careem operates in more than 130 cities across 16 countries and has created more than one million job opportunities in the region.

We are on a mission to build out a much larger dream data warehouse team here at Careem, and as a team motto we truly believe in the famous “no brilliant jerks” policy, a brief summary of which is outlined below:


On a dream team, there are no “brilliant jerks.” The cost to teamwork is just too high. Our view is that brilliant people are also capable of decent human interactions, and we insist upon that. When highly capable people work together in a collaborative context, they inspire each other to be more creative, more productive and ultimately more successful as a team than they could be as a collection of individuals.

About your new role

  • As a Data Warehouse Engineer at Careem, you'll be part of a team that builds and supports solutions to organize, process and visualize large amounts of data
  • You will be working with technologies such as Hive, Spark, Spark Streaming, Kafka, Python, Redash, Presto, Tableau and many others to help Careem become a data-informed company
  • Some of the problems the team is working on are: Customer 360, ELT engineering, reporting infrastructure, data reliability, data discovery and access management
  • Build and manage ETL workflows that integrate data from many sources into a single source of truth (a minimal sketch of such a workflow follows this list)
  • Design a framework to ensure data is accessible, up-to-date, secure, consistent and complete
  • Work across teams to define organizational needs and ensure business requirements are met
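
To give a concrete flavour of the ETL workflows described above, here is a minimal sketch of a daily extract-transform-load job expressed as an Airflow DAG (Airflow appears later under the requirements). The DAG id, schedule and task bodies are hypothetical placeholders, not a description of Careem's actual pipelines:

    # Minimal ETL sketch using Airflow 2.x; dag_id, schedule and task bodies are hypothetical.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator


    def extract(**context):
        # Pull raw records from a source system (placeholder).
        pass


    def transform(**context):
        # Clean and conform the raw records to the warehouse schema (placeholder).
        pass


    def load(**context):
        # Write the conformed records into the target table (placeholder).
        pass


    with DAG(
        dag_id="example_daily_etl",      # hypothetical name
        start_date=datetime(2021, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        extract_task = PythonOperator(task_id="extract", python_callable=extract)
        transform_task = PythonOperator(task_id="transform", python_callable=transform)
        load_task = PythonOperator(task_id="load", python_callable=load)

        # Run the three steps in sequence: extract, then transform, then load.
        extract_task >> transform_task >> load_task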

You have:

  • 6+ years of experience designing, building and maintaining scalable ETL pipelines
  • A good understanding of data warehousing concepts and modeling techniques
  • Hands-on experience working with real-time data processing using Kafka, Spark Streaming or similar technologies (a minimal streaming sketch follows this list)
  • An understanding of Spark core internals, sufficient to read a Spark Catalyst plan and perform optimizations
  • An understanding of Spark concepts such as RDDs, DataFrames and the Spark APIs
  • Knowledge of data modeling and schema design especially for distributed, column-oriented databases
  • Hands-on experience with workflow processing engines such as Airflow or Luigi
  • The ability to dig deep into ETL pipeline issues, understand the business logic and provide permanent fixes
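
As an illustration of the real-time processing mentioned above, here is a minimal sketch that consumes a Kafka topic with Spark Structured Streaming. The broker address and topic name are hypothetical placeholders, and the sketch assumes the Spark-Kafka connector is available on the classpath:

    # Minimal sketch: consume a Kafka topic with Spark Structured Streaming.
    # Broker address and topic name are hypothetical placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col

    spark = SparkSession.builder.appName("kafka-stream-sketch").getOrCreate()

    # Subscribe to the topic; Kafka delivers key/value as raw bytes.
    events = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "localhost:9092")
        .option("subscribe", "trips")
        .load()
    )

    # Cast the binary value to a string so downstream logic can parse it.
    decoded = events.select(col("value").cast("string").alias("payload"))

    # Write the decoded payloads to the console for inspection.
    query = (
        decoded.writeStream
        .format("console")
        .outputMode("append")
        .start()
    )
    query.awaitTermination()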

Good to have:

  • Experience with CI/CD using Jenkins, Terraform or other related technologies
  • Familiarity with Docker and Kubernetes

 
