About the role:
We are looking for a Data Engineer who will be responsible for the following:
- Create and maintain optimal data pipeline architecture.
- Assemble large, complex data sets that meet functional and non-functional business requirements.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
- Work with stakeholders to assist with data-related technical issues and support their data infrastructure needs.
- Work with data and analytics experts to strive for greater functionality in our data systems.
- Run proofs of concept (POCs) for any newly introduced tech stacks or potential version upgrades.
- Perform root cause analysis on internal and external data to answer specific business questions and identify opportunities for improvement.
Requirements:
- At least 5 to 6 years of experience as a data engineer.
- Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
- Strong analytical skills related to working with unstructured datasets.
- Experience building and optimizing ‘big data’ data pipelines, architectures, and data sets.
- Experience with Apache Hadoop and its ecosystem, using Apache Spark as the big data processing paradigm.
- Must have experience with object-oriented/functional programming languages such as Java and Scala.
- Good knowledge of SQL, especially Hive/Impala/Drill.
- Experience in stream-processing systems.
- Working experience with Apache NiFi.
- Experience with big data tools such as Spark, Kafka, etc.
- Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
- Experience with cloud big data offerings such as Redshift and Snowflake.