- Build, test, and maintain data pipelines that support the data science team and customer-facing client teams.
- Track and manage pipeline efficiency and stability.
- Evaluate, parse, clean, and integrate raw data sets, including data from third-party APIs. Help build sophisticated ETL processes around first-party data such as survey data, and second- and third-party data sources such as IP addresses, clickstream data, and movie metadata.
- Provide recommendations for data storage, configurations, data access tools and new technologies/architectures.
- Develop code-based data transformation and aggregation in data lakes and relational databases (the primary use cases), and possibly non-relational databases, as well as for BI tools such as Power BI and Tableau.
- Participate in developing data APIs for data ingestion of NRG data into client-side applications or client-side data systems.
- Assist application developers in the effective use of database query and programming languages.
- Contribute to managing data integrity, data storage efficiency and data ecosystem efficiency.
Who You Are
- Team asset who can describe the data structures, relationships, and flows behind the organization's database servers and applications.
- Key individual the technology team looks to for advice on data organization and planning.
- Up to date on the latest data-related best practices and technologies and always looking to learn more.
- Currently working as a database engineer with solid foundational experience.
- Internally motivated self-starter who continuously strives to get things done, regardless of challenges encountered.
- Critical thinker, able to understand and respond to complex questions or issues that may arise, and able to demonstrate willingness to experiment with new technologies.
- Successfully manages time and multiple competing priorities to ensure deadlines are always met.
- Team player who is able to work collaboratively and initiate and drive projects to completion with minimal oversight.
- 3+ years of experience using Spark for building data pipelines or ETL (PySpark preferred).
- 3+ years of experience in AWS technologies/infrastructure.
- 3+ years of experience in SQL and database technologies.
- PostgreSQL or AWS Redshift experience required.
- A solid foundation in end-to-end development and the desire to further their technical knowledge.
- Working knowledge of Python.
- Knowledge of PII (personally identifiable information) data security standards is a plus.
- Experience with databases structured around survey data is a big plus.
What We Offer
- Payment in USD.
- Free credentials for e-learning platforms.
- Remote workshops & activities.