Help Shape the Future of Finance
Pagaya is a financial technology company working to reshape the lending marketplace for investors by using machine learning, big data analytics, and sophisticated AI-driven risk analysis. With its current focus on consumer credit and real assets, PAGAYA’s proprietary suite of solutions and pipelines to banks, fintech lenders, and others was created to actively find greater value for institutional investors. PAGAYA’s models add value to that pipeline as well, by increasing liquidity and, in turn, increasing opportunities for access to credit.
We move fast and smart, identifying opportunities and building end-to-end solutions from AI models and unique data sources to new business partnerships and financial structures. Every PAGAYA team member is solving new challenges every day in a culture based on collaboration and community. We all make an impact regardless of title or position.
The company was founded in 2016 by seasoned finance and technology professionals, and we are now 400+ strong in New York, Tel Aviv, and LA. You will be surrounded by some of the most talented, supportive, smart, and kind leaders and teams—people you can be proud to work with!
- Continuous Learning: It’s okay to not know something yet, but have the desire to grow and improve.
- Win for all: We exist to make sure all participants in the system win, which in turn helps Pagaya win.
- Debate and commit: Share openly, question respectfully, and once a decision is made, commit to it fully.
Software is fundamental to research. From the humanities to physics, biology to archaeology, software plays a vital role in generating results. The Data Engineering group is a cross-functional team responsible for all data activities, including integration, monitoring, quality, and accessibility.
The Big Data Engineer will work on a variety of data projects. This includes orchestrating pipelines using modern Big Data tools and architectures, as well as the design and engineering of existing transactional processing systems.
- Build out and operate our foundational data infrastructure, including storage (cloud data warehouse, S3 data lake), orchestration (Airflow), and processing (Spark, Flink).
- Create robust, automated pipelines to ingest and process structured and unstructured data from source systems into analytical platforms, using batch and streaming mechanisms and a cloud-native toolset.
- Develop and maintain data lake and data warehouse schematics, layouts, architectures, and relational/non-relational databases for data access and Advanced Analytics.
- Conduct data profiling, cataloging, and data mapping for technical design, using a use case-based approach that drives the construction of technical data flows.
- Leverage the right tools for the right job to deliver testable, maintainable, and modern data solutions.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Work with other members of the data group, including data architects, data analysts, and data scientists.
- Use the tools and languages best suited to the job, with complete flexibility in problem-solving; novelty and creativity are encouraged.
- Open source projects and frameworks recommended.
- Work with a team of highly motivated, bright, fun and creative people.
- Bring your intellectual curiosity and hard work to our culture of knowledge sharing, transparency, and shared fun and achievement.
- Contribute to our software engineering culture of writing correct, maintainable, elegant, and testable code.
- 3+ years' aggregated experience using Hadoop / Spark with Python.
- A data-oriented mindset.
- Experience using data tools and frameworks like Airflow, Spark, Flink, Hadoop, Presto, Hive, or Kafka.
- Experience with AWS cloud services: EC2, RDS, ECS, S3.
- Deep understanding of ETL, ELT, and data ingestion/cleansing, backed by strong engineering skills.
- Experience building and designing large-scale applications.
- Strong analytic skills related to working with unstructured datasets.
- Willingness to get your hands dirty, understand a new problem deeply, and build things from scratch when they don't already exist.
- Undergraduate degree in Computer Science, Computer Engineering, or similar disciplines from rigorous academic institutions.
Any of the below would be an advantage:
- Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Familiarity with operating systems, especially UNIX, Linux, and Mac OS.
- Experience supporting and working with cross-functional teams in a dynamic environment.