"The front page of the internet,” Reddit brings over 330 million people together each month through their common interests, inviting them to share, vote, comment, and create across thousands of communities. Come for the cats, stay for the empathy.
Reddit is poised to innovate and grow faster than at any other time in its history. This is a unique opportunity to leave your mark on one of the most influential and trafficked corners of the internet.
As a data engineer, you’ll build the systems, services, and infrastructure tools needed to fuel the next wave of Reddit’s growth. You’ll bring your experience with complex distributed systems, passion for performance and optimization, and ability to write highly scalable, fault-tolerant code to a team of top engineers.
With our small team, your work will directly impact hundreds of millions of users around the world. Join us and help build the future of Reddit!
Refine our data infrastructure technologies to support real-time analysis of millions of users.
Build analytics tools utilizing the data pipeline to provide actionable insights for our product and data science teams.
Continuously evolve the data model and data schema based on business and engineering requirements.
Own the core company data pipeline and scale our data processing flow.
2+ years of experience building clean, maintainable, and well-tested code.
2+ years of experience with large-scale, distributed, real-time systems, using tools such as Kafka, Flink, or Spark.
Expertise with Python and/or Scala.
Bonus points for experience with (or desire to learn) Kubernetes.
Experience with ETL design, including both implementation and maintenance.
Excellent communication skills to collaborate with stakeholders in engineering, data science, and product.