[As of June 2020, Quora has become a "remote-first" company. This position can be performed remotely from anywhere in the world, regardless of any location that might be specified above.]
The vast majority of human knowledge is still not on the internet. Most of it is trapped in the form of experience in people's heads, or buried in books and papers that only experts can access. More than a billion people use the internet, yet only a tiny fraction contribute their knowledge to it. We want to democratize access to knowledge of all kinds — from politics to painting, cooking to coding, etymology to experiences — so if someone out there knows something, anyone else can learn it. Our mission is to share and grow the world's knowledge, and we're building a world-class team to help us achieve this mission.
About the Team:
About the Role:
- Design, implement, maintain and optimize data pipelines, architectures and data sets.
- Collaborate with data scientists, platform engineers and business partners to understand data needs and drive key data infrastructure decisions.
- Bring your expertise to help model structured & unstructured data. Own these data models at a high level & be a data consultant for partner teams.
- Own the data definitions & lineage across different data platforms, and maintain systems of record for operational and non-operational data stores.
- Engineer reusable capabilities, abstractions & resilience in data pipelines for DML, DDL, ETL & data flows that can be leveraged across teams.
- Be a data mentor & a team player with strong communication, prioritization, and adaptability skills.
- Ability to be available for meetings and impromptu communication during Quora's "coordination hours" (Mon-Fri: 9am-3pm Pacific Time). Members of our Infrastructure Engineering team are not required to work the full coordination hours, but should anticipate that they will need to be available Mon-Fri from either 11am-2pm PST or noon-3pm PST at minimum.
- Proficiency in one or more of the programming languages Python, Java, or Scala, and strong query-authoring skills in SQL.
- Must have 2+ years of experience building data pipelines, including data ingestion, cleaning, processing, transforming, staging & loading.
- Proficiency with big data processing frameworks and tools such as Spark, Flink, Hive, Hadoop, Kafka, EMR, and Presto.
- Operational mindset with the ability to do problem diagnosis, root cause analysis, SLA compliance, performance tuning, and incident management in data infrastructure.
- Experience building data-intensive applications (high velocity/high volume).
- Experience with SQL/NoSQL data stores & data lake operations.
- Flexible and positive team player with outstanding interpersonal skills.
- Passion for Quora's mission and goals.
- Hands-on experience with AWS technologies like S3, Redshift, EMR/EC2, and Athena, or with Snowflake.
- Familiarity with designing and operating a streaming platform (e.g., Kafka, Flink, Spark).
- Data wrangling & data tooling ability.
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.