We’re looking for an experienced data engineer to work on our search and discovery team. There are hundreds of millions of blogs and billions of posts to sort through, and we need someone who can help us surface the very best of what people are looking for, and the cool stuff they didn’t even know they wanted.
What you’ll do:
- Surface the best Tumblr content from 200+ million blogs and 100+ billion posts.
- Apply data mining and machine learning techniques to develop better search, recommendation and content discovery.
- Create tools, define metrics, and conduct experiments on new algorithms and approaches.
- Develop high-performance distributed services and systems for search and discovery.
- Use map-reduce frameworks to generate production data for various search and discovery features.
What we’re looking for:
- Track record of solving big data problems.
- Knowledge of information retrieval, recommendation systems, and relevancy.
- Experience analyzing large datasets with a map-reduce stack such as Hive, Scalding, or Pig.
- Familiarity with Linux-based systems.
Tools we like:
- Java, PHP, Python
- Hadoop, Hive, Pig, Scalding
- Lucene, Solr