- Design and grow the Shopee data platform to support a variety of big data applications using open-source technologies, including Kafka, Hadoop, Presto, HBase, Spark, Hive, and Druid, as well as our own creations. Examples include a real-time data streaming platform, a unified query platform, a cluster management system, and a machine learning platform.
- Dig into the source code of open-source big data systems to gain full control over them and become familiar with their details, configurations, and designs. Develop and maintain internal releases of big data systems and components according to business requirements.
- Closely and comprehensively monitor all deployments of these systems: maintain stability, improve performance, identify performance bottlenecks, track and troubleshoot issues, and optimize costs.
- B.Sc. / M.Sc. / PhD in Computer Science or a related technical field
- 2+ years of software development experience under Linux / Unix in at least one of these languages: Java, Scala, Python, C/C++. Scala is a plus
- Familiar with big data infrastructure technologies such as distributed file systems, distributed computing, and distributed databases
- Familiar with at least one of these systems: Hadoop, Spark, Kafka, Presto, or other big data systems
- Being a contributor, committer, or PMC member of an open-source big data system is a plus
- Love to use and develop open-source technologies
- Excited to work intimately with data
- Passionate, self-motivated, and takes ownership