You will work with a cross-functional, multi-location team on a one-of-a-kind project. Alongside the developer, live ops, infrastructure, data science, and analytics teams on the tech side, you will also work with business teams such as publishing to deliver the latest game content and the best game experience to players worldwide. You will engage in and improve the whole lifecycle of services, from design and analysis to deployment and optimization. You will work on DevOps and build CI/CD pipelines. The Site Reliability Engineer is responsible for monitoring and dashboarding for game observability and for ensuring the game is reliable, scalable, and secure. The team is expanding right now, with many challenges and opportunities for you to excel and reach your full potential.
The role has two tracks, each with its own focus:
Live Ops
Big Data Engineering
Live Ops Track Responsibilities
Support services between development and operations by applying a software engineering mindset to system administration topics to measure and monitor availability, latency, and overall system health.
Allocate time to operations/on-call duties and to developing systems and software that help increase site reliability and performance.
Build and operate cloud infrastructure environments and platforms.
Provide occasional technical coaching and mentorship sessions for junior engineers in the lab.
Big Data Engineering Track Responsibilities
Focus on the development and integration of the Data Platform.
Track, troubleshoot, and optimize solutions to challenging problems at the database level.
Plan, develop, and implement the supporting systems for database management.
Research and apply new technologies in the field of database storage.
Build components that glue the pipeline together and integrate with third-party and open-source systems such as Spark and Hadoop.
Plan, develop, and maintain massively scaled systems that handle billions of messages each day.
BS or advanced degree in Computer Science, Software Engineering, or an equivalent field
3+ years’ experience with at least one major cloud provider
3+ years’ experience with Linux or similar operating systems
3+ years’ experience with at least one of the following languages: C, C++, Java, Python, or Go
Experience with containers and orchestration frameworks (Docker, Mesos)
Experience with source code management and version control (Git/GitHub/GitLab)
Experience with modern CI/CD tools and techniques
Solid understanding of networking concepts, technologies, and protocols (TCP/IP, IPSec, HTTP, FTP, DHCP, and DNS)
In addition to the above, for the Big Data Engineering track specifically:
Proficient in database configuration, backup, optimization, monitoring, and fault diagnosis for MySQL, Redis, ES, HDFS, and Kafka
Bilingual in Mandarin and English
Expertise in designing, analyzing, and troubleshooting large-scale distributed systems.
Problem-solving driven. Strong ability to debug, optimize code, and automate routine tasks.
Experience operating worldwide-scale games or production systems, along with monitoring and telemetry practices
The successful candidate will have demonstrable experience working in a multicultural environment.