The mission of the Shopee Tech Ops MRE (Machine Reliability Engineering) team is to keep Shopee's network and hardware running efficiently and sustainably 24x7. The team builds and maintains massive hardware clusters for SRE, balancing capacity, cost and hardware performance, and provides sustainable hardware resources and stable network support services. MRE works with the data center team to design and optimize network architecture; tests and selects hardware to provide suitable configurations for business requirements; customizes a stable and efficient OS; improves traditional operations through engineering and service-based approaches; and builds a complete hardware monitoring system to improve the efficiency of fault handling.
- Responsible for the maintenance of OS and servers
- Responsible for system services such as NTP/SMTP/Ansible/SaltStack
- Responsible for the maintenance of the CDN
- Responsible for the maintenance of CI/CD pipelines
- Provide efficient and effective OS/Server solutions according to business needs
- Bachelor’s or higher degree in Computer Science, Engineering, Information Systems or related fields;
- Experienced in kernel tuning and customization;
- Proficient in Linux operating systems;
- Familiar with x86 hardware architecture;
- Skilled in using a variety of system management tools, experienced in performance benchmarking, and familiar with TCP/IP and basic networking concepts;
- Skilled in CI/CD pipeline tools such as Jenkins/Jenkins X/Tekton/GitLab CI/Harbor, etc.
The skills below are optional but preferred:
- Well-versed in cloud native software lifecycle management and automation;
- Experience with OS kernel fine-tuning and customization;
- Experience with Ansible/SaltStack;
- Experience with SMTP/POP3/IMAP/NTP;
- Experience with CMDB development