Director SRE, Head of Site Reliability Engineering

The Director of SRE is dedicated to reviewing the design and architecture of critical services, re-architecting them where needed, setting performance, reliability, and availability benchmarks, and tuning systems. The Director acts as an owner who works with specific system teams on fundamental design improvements, tracks incidents until they are closed architecturally, and works with domain teams to close them functionally.

Key Responsibilities

 Serve as a hands-on leader and a cloud solution architect on Coupang’s Cloud Platform team.

  • Be 100% accountable for the quality of the solution architecture and design of systems and platforms against availability, scalability, and performance benchmarks.
  • Solve complex technical problems in our Cloud environment with expert judgement and a strong ability to dive deep into current systems design and architecture.
  • Analyze complex distributed production deployments and recommend ways to optimize performance and/or automate processes by managing continuous integration servers and by using and enhancing monitoring and testing tools.
  • Identify opportunities to make disruptive improvements in cloud services with a high degree of systematic automation.
  • Possess expert knowledge of performance (millisecond latencies), scalability, availability (99.99% uptime), and enterprise architecture best practices.
  • Dive deep to come up with new solutions that help solve larger systems problems by simplifying system architecture, and help increase the productivity and efficiency of every engineer and system in the company.
  • Build, and make decisions on, accurate SRE tools and metrics to make the SRE team highly productive and impactful.
  • Exert technical influence over multiple teams, increasing their productivity and effectiveness by sharing your deep knowledge and experience.
  • Strong problem-solving skills, analytical capabilities, and attention to detail.
  • Strong cultural change management experience.

Qualifications

  • Bachelor's degree and/or Master’s degree in Computer Science or equivalent.
  • 15+ years of software engineering or Site Reliability Engineering experience.
  • 5+ years of experience leading system design and architecture leveraging Cloud services.
  • 5+ years of experience in building high-performance, highly available and scalable distributed systems in the cloud.
  • 3+ years of experience mentoring engineers to success.
  • Excellent cross-group collaboration, outstanding verbal and written communication.

Preferred

  • AWS Certification, Developing on AWS, and/or AWS Architect
  • 3+ years of hands-on experience as a Principal-level Software Engineer / Site Reliability Engineer
  • Excellent cross-group collaboration and communication skills
  • Demonstrated virtual team leadership capabilities

Coupang is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex or gender (including pregnancy, gender identity, gender expression, sexual orientation, transgender status), national origin, age, disability, medical condition, HIV/AIDS or Hepatitis C status, marital status, military or veteran status, use of a trained dog guide or service animal, political activities, affiliations, citizenship, or any other characteristic or class protected by the laws or regulations in the locations where we operate. If you need assistance and/or a reasonable accommodation in the application or recruiting process due to a disability, please contact us at usrecruiting@coupang.com.

 
