The Site Reliability Engineer (SRE) takes a multifaceted approach to ensure technical excellence and operational efficiency. The SRE plays a crucial role in driving the adoption of DevOps practices, helping clients transition from traditional approaches to more customer-focused and agile methodologies. By promoting shared responsibility and continuous improvement, the SRE fosters a culture of collaboration and technical excellence, enabling clients to achieve their business objectives.

Job responsibilities

  • Identify the KPIs that connect the business and technical views into a holistic objective.
  • Identify and lead the implementation of practices and processes that will optimize quality, improve delivery efficiency and reduce technical debt for the client account.
  • Understand the Service Level Objectives from the business perspective by discussing them with the business team and PO, and use that understanding to guide the SLIs and SLAs for the SoW.
  • Address different audiences with a tailored approach to achieve clear communication and to prompt the desired action when necessary. This includes skills such as active listening and presenting.
  • Learn, practice and experiment with tools, techniques and frameworks that foster collaborative working environments.
  • Advocate for excellence at leadership levels. Be involved in the overall direction of the product, providing strong support to other roles (Dev, PM, BA, QA) on the team.
  • Apply analytical and systems thinking to drive towards stability. Ensure adherence to best practices across teams on an account, and maintain in-depth knowledge of all areas of the project.
  • Stay up to date on the latest trends and technology alternatives. Back technical choices and decisions with reasoning, and present SRE accomplishments in project showcases.
  • Define and evolve the guidelines for operational support practices, based on an understanding of the client operational environment. 
  • Plan infrastructure activities that align with functional and business requirements, and share a roadmap with the client. Choose effective solutions that balance features, quality and effort/cost.

Job qualifications

Technical Skills

  • 8+ years of relevant experience.
  • Knowledge of infrastructure architecture concepts such as clusters, high availability and disaster recovery, and of deployment patterns such as blue/green and canary.
  • Excellent understanding of Shell scripting/Python/Ruby/Groovy. 
  • In-depth knowledge of and experience with DevOps practices and the tools involved in CI/CD and orchestration are a must.
  • Able to analyze and design monitoring, logging, metrics, alerting and observability services.
  • Hands-on experience and proficiency in one or more cloud service platforms such as AWS, GCP or Azure.
  • Understands the nuances of cloud migrations and private/public cloud choices.
  • Strong knowledge of container orchestration tools, PaaS products and serverless products (hands-on experience with one is a must; experience across several platforms is a bonus).
  • Experience working with ITSM-related tools such as ServiceNow, JIRA and PagerDuty.
  • Experienced with troubleshooting, performance and hardening strategies. Aware of identity management and security across systems.

Professional Skills

  • Experience securing infrastructure and provisioning security services, including appropriately securing data at rest and in transit. Understanding of load balancing, network security, and standard networking protocols and configurations.
  • The ability to analyze, provision, configure, secure, troubleshoot, optimize and maintain networking. Bonus points if you have experience with unit testing and automated testing tools.
  • Demonstrated experience in designing and implementing systems with high availability and resiliency in a production environment
  • Proven experience in conducting blameless postmortems for large-scale incidents and providing in-depth analysis in areas such as incident trends, mean time to recover, and any other area where improvements need to be identified and remediated
  • Ability to use data from Observability and monitoring tools to dissect and identify root causes of system and infrastructure issues
  • Strong communication, negotiation and collaboration skills, with the ability to convey complex, emergent problems in a way that allows the root cause of the problem to be solved. Experience mentoring and guiding team members.
  • Significant stakeholder management responsibilities with key contacts across different layers. Exposure to managing production outages and remediations (i.e. RCA).
  • Lead and mentor your teammates in upskilling and refining their technical skills as needed for them to grow into their next roles.
  • Act as a thought leader—at client sites and at Thoughtworks—on DevOps, cloud, and infrastructure engineering. Establish trusting and thoughtful partnerships with a client’s CIO, CTO, and relevant teams.
  • Adapt to current constraints and business policies and suggest innovative solutions to them. Develop your career outside the confines of a traditional career path by focusing on what you're passionate about rather than a predetermined one-size-fits-all plan.

Other things to know

Learning & Development

There is no one-size-fits-all career path at Thoughtworks: however you want to develop your career is entirely up to you. But we also balance autonomy with the strength of our cultivation culture. This means your career is supported by interactive tools, numerous development programs and teammates who want to help you grow. We see value in helping each other be our best and that extends to empowering our employees in their career journeys.

About Thoughtworks

Thoughtworks is a global technology consultancy that integrates strategy, design and engineering to drive digital innovation. For 30+ years, our clients have trusted our autonomous teams to build solutions that look past the obvious. Here, computer science grads come together with seasoned technologists, self-taught developers, midlife career changers and more to learn from and challenge each other. Career journeys flourish with the strength of our cultivation culture, which has won numerous awards around the world.

Join Thoughtworks and thrive. Together, our extra curiosity, innovation, passion and dedication overcomes ordinary.

