GitLab's DevOps platform empowers 100,000+ organizations to deliver software faster and more efficiently. We are one of the world’s largest all-remote companies with 1,400+ team members and values that guide a culture where people embrace the belief that everyone can contribute.

Site Reliability Engineers (SREs) are responsible for keeping all user-facing services and other GitLab production systems running smoothly. SREs are a blend of pragmatic operators and software craftspeople who apply sound engineering principles, operational discipline, and mature automation to our environments and the GitLab codebase. We specialize in systems, whether it be networking, the Linux kernel, or some more specific interest in scaling, algorithms, or distributed systems.

GitLab.com is a unique site and it brings unique challenges: it’s the biggest GitLab instance in existence. In fact, it’s one of the largest single-tenancy SaaS sites on the internet, developed and run completely transparently. GitLab.com runs using the same tools we provide to GitLab customers running self-managed installations. The experience of our team feeds back into other engineering groups within the company, as well as to self-managed customers.

SREs with a Scalability specialization focus primarily on the application side of GitLab running on GitLab.com, improving the architecture as GitLab.com continues to grow. This is the main difference between a Scalability SRE and other SREs at GitLab: your day-to-day will be spent inside the GitLab application, working directly with developers in the Scalability team to resolve any application bottleneck that affects pre-defined SLOs.

As an SRE for Scalability you will:

  • Be on a PagerDuty rotation to respond to GitLab.com availability incidents and provide support for service engineers with customer incidents.
  • Analyze existing GitLab.com Service Level Objectives, and create and maintain new ones.
  • Troubleshoot, evaluate, and resolve operational challenges that affect defined SLOs.
  • Identify and define architectural bottlenecks in the application as observed on GitLab.com, and engage in resolving them.
  • Work with other engineering stakeholders on resolving larger architectural bottlenecks, and participate by offering the GitLab.com point of view.
  • Work in close collaboration with software development teams to shape the future roadmap and establish strong operational readiness across teams.
  • Scale systems through automation, improving change velocity and reliability.
  • Leverage technical skills to partner with team members and be comfortable diving into a problem as needed.
  • Work with counterparts in other teams of the Infrastructure department to improve infrastructure running with Chef, Terraform and Kubernetes.
  • Make monitoring and alerting fire on symptoms, not on outages.
  • Document every action so your findings turn into repeatable actions, and then into automation.
  • Debug production issues across services and levels of the stack.

You may be a fit for this role if you:

  • Have strong programming skills as a (former) backend engineer, preferably with Ruby and/or Go.
  • Are able to reason about large systems: how they work at large scale, edge cases, failure modes, behaviors.
  • Know your way around Linux and the Unix Shell.
  • Have experience in collaborating and communicating asynchronously.
  • Have an urge to document all the things so you don't need to learn the same thing twice.
  • Have an enthusiastic, go-for-it attitude. When you see something broken, you can't help but fix it.
  • Have a strong sense for action and know how to iterate through a problem quickly.
  • Share our values, and work in accordance with those values.
  • Have experience with Nginx, HAProxy, Docker, Kubernetes, Terraform, or similar technologies.
  • Are able to leverage GitLab as your day-to-day go-to tool.

Senior Site Reliability Engineer Criteria

Technical:

  1. Deep knowledge in 2 areas of expertise and general knowledge of all areas of expertise. Capable of mentoring Junior SREs in all areas and other SREs in their area of deep knowledge.
  2. Contributes small improvements to the GitLab codebase to resolve issues.

Execution:

  1. Identifies significant projects that result in substantial cost savings or revenue.
  2. Identifies changes for the product architecture from the reliability, performance, and availability perspective with a data-driven approach.
  3. Proactively works on efficiency and capacity planning to set clear requirements and reduce system resource usage, making GitLab cheaper to run for all our customers.
  4. Identifies parts of the system that do not scale, provides immediate palliative measures, and drives long-term resolution of these issues.
  5. Identifies Service Level Indicators (SLIs) that will align the team to meet the availability and latency objectives.

Collaboration and Communication:

  1. Know a domain really well and radiate that knowledge.
  2. Perform and run blameless RCAs on incidents and outages, aggressively looking for answers that will prevent the incident from ever happening again.

Influence and Maturity:

  1. Lead Production SREs and Junior Production SREs by setting the example.
  2. Show ownership of a major part of the infrastructure.
  3. Trusted to de-escalate conflicts inside the team.

Compensation

To view the full job description and its compensation calculator, see our handbook. The compensation calculator can be found towards the bottom of the page.

Additional details about our process can be found on our hiring page.

For Colorado residents: The base salary range for this role’s listed level is currently $100,800 - $177,100 for Colorado residents only. Grade level and salary ranges are determined through interviews and a review of education, experience, knowledge, skills, abilities of the applicant, equity with other team members, and alignment with market data. See more information on our benefits and equity. Sales roles are also eligible for incentive pay targeted at up to 100% of the offered base salary. Disclosure as required by the Colorado Equal Pay for Equal Work Act, C.R.S. § 8-5-101 et seq.

 

Country Hiring Guidelines: GitLab hires new team members in countries around the world. All of our roles are remote; however, some roles may carry specific location-based eligibility requirements. Our Talent Acquisition team can help answer any questions about location after starting the recruiting process.

Privacy Policy: Please review our Recruitment Privacy Policy. Your privacy is important to us.

GitLab is proud to be an equal opportunity workplace and is an affirmative action employer. GitLab’s policies and practices relating to recruitment, employment, career development and advancement, promotion, and retirement are based solely on merit, regardless of race, color, religion, ancestry, sex (including pregnancy, lactation, sexual orientation, gender identity, or gender expression), national origin, age, citizenship, marital status, mental or physical disability, genetic information (including family medical history), discharge status from the military, protected veteran status (which includes disabled veterans, recently separated veterans, active duty wartime or campaign badge veterans, and Armed Forces service medal veterans), or any other basis protected by law. GitLab will not tolerate discrimination or harassment based on any of these characteristics. See also GitLab’s EEO Policy and EEO is the Law. If you have a disability or special need that requires accommodation, please let us know during the recruiting process.
