Car shopping is complicated. At CarGurus, we use data and technology to make it simple, giving people the tools they need to confidently find, buy, finance, or sell a car. The best part? Our work’s made a real impact. We’re the most-visited car-shopping site in the US, and we’re expanding globally. Ready to come along for the ride?
What You'll Do:
Our Site Reliability Engineering team applies industry-standard methodologies and principles to ensure service reliability and resiliency. We accomplish this by:
- Collaborating with Engineering and Product Managers to define SLOs and monitor well-designed SLIs
- Embedding with Engineering teams to complete architectural improvements, either independently or in collaboration
- Being the primary point of escalation for major incidents involving assigned services
- Participating in an on-call rotation
- Owning our Incident Response Process, including conducting blameless Postmortems
- Improving robustness by automating workflows, improving processes and CI/CD pipelines, and integrating modern toolsets
- Refusing to accept manual work as a solution to areas of weakness
- Partnering with Engineering teams to ensure new services are production-ready
- Championing our organizational standards for architecting, deploying, and scaling our products
- Making data-driven decisions to drive continuous improvement
- Evolving our tooling, logging, monitoring and alerting systems to increase observability and transparency
Who You Are:
- Proven background in software engineering with multiple languages and a firm belief in continuous testing and delivery, or significant relevant operational experience running services at scale
- A bias for action, balanced with the emotional intelligence to approach colleagues with positive regard and an understanding of their challenges and decisions
- Curiosity and the acceptance that there are always ways to learn and grow
- The desire to be an active contributor in a collaborative and fast-paced environment
- Excitement about solving puzzles, such as discovering how a new service or tool works by identifying the individual components, libraries, and relationships it is built upon
- Understanding of technologies beyond coding such as Systems Engineering, Load Balancing, Configuration Management, Networking, Operating Systems, Troubleshooting, and Monitoring
- Comfort in dealing with Incidents and Availability Issues
- Familiarity with Cloud and Bare Metal infrastructure
- Exposure to industry-standard observability tools and services
Technologies We Use:
Terraform, Honeycomb, AWS, Prometheus, Java, Go, Ansible, Chef, Grafana, Docker, Kubernetes, Kafka, Elasticsearch, Sentry, Bazel, Concourse, Artifactory
At CarGurus, we invest in our people's professional growth with everything from learning and development programs to tuition reimbursement. Want to work on projects that expand your skill set without sacrificing your work/life balance? You got it. We also strive to provide perks and benefits that employees actually care about, like free lunch, commuter subsidies, and more. That includes equity in the company, our way of showing that we want you here for the long haul.
We work hard every day to build the world’s most trusted and transparent automotive marketplace, but trust and transparency don’t just apply to our consumers. They extend to our talent, too. We aim to create a workplace where everyone feels they can bring the ultimate expression of themselves and their potential—where you don’t just fit, you thrive. We don’t discriminate based on race, color, religion, national origin, age, sex, marital status, ancestry, physical or mental disability, veteran status, gender identity, or sexual orientation.
In addition to the US, CarGurus operates sites in Canada, Germany, Spain, Italy, and the UK, with other markets on the horizon. We have offices in Cambridge, MA; Detroit, MI; Dublin, Ireland; San Francisco, CA; and London, UK. Check out our careers page to learn more.