Senior Site Reliability Engineer
As industries race to embrace AI, traditional database solutions fall short of rising demands for versatility, performance, and affordability. Couchbase is leading the way with Capella, the developer data platform for critical applications in our AI world. By uniting transactional, analytical, mobile, and AI workloads into a seamless, fully managed solution, Couchbase empowers developers and enterprises to build and scale applications with unmatched flexibility, performance, and cost-efficiency—from cloud to edge. Trusted by over 30% of the Fortune 100, Couchbase is unlocking innovation, accelerating AI transformation, and redefining customer experiences. Come join our mission.
Job Description
- Site Reliability Engineers are hybrid software and systems engineers. They are the glue holding things together, whether that’s infrastructure, security and software or teams and processes.
- You will join Couchbase to lead the Cloud Platform & Production Pipeline Initiatives and work with our engineering team to optimize, implement and maintain our organization’s cloud-based systems.
- You will work with many software engineers and teams to ensure our cloud platform meets the needs of our organization and customers.
- You will set the strategy and operational KPIs for the platform team and the applications supported by the cloud organization.
- You will have an immediate impact on the day-to-day efficiency of cloud operations and an ongoing impact on growth.
Responsibilities
- Manage, monitor and maintain the infrastructure for our cloud service, Capella, so that it runs reliably
- Manage cloud environments in accordance with company security guidelines
- Collaborate with the Engineering teams to understand deployment practices and processes and work towards iteratively improving the CI/CD and release pipelines to ensure a highly resilient deployment strategy, ideally with zero downtime
- Stay up-to-date with new technologies and industry trends, and continuously improve the platform to meet the changing needs of the company
- Collaborate with development teams and application owners on integrating security scanners into the DevOps lifecycle, review their effectiveness, and improve business cases to adapt to new threats
- Take ownership of many controls, processes, and risks required to maintain our compliance portfolio (SOC 2, PCI-DSS, GDPR, and HIPAA, among others)
- Provide guidance (code reviews, architecture advice, technical feedback), thought leadership, and mentorship to development teams to improve service reliability, security, cost, and performance
- Demonstrate exceptional problem-solving skills, with an ability to identify and solve issues before they affect business productivity
Mandatory Requirements
- 5+ years of experience in SRE/DevSecOps for a team operating on public cloud
- Proficiency with programming and scripting languages like Go, Python, Java, or Ruby
- Strong coding skills; candidates will be interviewed on basic data structures and algorithms (DSA) concepts
- High proficiency with Linux operating systems
- Experience in running, managing and maintaining Kubernetes clusters, both self-managed (vanilla/plain k8s) and managed (preferably AWS EKS)
- Knowledge and understanding of security topics such as vulnerability management, pen testing, SCA, DAST and SAST, and of security tools such as Sysdig, Snyk, Black Duck, etc.
- Proficiency with Terraform, configuration management tools and version control systems (Git), and with integrating them into CI/CD platforms and toolchains such as CircleCI, GitHub, Spinnaker, etc.
- Strong understanding of networking and security concepts, including TCP/IP, DNS, HTTP, firewalls, VPNs, etc.
- Deep working experience with cloud platforms and open source software like Artifactory, Jira, Jenkins, Grafana, Prometheus, Datadog, Thanos, etc.
Preferred skills and qualifications
- Knowledge of end-to-end availability (SLO/SLA), reliability, and performance concepts
- Experience with on-call rotations & incident management
- Proficiency with databases such as Couchbase is a plus
- Security certifications are appreciated
Benefits
- Generous Time Off Program - Flexibility to care for you and your family
- Wellness Benefits - A variety of world class medical plans to choose from, along with dental, vision, life insurance, and employee assistance programs*
- Financial Planning - RSU equity program*, ESPP program*, Retirement program* and Business Travel Insurance
- Career Growth - Be valued, Create value approach
- Fun Perks - An ergonomic and comfortable in-office / WFH setup. Food & Snacks for in-office employees.
- And much more!