HashiCorp Boundary aims to give customers seamless, just-in-time remote access to their infrastructure and web applications without having to worry about passwords, certificates, or other credentials. Boundary is offered as a cloud platform, and this role will be part of the Boundary Enterprise Enablement team, whose primary focus is scale and reliability to enable hypergrowth among medium and large enterprises.
What you’ll do (responsibilities)
As an engineer on the Boundary Product Reliability team, you will:
Develop a deep understanding of how customers use Boundary Cloud and enhance their experience through reliability
Drive service reliability by developing tooling that enables metric visibility using SLIs, SLOs, and SLAs
Champion incident management processes that directly impact customer experience
Reduce the operational overhead of the HashiCorp Boundary product and leverage data to understand the largest sources of reliability risk
Deploy, manage, and monitor Boundary Cloud at large scale
Predict our future failures and work proactively to mitigate them
Have a passion for developer productivity that makes other engineers' lives better
Empower engineers to troubleshoot their own issues by developing tools, frameworks, and guardrails for safety
Partner with the broader HashiCorp organization to learn from incidents through a blameless postmortem process
Collaborate across teams to improve our tools based on experience gained from running our own software in production
Participate in a 24/7 on-call rotation that supports our production services
What you’ll need (basic qualifications)
5+ years of experience running production applications at scale: backend applications written in Golang, databases, observability, and AWS primitives
A drive for quality through maintainable code and comprehensive testing from development to deployment
Clear communication skills while remaining empathetic and kind
An eagerness to learn through humility and reflection
Experience debugging performance bottlenecks for live services and database systems
Experience leading or participating in incident response using incident management tools such as incident.io or PagerDuty
What's nice to have (preferred qualifications)
Working knowledge of industry best practices related to information security
Working knowledge of AWS Aurora or PostgreSQL, Nomad or other orchestration platforms, and Traefik or other load-balancing technologies
Experience conceiving, documenting, and advocating for best practices, or a willingness to do so
Individual pay within the range will be determined based on job-related factors such as skills, experience, and education or training.
The base pay range for this role in the SF Bay Area / NYC area is:
$151,300–$178,000 USD
The base pay range for this role in California (excluding SF Bay Area), New York (excluding NYC), Seattle Metro, Denver / Boulder Metro, Washington D.C., or Maryland is:
$138,600–$163,100 USD
The base pay range for this role in Colorado (excluding Denver / Boulder Metro), Illinois, Minnesota, or Washington (excluding Seattle Metro) is: