Blockchain is the world's leading software platform for digital assets. Offering the largest production blockchain platform in the world, we share the passion to code, create, and ultimately build an open, accessible and fair financial future, one piece of software at a time.
We are looking for a Senior Site Reliability Engineer to join our engineering team as we tackle some of the most interesting problems in the crypto space, such as how to securely scale a distributed financial platform that touches millions of people a day.
Site Reliability Engineering (SRE), together with DevOps, is an engineering discipline that combines software and systems engineering to build and run large-scale, distributed and fault-tolerant systems. SRE ensures that Blockchain’s services are reliable and available enough to meet the needs of our users and the business, and delivers a fast rate of improvement while keeping an ever-watchful eye on capacity and performance.
At Blockchain, SRE is also a mindset and a set of engineering approaches to running better production systems: we build our own creative engineering solutions to operations problems. SREs are responsible for the big picture of how our systems are designed for operability and how they relate to each other, and we use a breadth of tools and approaches to solve a broad spectrum of problems. Practices such as limiting time spent on operational work, blameless postmortems and proactive identification of potential outages factor into iterative improvement and are key to both product quality and interesting, dynamic day-to-day work.
The SRE/DevOps environment at Blockchain is a work in progress - we are looking for an experienced, senior SRE to provide engineering leadership across the SRE team and the broader engineering team. Are you ready for a challenge?
WHAT YOU WILL DO
- You will play a critical role in evolving our infrastructure as we develop solutions to complex technical problems involving reliability, latency, bandwidth and, most importantly, security.
- You will focus heavily on writing tooling to replace manual, repetitive work in a scalable way.
- You will work in a fast-paced, dynamic environment, complementing our existing high-calibre team.
WHAT YOU WILL NEED
- Experience with containerization and service orchestration, including best practices and security. Experience with HashiCorp Nomad, Consul and Vault is a plus.
- Strong automation skills in at least one programming language, preferably Python or Golang.
- Linux experience, including an understanding of resource allocation, networking and/or internals.
- Solid background with configuration management tools.
- Experience using GitOps and CI to make changes.
- Experience with infrastructure-as-code tools. Experience with complex Terraform deployments is a plus.
- Experience with messaging systems such as Kafka.
- Experience with database management.
- Experience working in data centers is a plus.
- Knowledge of routing and switching protocols is a plus.
COMPENSATION & PERKS
- Unlimited vacation policy.
- Apple equipment.
- Full-time salary based on experience and meaningful equity in an industry-leading company.
- US Benefits: Medical, Dental, Vision, 401K, Flexible Spending Account, Commuter, Life, Short Term & Long Term Disability.
APPLICATION
- LinkedIn profile.
- Link to GitHub, Stack Overflow, personal website and/or blog (if applicable).
- Favorite GIF.