What does a Site Reliability Engineer do at Handshake?
Handshake is hiring our first Site Reliability Engineer! We're looking for someone who's passionate about the reliability, performance, and productivity of web apps. Whether it's ensuring our numerous deploys per day go smoothly or making fundamental infrastructure improvements, Site Reliability Engineering is core to the engineering process at Handshake. As an early member of the team, you'll be making critical decisions that impact the productivity, reliability, and scalability of our entire platform.
Here are some projects we're excited for you to work on at Handshake:
- Taking a fresh look at our Heroku-based infrastructure and identifying the best path forward for our quickly growing user base and infrastructure needs.
- Making use of modern infrastructure technologies (such as containers, infrastructure as code, and serverless) without dismissing time-tested, proven technologies.
- Operating our automated continuous integration and continuous deployment infrastructure.
- Maintaining and improving our Elasticsearch, Memcached, Redis, and message queue systems.
- Implementing monitoring, alerting, automated resolution, and runbooks for incident response.
What we look for
- You have experience architecting, implementing, and running cloud infrastructure.
- You are passionate about and energized by helping people of all types and backgrounds launch meaningful careers.
- You are proud of your craft, and enjoy clean code that scales to be reliable, performant, and maintainable.
- You have a healthy appetite for automation, testing, and building robust distributed systems.
- You care deeply about developer productivity, and actively work to improve it across the team.
- You love writing and communicating your ideas.
- You have experience with build and release engineering cycles, as well as continuous integration environments.
Technologies and APIs you'll work with
- Heroku, AWS
- Ruby, Rails, Sidekiq, Puma
- PostgreSQL, Redis, Elasticsearch, Memcached
- Slack, Librato, New Relic, Bugsnag, and PagerDuty
- Buildkite, Docker
Benefits
- Stock: Sizable ownership in a fast-growing company.
- Family Focus: Parental leave (12 weeks maternity / 4 weeks paternity), and flexibility for families.
- Time Off: Flexible vacation policy to encourage people to get out and see the world.
- Healthcare: World-class medical, dental, and vision policies.
- Goodies: Whatever hardware and software you need to get the job done.
- Team Fun: Regularly scheduled events, sports, game nights, book clubs.
- Giving Back: Paid volunteer time.
- Learning: Sponsorship of meetup and conference attendance.
- Great team: Working with fun, hardworking, nice people who are committed to making a difference!