At Scalable, we build technology to accelerate ecommerce growth around the world. We are looking for people who are fast-moving and entrepreneurial, and who want to make an impact in the way ecommerce works.
Scalable Press is hiring a Site Reliability Engineer to join our production team, which is committed to operational quality and scalability. If you like to be challenged and have a passion for solving complex operational problems at scale with automation, testing, and tuning, then we would love to hear from you. The ideal candidate is someone who exemplifies the ethos of “if you have to do something more than once, automate it,” and “do it right the first time.”
- Support availability and stability initiatives in our test, stage, and production environments
- Help design and implement a highly available infrastructure to meet the needs of our growing and evolving product
- Help measure and improve reliability and performance across the application stack
- Drive continuous improvement by reducing the amount of manual operational work
- Coordinate with engineering to drive new technology to support our growth and applications
- 6+ years supporting production operations in a Linux/AWS environment with a modern application framework and a highly transactional database (Mongo, Postgres, MySQL)
- 4+ years of experience automating systems and infrastructure via Ansible, Puppet, Chef, and/or Terraform
- Understanding of basic networking (TCP/IP, DNS)
- Solid scripting abilities to support systems automation (Bash, Python, and Ansible)
- Production experience with a variety of monitoring and application performance management tools, and the ability to dig in for root cause analysis
- Experience with cloud services, AWS preferred
- English fluency
- BS in Computer Science or related discipline
- Have the ability to effectively communicate decisions, ideas, designs, and operation of systems and services in a clear and concise manner
- Have curiosity about how things work and love to share that knowledge with others
- Have a passion for helping others and making their lives better by simplifying complex systems so they are understandable and operable
- Team player - humble, hungry, and smart
- Project and issue management - able to break down complex projects into bite-size chunks