What you’ll be doing as a Site Reliability Engineer

  • Designing, automating, maintaining and monitoring our development environment and supporting infrastructure, applications and continuous integration
  • Maintaining Configuration Management using Puppet
  • Designing and supporting robust build, deployment and configuration management systems for multi-tier Java applications
  • Working closely with the System Operations team to establish baseline architecture requirements and best practices for optimization, security, high availability and resource planning
  • Performing proactive application monitoring, and configuring and maintaining a healthy system using industry-standard monitoring tools
  • Using DevOps principles and tools to improve systems and processes
  • Assisting with troubleshooting at the infrastructure, container and application layers
  • Logging and monitoring critical events

Requirements:

  • Linux Administration and Troubleshooting
  • Python development experience
  • Experience with a configuration management tool such as Puppet (preferred), Ansible, Chef or Salt
  • Familiarity with or understanding of Java
  • Attitude to thrive in a fun, fast-paced, start-up-like environment
  • Ability to excel at problem solving, adapt easily to change, and contribute both as part of a team and individually
  • Passion for DevOps, systems monitoring, automation and uptime
