Petuum’s mission is to unlock human productivity and well-being by advancing the limits of AI technology standards and engineering to build trustworthy AI products. The Petuum team is looking for a talented, motivated full-time DevOps Engineer with a passion for cloud and DevOps technologies.

You are an experienced DevOps engineer with a minimum of 5 years of cloud and DevOps experience. You will manage hardware and computing resources (both on premises and in the cloud) for development and own the continuous integration, delivery, and deployment (CI/CD) process and its automation. The focus is to ensure automated infrastructure creation and maintenance for the development of Petuum’s AI products. If you are a self-starter who thrives on working with new technologies to drive innovation and discovery, you should join our team.

Responsibilities

  • Manage tools and engineering infrastructure (both hardware and software) to support our development environment
    • Manage existing hardware infrastructure (JumpCloud, VPN servers, Artifactory, repositories and container registries, etc.)
    • Build and manage in-house Kubernetes clusters with GPU node support
    • Manage Ubuntu Linux servers and performance monitoring
    • Manage subscriptions, accounts, and software licenses for existing tools, and evaluate, set up, and configure new tools
  • Manage and maintain AWS resources including VPC, security groups, EC2, S3 and EKS clusters using Terraform and Ansible
  • Create CLI tools in Python or Go that wrap a group of functions into a single command
  • Manage the in-house OpenStack cluster through the CLI or scripts
  • Create CI/CD pipelines for different types of GitLab repositories (such as Python, npm, and Helm chart) using GitLab Runner
  • Create Helm charts for deploying Kubernetes resources
  • Manage Kubernetes clusters across different distributions such as MicroK8s, EKS, and RKE2

Minimum Qualifications

  • Minimum of 5 years of work experience in DevOps
  • Strong Linux and network problem-solving skills
  • Expert in scripting (Bash, Python, or Go)
  • Proficient in Helm chart development
  • Strong understanding of AWS
  • Familiarity with Kubernetes cluster administration
  • Familiarity with Terraform and Ansible

Preferred Qualifications

  • Ability to create GitLab CI/CD jobs and set up GitLab Runner
  • Experience deploying distributed software systems with multiple repositories in a public cloud (AWS, Azure, GCP)
  • Familiarity with monitoring tools such as Prometheus, Grafana
  • Familiarity with ELK Stack: Elasticsearch, Logstash, Kibana

What We Offer for Your Valuable Work

Petuum offers Medical, Dental, Vision, Life/Disability, Paid Time Off, Parental Leave, and more.

Petuum is a welcoming workplace that considers applicants for employment without regard to, and does not discriminate on the basis of, gender, race, protected veteran status, disability, or any other legally protected status. Petuum is an at-will employer.
