About Graphcore
How often do you get the chance to build a technology that transforms the future of humanity? Graphcore products have set the standard in made-for-AI compute hardware and software, gaining global attention and industry acclaim. Now we are developing the next generation of artificial intelligence compute with systems that will allow AI researchers to develop more advanced models, help scientists unlock exciting new discoveries, and power companies around the world as they put AI at the heart of their business. Graphcore recently joined SoftBank Group – bringing large and ongoing investment from one of the world’s leading backers of innovative AI companies.
Job Summary
As a Senior Cloud Software Engineer, you will lead efforts to enable new AI accelerator hardware within Kubernetes environments. You will be responsible for the design, development, and maintenance of plugins in Go, ensuring seamless integration of a new AI accelerator with existing Kubernetes clusters and providing a native Kubernetes end-user experience. This role requires extensive experience in software development, container orchestration technologies, and cloud computing.
Responsibilities and Duties
- Lead the design and development of plugins in Go for the new AI accelerator integration in Kubernetes.
- Ensure seamless integration of the new hardware with existing Kubernetes clusters.
- Mentor and guide junior engineers, fostering a culture of continuous learning and improvement.
- Collaborate with cross-functional teams to design, implement, and test new features.
- Conduct thorough code reviews and provide constructive feedback to team members.
- Troubleshoot and resolve complex technical issues.
- Engage with the Kubernetes community as needed, contributing to discussions, forums, and open-source projects.
- Write and maintain comprehensive documentation for your code and the overall project.
- Stay up to date with the latest trends and technologies in Kubernetes and cloud computing.
Skills and Experience
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- At least 10 years of experience in software development, ideally with a focus on cloud environments.
- Proficiency in Go or Python programming.
- Extensive experience with Kubernetes; Certified Kubernetes Administrator (CKA) and Certified Kubernetes Security Specialist (CKS) certifications are preferred.
- Familiarity with machine learning-related technologies within the Kubernetes ecosystem (e.g. Kubeflow, KubeVirt, Kata Containers, Volcano) is highly desirable.
- Strong understanding of container orchestration and cloud-native development.
- Familiarity with other workload managers, such as Ray and SLURM, is considered an asset.
- Proven track record of achieving goals while implementing complex technical solutions.
- Knowledge of RDMA networks is considered an asset.
- Knowledge of cloud computing platforms such as Azure, GCP, and AWS, and their services.
- Experience with CI/CD pipelines and DevOps tools (e.g. GitHub, GitLab).
- Leadership and mentoring skills.
- Excellent problem-solving skills and attention to detail.
- Strong communication and collaboration skills.
- English: C1 level.