At Pachyderm, we're building an open-source, enterprise-grade data science platform that lets you deploy and manage multi-stage, language-agnostic data pipelines while maintaining complete reproducibility and provenance. Our system, which has open-source roots, changes data science workflows by providing reproducibility, data provenance, and the opportunity for true collaboration. Pachyderm uses modern technologies like Docker and Kubernetes to build an entirely new method of analyzing data. Offered both as an in-house solution and as a hosted service, Pachyderm brings together version control for data with the tools to build scalable end-to-end ML/AI pipelines while empowering users to use any language, framework, or tool they want. If you want to learn more about our grand vision, read what has become our "manifesto."
Pachyderm is a rapidly growing, early-stage company funded by top VCs: Benchmark, Decibel, M12, and Y Combinator. Like many modern companies, Pachyderm embraces a remote-first approach to growing our team. It gives us a huge advantage in hiring top, diverse talent across the country while giving our team members the flexibility to work from anywhere.
Love Go, Kubernetes, cloud deployment, and automation?
Pachyderm is hiring a DevOps expert to be a senior member of our team and help improve our infrastructure, deployment, and testing processes. Pachyderm has a rapidly growing engineering team, and we're long overdue for some major improvements to our internal infra and engineering methodologies. Your major projects will include:
- Develop our internal Go backend for the hosted platform.
- Manage and maintain internal Kubernetes clusters and hosted Pachyderm clusters.
- Optimize Pachyderm's CI to improve our development workflow and increase developer velocity.
- Develop Pachyderm's internal testing/benchmarking framework (probably in Go) to perform large-scale benchmarks on a regular cadence.
- Improve, test, script, and document the multitude of deployment options for Pachyderm's core product, including all cloud providers and various permutations of on-prem k8s and object stores.
- Build standard monitoring, logging, and deployment (e.g., Helm chart) packages so that Pachyderm users can get up and running faster.
- Work closely with our front-end, back-end, and systems teams to improve hosted cluster stability and uptime.
While your primary focus will be building and maintaining various internal systems, you'll also have the opportunity to contribute to the core product and work directly with users and customers who have complex deployment environments. At Pachyderm, OSS user and customer feedback is a major driver of our product roadmap, and we believe that everyone within the company should experience that first-hand.
Pachyderm is just a small team right now, so you'd be getting in right at the ground floor and would have an enormous impact on the success and direction of the company and product. You can of course check out the product on GitHub.
We offer significant equity, full benefits, and all the usual startup perks.
Qualifications:
- Some Golang or other programming experience is required. While much of the job is automation and scripting, our testing frameworks, product backend, and internal automation work (e.g., k8s operators/CRDs) are all written in Go.
- 4+ years of experience building, maintaining, and automating distributed systems, data infrastructure, back-end systems, or related infrastructure.
- Expertise running and managing Kubernetes and Docker in one or more cloud providers, preferably as part of a large-scale, enterprise-class product related to storage, processing, networking, and/or virtualization.
- Expertise running and managing build, test, and release processes for 10+ person engineering orgs.
- Must have strong communication skills when talking about technical concepts. Our interview process strongly tests for communication, as we have a very collaborative work environment where many parts of the codebase interact in complex ways.