Lambda's mission is to accelerate human progress with computation. Our Deep Learning workstations, servers, and cloud services power ML engineers at the forefront of AI research, fueling advancements in quantum computing, cancer detection, autonomous aircraft, drug discovery, self-driving cars, and much more.

Lambda provides Artificial Intelligence and Machine Learning infrastructure to organizations like Apple, Intel, Microsoft, Amazon Research, Tencent, Kaiser Permanente, MIT, Harvard, Stanford, Caltech, and the Department of Defense.

Join us and work at a profitable startup where we’re building powerful research computers and software for Machine Learning and Artificial Intelligence experts around the world. 

About the Role

You’ll build software that will be used by some of the world’s top AI research labs. You’ll write the software tools that help Fortune 500 companies and top research universities train state-of-the-art neural networks. You’ll make it possible to scale from a single server up to an entire data center with minimal setup and maintenance. The software you’ll write will enable some of the world’s top scientists and technologists to make world-changing advances in Artificial Intelligence.


What You’ll Do

  • Learn about what it takes to build and run HPC clusters for Deep Learning
  • Build operating software for managing GPU hardware infrastructure for Machine Learning
  • Automate the process of creating, provisioning, and expanding HPC clusters for use in machine learning applications
  • Build telemetry systems for clusters to enhance visibility, utilization, and performance
  • Create monitoring and alerting dashboards to improve cluster uptime and reliability
  • Create workflow tools to help scientists manage their experiments


Experience That’s Great to Have

  • Extensive experience developing web-based graphical interfaces
  • Extensive experience with Linux
  • Strong Python and Bash scripting skills
  • Experience developing in a systems programming language (C, C++, Go, Rust, similar)
  • Experience building informational dashboards (Grafana or similar is ideal)


Nice to Have 

  • Built microservices and related APIs
  • Systems-level design and development of OS-level tools
  • Experience programming GPUs
  • Experience working with High Performance Computing clusters
  • Understanding of and experience with interconnects, both on-board (PCIe) and network-level (Ethernet, TCP/IP)
  • Previously worked with hardware management systems like IPMI


About Lambda

  • We offer generous cash & equity compensation.
  • Investors include Gradient Ventures, Google’s AI-focused venture fund.
  • We are experiencing extremely high demand for our systems, with quarter-over-quarter and year-over-year profitability.
  • Our research papers have been accepted into top machine learning and graphics conferences, including NeurIPS, ICCV, SIGGRAPH, and TOG.
  • We have a wildly talented team of 30 and are growing fast.
  • Depending on the role, our workforce is remote across the U.S., with headquarters in San Francisco.
  • Health, dental, and vision coverage for you and your dependents.
  • Commuter and work-from-home stipends.
  • 401(k) plan.
  • 3 weeks of annual paid time off.


Equal Opportunity Employer

Lambda Labs is an Equal Opportunity Employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.
