Petuum’s mission is to unlock human productivity and well-being by advancing the limits of AI technology, standards, and engineering to build trustworthy AI products. The Petuum team is looking for talented, motivated full-time ML Systems and AutoML (automated machine learning) Engineers who can consistently deliver high-quality work in a fast-paced environment. You will be responsible for helping build robust, effective, and well-packaged modern machine learning systems, as well as contributing to our CASL open-source projects.
Responsibilities:
· Collaborate with system architects, designers, and engineers to support the development of robust machine learning systems.
· Contribute high-quality code and lead efforts in building Petuum’s open-source CASL projects such as AdaptDL, AutoDist, and Tuun.
· Develop parallel programming techniques to simplify distributed ML programming.
· Learn and implement state-of-the-art deep AutoML algorithms to support tasks such as hyperparameter optimization, neural architecture search, data augmentation, feature engineering, and more.
· Assess and recommend technology choices and directions in consideration of cost-benefit trade-offs.
· Communicate your work to a broader audience through talks, tutorials, and blog posts.
Minimum Qualifications:
· Hands-on experience in one or more areas listed below:
o AutoML areas such as hyperparameter tuning, architecture search or manual design, data preparation, augmentation, or feature engineering
o Distributed systems
o Network communication or storage systems
· Hands-on experience with at least one popular deep learning framework such as PyTorch or TensorFlow.
· Strong engineering skills in Python and C++.
Preferred Qualifications:
· Master’s degree in Computer Science, Machine Learning, or a related field with 2+ years of industry/research experience, or a Ph.D. in Computer Science, Machine Learning, or another relevant field.
· Experience with model-based optimization methods (e.g., Bayesian optimization) or related software frameworks.
· Experience in deploying machine learning algorithms in resource-constrained environments such as mobile or embedded systems.
· Experience in developing with Docker, Kubernetes, Ray, NNI, etc.
· Experience in contributing to notable open-source ML software such as TensorFlow or PyTorch.
· Publication (or submission) of a paper at a machine learning or operating systems conference.