Job Overview:

Embedded within the TuSimple Service Infrastructure group, the Technical Lead Manager (TLM), Platform Engineering leads a team of specialized and adept engineers who design, build, and operate/maintain infrastructure services, tools, and libraries using cutting-edge technology. The TLM, Platform Engineering works to deliver operational Artificial Intelligence (AI) platforms at scale and speed. The incumbent helps manage and maintain critical TuSimple infrastructure and is oriented towards automation, eliminating risks, and building reliable, scalable, and performant platforms/systems.

The TLM, Platform Engineering provides daily guidance, coaching and direction to the Simulation and Regen infrastructure team members, ensuring their development, empowerment, and motivation. Others depend on the TLM, Platform Engineering to help accelerate the development cycle of machine learning products. As such, the incumbent uses their deep and broad technical experience to standardize deployments, ensure the auditability of infrastructure, automate various deployment processes, write documentation, and facilitate training events.

They employ a global mindset and excel at collaborating with global teams to resolve obstacles. In addition, they build strong relationships within the team and beyond by demonstrating appreciation and regard for others’ ideas and work product, and by helping the team consistently deliver results that meet or exceed expectations.

What You'll Do:

  • Leads the comprehensive process of designing, building, and operating TuSimple’s foundational software services for Regression Testing and Simulation platforms.
  • Provides leadership in the recruitment, training, and development of top-quality engineering talent, ensuring high levels of performance and productivity. Builds morale, motivates the team, instills productivity and teamwork, and creates and promotes a positive and supportive work environment. Creates a culture of continuous improvement for processes, systems, data, training, people, etc.
  • Provides highly scalable, reliable, and secure infrastructure to build distributed applications.
  • Develops highly available and fault-tolerant systems to achieve 99% service level objectives (SLOs) for business continuity, without downtime or packet loss even during software upgrades.
  • Skillfully implements new features and evolves existing infrastructure.
  • Imparts knowledge and drives adoption via user guides, Application Programming Interface (API) references, and workshops.
  • Designs the underlying infrastructure and technical architecture, including big data computing, orchestration scheduling, and cloudilization.
  • Develops both cloud platforms (AWS and/or alternatives) and on-premise solutions.
  • Improves the reliability of, and continuously develops, the platforms responsible for running Simulation and Regression testing services at TuSimple.
  • Develops deployment automation and configures a variety of high-performance computing (HPC) architectures and hardware configurations.
  • Researches performance across different hardware configurations utilizing HPC clusters, GPU acceleration, low-latency high-traffic networks and other high performance computing configurations on bare metal and cloud infrastructure.
  • Designs and implements tooling and automation for clustering, scaling, monitoring, and alerting.
  • Standardizes Kubernetes deployments and ensures that infrastructure is auditable.
  • Ensures infrastructure security compliance; implements security, permissions, and authentication. 
  • Assists with recruiting and training initiatives; helps select candidates with strong skills and great potential, mentors junior engineers, and grows the team’s technical capabilities and capacity.

What You'll Bring:

  • Demonstrated ability to lead, inspire, and motivate an engineering team to effectively and efficiently accomplish goals and collaborate. Ability to create a sense of “team” across various locations.
  • Advanced capability to design and implement reliable, scalable, and performant distributed systems and data pipelines.
  • Familiarity with the whole web stack, including protocols and web server optimization techniques.
  • Hands-on experience managing large numbers of diverse systems with configuration management or software delivery platforms. 
  • Experience with at least one orchestration system (e.g., OpenStack, Kubernetes, YARN).
  • Proficiency in infrastructure as code (IaC) tools like Terraform, Vagrant, Chef, Puppet, or Amazon Web Services (AWS) CloudFormation.
  • Demonstrated background and experience in networking; experience with network software and protocols, e.g., TCP/IP, iptables, routing protocols, etc.
  • Demonstrated programming experience with proficiency in Go (Golang), Java, or Python.
  • Ability to resolve ambiguity and collect feature requirements and feedback from users.
  • Experience with automated deployment and integration tooling.
  • Ability to actively collaborate with global teams and resolve obstacles by evaluating all possible solutions and using informed judgement to select the best path forward for the project.
  • Experience designing or maintaining machine learning platforms is considered an asset.
  • Knowledge of, or experience with, Agile or Scrum project management environments/methodologies is considered an asset.
  • Experience with supporting and maintaining networking infrastructure is considered an asset.
  • Experience with system administration on AWS is considered an asset.
  • Experience with large-scale backend systems and infrastructure is considered an asset.
  • Experience with high availability and fault-tolerant systems is considered an asset.
  • Previous experience in any of the following areas is considered an asset: handling infrastructure-level outages, writing blameless postmortems, and GPU/CPU scheduling.

Perks:

  • 100% employer-paid healthcare premiums for you and your family
  • Work visa sponsorship available
  • Relocation assistance available
  • Breakfast, lunch, and dinner served every day
  • Full kitchens on every floor with unlimited snacks, drinks, special treats, fruits, meals, and more
  • Stock options / equity
  • Gym membership reimbursement
  • Monthly team building budget
  • Learning/education budget  
  • Employer-paid life insurance
  • Employer-paid long-term and short-term disability insurance

TuSimple is an Equal Opportunity Employer. This company does not discriminate in employment and personnel practices on the basis of race, sex, age, handicap, religion, national origin, or any other basis prohibited by applicable law. Hiring, transferring and promotion practices are performed without regard to the above-listed items.

U.S. Equal Opportunity Employment Information (Completion is voluntary)

Individuals seeking employment at TuSimple are considered without regard to race, color, religion, national origin, age, sex, marital status, ancestry, physical or mental disability, veteran status, gender identity, or sexual orientation. You are being given the opportunity to provide the following information in order to help us comply with federal and state Equal Employment Opportunity/Affirmative Action record keeping, reporting, and other legal requirements.

Completion of the form is entirely voluntary. Whatever your decision, it will not be considered in the hiring process or thereafter. Any information that you do provide will be recorded and maintained in a confidential file.

If you believe you belong to any of the categories of protected veterans listed below, please indicate by making the appropriate selection. As a government contractor subject to Vietnam Era Veterans Readjustment Assistance Act (VEVRAA), we request this information in order to measure the effectiveness of the outreach and positive recruitment efforts we undertake pursuant to VEVRAA. Classification of protected categories is as follows:

A "disabled veteran" is one of the following: a veteran of the U.S. military, ground, naval or air service who is entitled to compensation (or who but for the receipt of military retired pay would be entitled to compensation) under laws administered by the Secretary of Veterans Affairs; or a person who was discharged or released from active duty because of a service-connected disability.

A "recently separated veteran" means any veteran during the three-year period beginning on the date of such veteran's discharge or release from active duty in the U.S. military, ground, naval, or air service.

An "active duty wartime or campaign badge veteran" means a veteran who served on active duty in the U.S. military, ground, naval or air service during a war, or in a campaign or expedition for which a campaign badge has been authorized under the laws administered by the Department of Defense.

An "Armed Forces service medal veteran" means a veteran who, while serving on active duty in the U.S. military, ground, naval or air service, participated in a United States military operation for which an Armed Forces service medal was awarded pursuant to Executive Order 12985.


Form CC-305

OMB Control Number 1250-0005

Expires 05/31/2023

Voluntary Self-Identification of Disability

Why are you being asked to complete this form?

We are a federal contractor or subcontractor required by law to provide equal employment opportunity to qualified people with disabilities. We are also required to measure our progress toward having at least 7% of our workforce be individuals with disabilities. To do this, we must ask applicants and employees if they have a disability or have ever had a disability. Because a person may become disabled at any time, we ask all of our employees to update their information at least every five years.

Identifying yourself as an individual with a disability is voluntary, and we hope that you will choose to do so. Your answer will be maintained confidentially and not be seen by selecting officials or anyone else involved in making personnel decisions. Completing the form will not negatively impact you in any way, regardless of whether you have self-identified in the past. For more information about this form or the equal employment obligations of federal contractors under Section 503 of the Rehabilitation Act, visit the U.S. Department of Labor’s Office of Federal Contract Compliance Programs (OFCCP) website at www.dol.gov/ofccp.

How do you know if you have a disability?

You are considered to have a disability if you have a physical or mental impairment or medical condition that substantially limits a major life activity, or if you have a history or record of such an impairment or medical condition.

Disabilities include, but are not limited to:

  • Autism
  • Autoimmune disorder, for example, lupus, fibromyalgia, rheumatoid arthritis, or HIV/AIDS
  • Blind or low vision
  • Cancer
  • Cardiovascular or heart disease
  • Celiac disease
  • Cerebral palsy
  • Deaf or hard of hearing
  • Depression or anxiety
  • Diabetes
  • Epilepsy
  • Gastrointestinal disorders, for example, Crohn's disease or irritable bowel syndrome
  • Intellectual disability
  • Missing limbs or partially missing limbs
  • Nervous system condition, for example, migraine headaches, Parkinson’s disease, or multiple sclerosis (MS)
  • Psychiatric condition, for example, bipolar disorder, schizophrenia, PTSD, or major depression

1 Section 503 of the Rehabilitation Act of 1973, as amended. For more information about this form or the equal employment obligations of Federal contractors, visit the U.S. Department of Labor's Office of Federal Contract Compliance Programs (OFCCP) website at www.dol.gov/ofccp.

PUBLIC BURDEN STATEMENT: According to the Paperwork Reduction Act of 1995 no persons are required to respond to a collection of information unless such collection displays a valid OMB control number. This survey should take about 5 minutes to complete.