Qualtrics is a single system of record for all experience data, also called X-data™, allowing organizations to manage the four core experiences of business—customer, product, employee and brand experiences—on one platform. Over 8,500 enterprises worldwide, including more than 75 percent of the Fortune 100, and 99 of the top 100 U.S. business schools, rely on Qualtrics. To learn more and for a free account, please visit www.qualtrics.com.
The Core Platform team is responsible for building mission-critical systems and services that are used by all of Qualtrics’ product line teams and accelerate their efforts toward providing customer value. Examples range from ownership of common libraries, to our A/B testing service, to our asynchronous job ecosystem, which includes scheduling, queueing, progress tracking, workers, and notifications. Our ambition for 2018 is to provide a unified messaging platform based on Kafka that will make it easy for teams to take advantage of async pub/sub architectures. The ideal candidate will have experience running a Kafka cluster and related dependencies such as Zookeeper.
As a Support Engineer on the Core Platform team, you will have a significant impact on our operational success as we strive to be a “gold standard” for Qualtrics Engineering teams. This includes wearing a variety of hats: deploying code, creating and enhancing deployment pipelines, applying security updates, adding and tuning alerts, and helping improve our metrics collection and reporting to ensure the team’s success going forward. There’s plenty of opportunity for creativity: we are constantly looking for ways to improve, and the right candidate will be encouraged to play to their strengths, whether through improvements driven by load and/or performance testing, moving to better engineering practices, coding prototypes or bug fixes, or any other contribution that helps the team reach our goals with high quality.
- Build systems to measure reliability of services and proactively discover trends needing attention
- Fine-tune services to reduce latency, conduct operational readiness reviews, and automate continuous delivery of software changes
- Maintain service level agreements and build systems to support them
- Manage the health of distributed specialized server fleets and the software running on them
- Execute regular maintenance activities for services including outage handling, security enhancements, and root cause fixes
- Assist the team in our goal to release a unified messaging platform
- Enhance team runbooks and wikis to make everyone better
- Bachelor's degree in CS preferred, or in a hard science or Information Systems
- 2+ years of software development or operations experience
- A high degree of organization and attention to detail
- Excellent leadership, verbal, and written communication skills
- Demonstrated skill and passion for operational excellence
- Experience running Kafka and Zookeeper clusters
- Experience with AWS technologies, Docker, Jenkins
- Experience with shell scripts and/or other scripting languages
- Experience with Unix/Linux platforms
- Proficiency in troubleshooting issues and identifying their root cause
- Experience running and maintaining highly available distributed systems
- Capability to retain composure and communicate effectively during operational incidents
- Proven ability to understand large systems, drilling down to code level