Site Reliability Engineer, Metrics
Wayfair's metrics team is a lean, fast-paced group responsible for a large distributed metrics stack. We use Python, Puppet, Terraform, and increasingly Golang to maintain a vast array of infrastructure in the cloud and on premises. Our infrastructure is a powerful enabler of developer velocity, giving insight into how software and systems are performing at Wayfair. Our stack is a critical tool for our Operations Center, helping them maintain 24x7 oversight of the Wayfair e-commerce backend and critical supporting platforms.
What You'll Do
- Contribute to our provisioning and orchestration infrastructure using Terraform
- Write custom analytics scripts to help us assess our current and future capacity needs
- Automate, Automate, Automate!
- Extend and improve our homegrown automated test framework
- Write custom scripts to analyze our distributed data pipeline, giving us insight into how much data is coming from each data center, where it's coming from, and what kind of content it is
- Learn and use "Tremor Script", a custom internal scripting engine used by our distributed data pipeline
- Help us understand and contribute to several open source projects upon which our stack relies
What You'll Need
- Experience programming in one of the following languages: Python, Go, C++, C#, Java
- Proven understanding of Linux and the ability to navigate the shell and the wider Linux ecosystem effectively
- Proven quantitative ability to help us use data-driven techniques to scale our infrastructure and to assess current capacity
- Experience working with one or more cloud services and some proven understanding of provisioning compute resources in virtualized environments
- Experience developing or configuring monitoring and alerting tools
- Comfort working with open source components and the ability to analyze code and suggest or submit improvements
- Knowledge of Docker
- Knowledge of Kubernetes
- Advanced skills in Excel or another commonly used spreadsheet application
- Knowledge of Kafka
- Familiarity with common network communications protocols such as TCP and UDP
Wayfair is one of the world’s largest online destinations for the home. Whether you work in our global headquarters in Boston or Berlin, or in our warehouses or offices throughout the world, we’re reinventing the way people shop for their homes. Through our commitment to industry-leading technology and creative problem-solving, we are confident that Wayfair will be home to the most rewarding work of your career. If you’re looking for rapid growth, constant learning, and dynamic challenges, then you’ll find that amazing career opportunities are knocking.
No matter who you are, Wayfair is a place you can call home. We’re a community of innovators, risk-takers, and trailblazers who celebrate our differences, and know that our unique perspectives make us stronger, smarter, and well-positioned for success. We value and rely on the collective voices of our employees, customers, community, and suppliers to help guide us as we build a better Wayfair – and world – for all. Every voice, every perspective matters. That’s why we’re proud to be an equal opportunity employer. We do not discriminate on the basis of race, color, ethnicity, ancestry, religion, sex, national origin, sexual orientation, age, citizenship status, marital status, disability, gender identity, gender expression, veteran status, or genetic information.