What is Box?

Box is the market leader for Cloud Content Management. Our mission is to power how the world works together. Box is partnering with enterprise organizations to accelerate their digital transformation by creating a single platform for secure content management, collaboration and workflow. We have an amazing opportunity to further establish ourselves as leaders in the space, and we need strong advocates to help us achieve that goal.
 
By joining Box, you will have the unique opportunity to help capture a majority of this developing market and define what content management looks like for the digital enterprise. Today, Box powers over 100,000 businesses, including 70% of the Fortune 500 who trust Box to manage their content in the cloud.
 
Why Box Needs You? 
 
The main focus of the Observability Team is to build frameworks and systems that can manage the performance of Box systems while scaling to billions of events per second. Additionally, we are responsible for standardizing observability across engineering teams, driving designs for high-performing services, and fostering great observability practices. We build, scale, and operate low-latency, high-throughput data systems that underpin the resiliency of Box systems. You will help us execute on this vision and ensure that Box continues to ship scalable services that hold up against our customers' high-performance expectations.
 
The Observability Platforms team provides an end-to-end experience that uses frameworks, tools, APIs, and visualisations to help Box engineers better understand the behavior of the features, services, and infrastructure they own and maintain. The team also educates product, infrastructure, and systems teams on how to appropriately monitor the features and services they own, provides visualisations for monitoring distributed systems, gives guidance on reducing operational overhead, and supports the delivery of unmatched availability to our customers.
 
We need a Staff Site Reliability Engineer who has designed, implemented, and operated observability frameworks at very large scale and who is well versed in operating scaled architectures. You should have deep operational knowledge of distributed systems and of how to overcome their limitations through innovative design.
 
We are looking for big thinkers and innovators who have experience working with scalable distributed systems and have a passion for high performance and reliability. We are a small team with big ambitions that values impact and is not afraid of huge, gnarly problems. If this excites you, come join us!
 
What You'll Do? 
 
You will have the unique opportunity to build, improve, and support our Observability (o11y) platform. You will work with cutting-edge technologies that are defining the future of Box's cloud platforms, with visibility and impact across all of Engineering.

Reliability and a great customer experience are a constant focus at Box, which creates a continuing set of requirements for the observability of complex systems. We need your expertise with observability technologies like logging, metrics, search, analytics, and tracing to build impactful instrumentation and efficient data pipelines for application performance management at Box. We hope you can contribute best practices for monitoring at scale and use them to strengthen the observability strategy and culture at Box.
 
Who You Are? 
  • You are a developer at heart with a passion for solving hard problems using data-driven solutions
  • You have experience building data pipelines with a strong focus on availability, resilience, and durability
  • You act like an owner and strive to do work you're proud of, both technically and in your team interaction
  • 5+ years of experience in building scalable distributed systems
  • Experience with and knowledge of microservice-based architecture
  • Linux familiarity, and an automation mindset
  • Experience with Kubernetes
 
Skills that will be a big bonus:
  • Splunk experience
  • Knowledge of a stream-processing platform like Kafka or AWS Kinesis
  • Hands-on experience with modern cloud technologies like GCP, AWS or Docker
  • Knowledge of configuration management and deployment management tools like Puppet, Ansible or Terraform
  • Experience with Apache Storm, Apache Spark or Flink / Beam frameworks
  • Knowledge of SignalFx, Wavefront, or other SaaS time series database systems
  • Comfortable taking part in a 24/7 on-call rotation along with the US and Poland teams

Equal Opportunity 
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. For details on how we protect your information when you apply, please see our Personnel Privacy Notice. 
 
 
#LI-DW1
#LI-Remote
