WHAT IS BOX
Box is the market leader for Cloud Content Management. Our mission is to power how the world works together. Box is partnering with enterprise organizations to accelerate their digital transformation by creating a single platform for secure content management, collaboration and workflow. We have an amazing opportunity to further establish ourselves as leaders in the space, and we need strong advocates to help us achieve that goal.
By joining Box, you will have the unique opportunity to help capture a majority of this developing market and define what content management looks like for the digital enterprise. Today, Box powers over 97,000 businesses, including 70% of the Fortune 500, who trust Box to manage their content in the cloud.
WHY BOX NEEDS YOU
Box is looking for a dynamic Technical Duty Officer to help lead our Network Operations Center and support an industry-leading platform. It is the responsibility of the NOC team to monitor, troubleshoot, and resolve issues that affect the availability and quality of the Box platform. The NOC team is the front line of defense in making sure customers like GE, Pandora, Apple, and Gap have a seamless experience when accessing their content on Box.
This is an integral job function within the NOC that ensures overall production site health and the performance of core customer-facing journeys. This role will help maintain total site awareness by detecting metric and service deviations, monitoring changes, and proactively identifying and resolving potential issues before they escalate to customer-impacting levels.
We are building a world-class NOC and need the best talent possible to get us there. That's where you come in!
WHAT YOU'LL DO
- Own live-site Incident Management
- Spring into action during customer-impacting events and lead a team to quickly solve the problem
- Operate across interpersonal boundaries to protect our customers, their data, and the availability of all Box services
- Troubleshoot critical problems through applications, systems, clouds, and networks
- Provide technical leadership and key insights to improve Box's Reliability Engineering capabilities
- Build tools and processes to improve manageability, observability, resiliency and time to restore service for critical incidents
WHO YOU ARE
- You take initiative when you see a problem; you are a life-long learner who seeks out knowledge
- You are confident and comfortable communicating from the individual-contributor level up through C-level staff
- You have a rock-solid command presence and are calm and collected in stressful situations, such as a major service outage
- You're driven to learn new skills and technologies
- You have 5+ years of large-scale production operations or development experience and enjoy talking reliability engineering
- Bachelor's degree in Computer Science or Information Systems or equivalent technical field, or similar work experience in a large-scale 24/7 production environment supporting critical, real-time applications
- Solid grasp of Red Hat Linux, Unix, and Perl and shell scripting
- Experience working in virtualized environments and cloud implementations
- Solid understanding of the TCP/IP suite, DNS, and routing protocols such as BGP and OSPF
- Kubernetes/Public Cloud experience
- Outstanding interpersonal and communication skills
- Incident management experience in a large-scale, high-uptime environment is a plus
- Flexibility to work in a shift model
Visit this webpage to check out all of our exciting benefits: https://join.collectivehealth.com/box
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
For details on how we protect your information when you apply, please see our Personnel Privacy Notice.