Senior Site Reliability Engineer - Cyber Security Business Unit
Location: Ideally Austin, TX; Denver, CO; or Redwood City, CA
We will, however, also look at 100% remote talent based elsewhere in the USA
Who We Are:
Headquartered in Austin, Texas, and formerly known as JASK Labs Inc (https://bit.ly/2GxBwKb), the Sumo Logic Cyber Security Business Unit delivers market-leading threat detection and prioritization data for enterprise security analysts in a market-disruptive cloud-native delivery model. We are in the business of modernizing enterprise security operations for the challenges of the 21st century.
Today, security analysts are fighting a losing battle. More than half of alerts go uninvestigated, leaving attackers free to infiltrate organizations even with existing defenses. The Sumo Logic Cloud SIEM provides security analysts with the enhanced visibility and context needed to shorten the time to identify evidence of exploits, reduce the time to remediate, and help security teams understand the impact of an attack more quickly and thoroughly.
At Sumo Logic, our SRE Team is at the core of driving our product decisions that directly impact customers around the world.
To support our business needs, we are building one of the largest big data infrastructures for cybersecurity in the cloud. In addition to relying on big data compute engines, we are also building an ecosystem of tools and services that allow all Sumo Logic teams to leverage the platform as a cohesive service.
Within the Sumo Logic platform, we have a team whose sole focus is to build tools and services to improve the reliability of the platform. As a member of the team, you will help drive operational excellence for this ecosystem of complex large-scale systems by re-imagining how we would automate and build tools to lower operational barriers, improve visibility on problematic areas, support scale of the platform, and provide a safe and secure environment for other teams.
We are seeking a Senior SRE who is an expert at building automation, performance tuning, and metrics. You will also have solid coding skills in Python or Go, along with experience working with large-scale data pipelines (Kafka/Kubernetes deployment).
What You Will Be Doing:
- Develop effective tooling, alerts, and response to both identify and address reliability risks.
- Build tools and automation to reduce operational tasks, improve automatic issue identification and routing, and predict platform performance in accordance with SLAs based on overall platform health and progress.
- Participate in an on-call rotation to manage incidents and to handle unknown/new issues.
- Drive issue resolution and root cause identification with the various data infrastructure teams.
- Evangelize best practices around collaboration, security, and reliability to all partner teams.
Who You Are:
- 7+ years of experience in a Site Reliability Engineer role.
- Must have a deep understanding of Kafka technologies.
- Expertise with AWS Architecture
- Containerization technology expertise is essential (ideally Kubernetes)
- Scripting fluency in at least Python and Go, and a commitment to automating solutions
- Experience with Linux systems
- Deep experience with configuration management solutions
- Ability to quickly take on new roles and responsibilities
- Track record of successfully deploying and running production systems
- A desire to see features through from development to deployment in production environments
- BS in Computer Science or equivalent experience (Master's degree in CS a huge plus)
About Us: https://app.box.com/v/SLGeneralDossier
· Massive Scale:
Our microservices architecture in AWS ingests hundreds of terabytes daily across many geographic regions. Millions of queries a day analyze hundreds of petabytes of data.
We democratize machine data analytics through the Sumo Logic platform, bringing real-time data insights securely through the cloud.
· Funding and Growth:
We have raised $345 million in funding to date, with the most recent round in May 2019. Investors include Battery Ventures, Greylock Partners, Sutter Hill Ventures, Accel Partners, Sequoia Capital, Sapphire Ventures and DFJ Growth. Our recurring revenue and customer base are growing steadily. We serve over 2,000 customers across the globe including AirBnB, Alaska Airlines, Anheuser Busch, Hootsuite, Hearst, Hudl, Major League Baseball, Marriott, Medidata, Sauce Labs, Samsung SmartThings, SPS Commerce, Twitter, Telstra, Toyota, Zuora and more.