As a Network Reliability Engineer, you will work to ensure the safe, swift, and reliable delivery of services to our customers. This role combines software and systems engineering to deliver highly scalable, distributed, fault-tolerant systems. You enjoy creating solutions to operations problems. You have a holistic knowledge of our systems and services, can re-engineer processes when they need it, and can clearly communicate the necessary changes. You understand how various teams operate and can reduce the effort they need to deliver new processes and services. You have the grace to stay calm when production services are down and the courage to ask the right people for help to bring them back up. You enjoy collaborating with people from other teams and disciplines to make plans a reality.
- Develops solutions to increase systems and service stability through automation and process re-engineering
- Installs, manages, and maintains both physical and cloud infrastructure to enable delivery of compute, storage, and network resources
- Develops SLIs and SLOs for production services and helps monitor overall system health
- Builds and supports tools and systems that engineers use to deploy their services into production
- Participates in rotating on-call duties on a global 24x7x365 team
- Keeps job knowledge current by studying state-of-the-art tools and techniques, participating in educational opportunities, reading professional publications, maintaining personal networks, and participating in professional organizations
- Helps development teams operationalize their efforts to enable self-ownership of production services
- Experience with UNIX-based systems, including UNIX tools such as SSH, grep, sed, awk, and find
- An understanding of networking and core Internet protocols (e.g. TCP/IP, BGP, IS-IS, DHCP, NAT, IPsec, ECMP, DNS, TLS, SMTP, HTTP)
- Hands-on experience working with both wired and wireless L2/L3 equipment, e.g. Cisco, Juniper, Aruba, Palo Alto
- In-depth understanding of LAN technologies and topologies, including VLANs, 802.1Q, 802.3ad
- Familiarity with optical network technologies (DWDM), pluggable transceivers (SFP, SFP+, QSFP+), and single-mode and multimode fiber
- Experience using a modern programming language (Go, Java, Node.js, Ruby, etc.)
- Ability to script in a shell language (Bash or POSIX shell)
- Experience with public cloud providers (AWS, Google Cloud Platform, etc.) is a plus
- Experience working with containers (Docker, Kubernetes, ECS, etc.) is a plus
- Comfort with frequent, incremental code testing and deployment
- Understanding of the role of automation tools (Terraform, Jenkins, Concourse CI, Bitbucket Pipelines, etc.)
- Comfort with collaboration, open communication, and reaching across functional borders
- Ability to remain calm under pressure and take command of a recovery effort
- BA/BS in Computer Science or equivalent experience
- All your information will be kept confidential according to Equal Employment Opportunities guidelines.