We are seeking a Cloud DevOps Engineer who will have a critical role in owning, managing and growing a unified logging analytics service for Fuze services and systems.
This is your opportunity to join an exciting business that is experiencing significant growth. You’ll be involved in architecting and managing a high-volume cloud-based logging environment serving multiple teams and offering actionable data, alerts, reports, etc. You will have the opportunity to work on cutting-edge technologies and on business-impacting products.
Responsibilities:
Design, implement, maintain and be the subject matter expert for the monitoring and logging infrastructure (primarily Elasticsearch/ELK).
Lead the onboarding process for new log sources, delivering accurate integration, content parsing and field extraction.
Create, modify and troubleshoot data sources for various applications (internal and external), as well as manage knowledge objects while consulting with stakeholders to meet their requirements.
Perform maintenance and optimization of existing Elasticsearch deployments.
Develop and promote log management best practices and procedures for the internal teams.
Create and maintain documentation relevant to these tasks, e.g. deployment manuals, architecture diagrams, etc.
Participate in technical escalations and on-call rotation.
Requirements:
B.S. degree in Computer Science or relevant field experience.
Detailed understanding of infrastructure operations and in-depth knowledge and experience around logging solutions including log management, logging analytics and monitoring (Elasticsearch/Kibana/Logstash).
4+ years of experience with DevOps technologies, cloud-based provisioning, monitoring, and troubleshooting (preferably in AWS).
Hands-on experience with designing and operating large-scale Elasticsearch architecture and component deployments.
Experience onboarding new data sources and setting up alerts (formatting, standardization, etc.).
Extensive experience in orchestration and automation, e.g. Terraform, CloudFormation, Ansible, and CI/CD concepts and tools.
Demonstrated knowledge in building and managing large-scale deployments.
Highly proficient in administering Linux environments (Ubuntu/CentOS), including configuration of networking and security.
Demonstrable expertise around specifying, designing, and implementing system health, performance monitoring tools, and software management tools for 24x7 environments.
Strong debugging and systems analysis skills for rapid issue identification and resolution.
Scripting skills including Bash, Python, etc.
Excellent communication skills and the ability to work well in a geographically dispersed team.