Company Overview

Arcesium is a global financial technology firm that solves complex data-driven challenges faced by some of the world’s most sophisticated financial institutions. We constantly innovate our platform and capabilities to meet tomorrow’s challenges, anticipate the risks our clients encounter, and design advanced solutions to help our clients achieve transformational business outcomes.   

Financial technology is a high-growth industry as change and innovation continue to disrupt the status quo and prompt major transformation. Arcesium is at a particularly interesting time in our own growth as we look to leverage our successfully established market position and expand operations in pursuit of strategic new business opportunities. We value intellectual curiosity, proactive ownership, and collaboration with colleagues, and we empower you to meaningfully contribute from day one and accelerate your professional development.

What You'll Do

  • Design, develop, and implement scalable, reliable monitoring solutions for large distributed systems.
  • Define and implement monitoring requirements in collaboration with cross-functional teams.
  • Lead the development of monitoring architectures and strategies.
  • Integrate monitoring tools into existing infrastructure.
  • Maintain and support monitoring systems.
  • Demonstrate strong technical breadth and depth, driving innovation, evaluating new technologies, and deciphering the technical vision for engineering teams.
  • Own key contributions to technical design and architecture decisions, considering trade-offs of choices, managing risk, making decisions independently where appropriate, and presenting reasoned options for decision making by others.
  • Lead the way by writing exemplary code, documentation, and RFCs.
  • Identify, propose, develop, deploy, and own R&D projects in accordance with the technical vision and needs of the team, turning problem statements into solutions, and operating independently as needed.

What You'll Need

  • 10+ years of experience in SRE or a related field.
  • Proven experience in designing, developing, and implementing monitoring solutions.
  • Deep understanding of monitoring technologies and tools, including Prometheus, Grafana, Loki, and Tempo.
  • Experience with cloud-based monitoring systems, such as New Relic, Datadog, and Grafana Cloud.
  • Experience with log analysis tools, such as Splunk, Logstash, Fluent, and Sumo Logic.
  • Experience implementing distributed tracing with OpenTelemetry and Jaeger.
  • Strong understanding of SRE principles and practices.
  • Experience with incident response and management.
  • Exposure to chaos engineering and other reliability practices, including disaster recovery, is good to have.
  • Experience with cloud computing platforms such as AWS.
  • Experience with Kubernetes.
  • Experience with Agile practices (Scrum).
  • Excellent analytical, problem-solving, and troubleshooting skills.
  • Excellent communication and presentation skills.
  • Experience managing and mentoring engineers.
  • Ability to work independently and as part of a team.

Arcesium's Personal Data Privacy Notice for Candidates is linked here

Arcesium and its affiliates do not discriminate in employment matters on the basis of race, color, religion, gender, gender identity, pregnancy, national origin, age, military service eligibility, veteran status, sexual orientation, marital status, disability, or any other category protected by law. Note that for us, this is more than just legal boilerplate. We are genuinely committed to these principles, which form an important part of our corporate culture, and we are eager to hear from extraordinarily well-qualified individuals with a wide range of backgrounds and personal characteristics.
