Principal Backend Engineer - Grafana
What is Grafana Cloud?
Grafana Cloud is our composable observability platform that integrates visualizations on metrics, logs, and traces with Grafana. It allows our customers to leverage the best open source observability software – including Prometheus, Mimir, Loki, and Tempo – without the overhead of installing, maintaining, and scaling their own observability stack.
The Grafana team within engineering is responsible for Grafana, the highly successful open source project with over a million instances running in the wild, as well as our enterprise-ready Grafana Enterprise offering. Grafana is also the main frontend for Grafana Cloud, where users can visualize their telemetry data and use our opinionated solutions for easier troubleshooting of both their infrastructure and their applications. As our SaaS business continues to grow, we've started to change Grafana's core architecture with the goal of making it fully multi-tenant and scalable, as well as a solid platform for our opinionated Cloud apps.
We're looking for a principal-level engineer with a distributed systems background who can drive this change and turn Grafana into a proper app platform where open source as well as proprietary apps can directly tap into dashboards, alerts, incidents, and telemetry, and deliver even more integrated experiences.
As a company, we are remote-first and global. We embrace people of different experiences and backgrounds to build diverse teams where every person brings a new perspective to the software. Our tech stack is mostly made up of services written in Go, running on multiple Kubernetes clusters and using MySQL as storage while we develop our new storage layer.
What will you be doing?
- Take an active role in influencing our roadmap and your own career objectives
- Work with your team to deliver new features, then use the results to iterate and improve
- Drive innovations from initial ideation all the way to operations once they are in the hands of customers
- Embrace our open-source culture and contribute to other projects that may not directly fall within your team’s scope
- Design, build, operate, and maintain critical systems, owning the reliability, performance, and availability
- Be a part of your team’s on-call rotations and take ownership of the services you’re running
- Mentor and support other team members, participate in design discussions, and collaborate with the team. Drive towards decisions and embrace our culture of “don’t let perfect get in the way of great”.
- Learn new skills by gaining a deeper understanding of our cloud product and our customers and getting to know the codebase of a large distributed system
As we are remote-first and our engineering organization is largely remote, we provide guidance and meet regularly using video calls, so an independent attitude and good communication skills are a must.
What are we looking for in you?
- You are a motivated self-starter with a bias towards action
- You are customer focused. We build everything with our users in mind.
- You have a passion for creating intuitive products that fit customers’ needs
- Pragmatism: You are able to take on complex challenges and break them down to achieve short feedback loops. You analyze, design, and build modular solutions, deliver MVPs, gather data and feedback, and then progress iteratively
- Collaboration and communication: The smallest unit we have is a team. You’ll be working with your teammates in a fully remote setup. Good communication skills are a must
Requirements:
- Solid experience with at least one programming language. We use Go, but if you have familiarity with Python, C, C++, Rust or similar then that translates well
- Experience delivering projects in a self-driven way, from gathering requirements and brainstorming ideas all the way to shipping a product into the customer’s hands
- Solid experience with developing software that runs in the Cloud or some experience with systems engineering
- Experience writing clean, robust, and performant software that is easily maintained by others
Nice to haves:
- Experience working with Kubernetes, or building and operating a storage service
- Experience using Grafana and Prometheus in operational roles (including on-call for your team at a previous employer or just using these tools on hobby/homelab projects)
- Exposure to microservices architecture and distributed systems, or a desire to learn
- Familiarity with being on-call and performing operations/SRE tasks or with the concept of infrastructure as code
In the UK, the base compensation range for this role is £109,061 - £136,327. Actual compensation may vary based on level, experience, and skillset as assessed in the interview process. Benefits include equity, bonus (if applicable) and other benefits listed here.
*Compensation ranges are country specific. If you are applying for this role from a different location than listed above, your recruiter will discuss your specific market’s defined pay range & benefits at the beginning of the process.
About Grafana Labs: There are more than 20M users of Grafana, the open source visualization tool, around the globe, monitoring everything from beehives to climate change in the Alps. The instantly recognizable dashboards have been spotted everywhere from a NASA launch and Minecraft HQ to Wimbledon and the Tour de France. Grafana Labs also helps more than 3,000 companies -- including Bloomberg, JPMorgan Chase, and eBay -- manage their observability strategies with the Grafana LGTM Stack, which can be run fully managed with Grafana Cloud or self-managed with the Grafana Enterprise Stack, both featuring scalable metrics (Grafana Mimir), logs (Grafana Loki), and traces (Grafana Tempo).