Tally makes people less stressed and better off financially. We've built the first fully automated debt manager to help people overcome credit card debt. Currently at Series C with $92MM in funding and backing from top investors including Andreessen Horowitz and Kleiner Perkins, we are a team that is democratizing financial services to put billions of dollars back in people’s pockets. Tally’s vision is to automate people’s entire financial lives so they can worry about money less and do what they love more.
Are you driven to make a real-world impact while leveraging public cloud infrastructure, Kubernetes, and microservices? Do you love writing code to build efficient automated tools and infrastructure that save engineers’ time? If you answered yes to these questions, we would love to talk to you. We are looking for a Senior Site Reliability Engineer to join our SRE team and help Tally engineers move faster toward automating people’s entire financial lives.
At Tally, the Site Reliability Engineering team builds developer tools and manages cloud and IT infrastructure and systems that enable Engineering and Product teams to ship products quickly, securely, and with quality. Our team empowers Tally engineers to become more effective and productive through automation, tooling, and IT systems. Please come collaborate with us to make our teams even more effective at helping people achieve their financial goals.
Core Technologies: Kubernetes, ELK, Prometheus, Postgres RDS, Redis, Puppet, Terraform, Scala, SBT, CI/CD (Jenkins, Bamboo, GitLab), and various AWS services such as EC2, RDS, EMR, ECS, and Redshift.
What you'll do:
- Design, build and deploy infrastructure systems for managing our public cloud environment using containers, microservices, and next-generation orchestration and monitoring tools and technologies.
- Design, develop, and maintain robust CI/CD tools and pipelines.
- Architect Tally’s build, deploy, and release systems to be faster, more reliable, more secure, and better optimized for our public cloud.
- Build self-service capabilities to empower other engineering teams to self-provision, troubleshoot, and manage various application environments.
What you'll bring:
- Expert knowledge of at least one server-side programming language (e.g. Scala, Java, Go, Python)
- Experience with container lifecycle, orchestration, and monitoring tools and processes (Kubernetes, ECS, Prometheus) and microservices architecture
- Proficiency with CI/CD tools (e.g. Jenkins, Bamboo, GitLab, Spinnaker, Harness, CircleCI)
- Experience with distributed streaming, messaging systems, and datastores (Kafka, EMR, Postgres, DynamoDB)
- A passion for promoting a DevOps culture, with a specific focus on process automation (e.g. "Continuous X": integration, delivery, deployment, verification)
- Ability to thrive in a startup/fast-paced environment
- Prior experience managing an Akka cluster is a plus