6sense helps B2B marketing and sales organizations fully understand the complex ABM buyer journey. By combining intent signals from every channel with the industry’s most advanced AI predictive capabilities, it is finally possible to predict account demand and optimize demand generation in an ABM world. Equipped with the power of AI and the 6sense Demand Platform™, marketing and sales professionals can uncover, accelerate, and capture buyer demand to drive more revenue.
Senior Infrastructure Software Engineers at 6sense are true hybrid developers and operations engineers. They are responsible for ensuring our services and infrastructure are fast, stable, and scalable. They build out any services and tooling we need that are not readily available via third-party packages or services. They provide guidance on best practices to the overall Software Engineering team.
Operational tasks such as infrastructure, build/release, CI/CD, database administration, and systems administration also fall within their realm of responsibilities.
The Reliability team focuses on the automation, integration, operation, and overall improvement of our monitoring, logging, and alerting services to ensure we can deliver product quickly, safely, and reliably.
Develop and deploy services to improve the availability, ease of use/management, and visibility of 6sense systems
Build and scale out our services and infrastructure
Learn and adopt technologies that may aid in solving our challenges
Own our monitoring, logging, and alerting tools used by the overall Software Engineering team in order to ensure we are meeting reliability requirements
Write/review/debug production code, develop documentation and capacity plans, and debug live production problems
Contribute back to open-source projects when we need to add or patch functionality
Support the overall Software Engineering team to resolve any issues they encounter
Help respond to service issues and determine how to automatically alert the responsible parties with enough context to make the service owner a self-sufficient first responder
Act as first responder to issues with shared infrastructure and escalate to other team members as necessary
Write configurations and scripts to pull data into our monitoring/logging/alerting systems
Work with other teams to put automatic resolutions in place and reduce the need for human response
Participate in on-call rotations to monitor platform/infrastructure issues
4+ years in a Software Engineering role or equivalent experience
4+ years in a reliability-type role (such as Site Reliability Engineering)
4+ years of experience with Linux/Unix system administration and networking fundamentals
Strong coding fundamentals and good code-reading skills
Good knowledge of Python and Java
Experience monitoring and analyzing services/applications in a service-oriented architecture at the network and server level, as well as in containerized environments (such as Kubernetes and Docker)
Experience with high availability
Experience using and configuring monitoring systems such as Datadog, Grafana, Grafana Loki, Prometheus, Sumo Logic, and PagerDuty
Knowledge of the Hadoop ecosystem (e.g. Hadoop, Hive, Presto) including deployment, scaling, and maintenance