We're looking for a Data Engineer to help grow our data processing pipeline! We perform billions of network handshakes and DNS lookups per hour and consume external data feeds to maintain an up-to-date view of all hosts and networks on the Internet. You will help build and maintain the processing pipeline that consumes inbound data feeds to produce a consistent view of Internet hosts. We use the Google Cloud Platform (including Google Dataflow, Bigtable, and BigQuery) to process data, and we build our own analysis tools. Your responsibilities will include exploring new ways of processing and analyzing incoming network data and building out our data processing pipeline.
The types of things you’ll do:
Work with Apache Beam, Airflow, Google Dataflow, Bigtable, and BigQuery to build the next generation of the Censys data processing pipeline
Design automated solutions for building, testing, monitoring, and deploying ETL pipelines in a continuous integration environment
Work with application engineers to develop internal APIs and data solutions to power Censys product offerings
Coordinate with the backend engineering team to analyze our data and improve its quality and consistency
What we're looking for:
Bachelor's degree in Computer Science or related field, or equivalent experience
3+ years of full-time industry experience
Deep understanding of relational and NoSQL data stores and approaches (e.g., Snowflake, Redshift, Bigtable, Spark)
Hands-on experience building data processing pipelines (e.g., in Storm, Beam)
Proficiency with object-oriented and/or functional languages (e.g., Java, Scala, Go)
Strong scripting ability in Python, Ruby, or Bash
We are headquartered in Ann Arbor, Michigan, but are open to this position being fully remote.
We celebrate diversity and are committed to creating an inclusive environment for all employees. Censys is an equal opportunity employer.