Voltron Data is an early-stage company creating high-performance data access and in-memory computing tools based on Apache Arrow to accelerate enterprise data analytics. We are a collection of open-source maintainers who have been driving open-source ecosystems over the last 15 years, particularly in the C++, Python, and R programming ecosystems.
 
We are assembling a global, diverse team to build a new foundation for data analytics with Apache Arrow. This foundation will usher in a wave of innovation in data processing that can take full advantage of the speed and efficiency offered by modern hardware.
We are looking for a highly motivated Senior C++ Storage Engineer to join Voltron Data’s team. On the team, you’ll have the opportunity to help support and grow the Voltron Data and Apache Arrow ecosystems. You will work closely with Voltron Data development teams to implement performant storage and I/O functions targeting a wide variety of networked, cloud, and local storage solutions.
 
What you will be working on:
Below is a rough timeline of what you can expect to be working on at different points after starting in this position.

Upon joining:

    • Spending time learning about the Apache Arrow memory layout, compute primitives, and APIs
    • Familiarizing yourself with the different partners for compute kernels and the query execution engine on Apache Arrow
    • Learning and embracing the Apache development process

Within a month:

    • Implementing new high-performance storage and I/O primitives
    • Benchmarking existing I/O library functions to determine where there are bottlenecks
    • Discovering and implementing optimizations in data reads and writes
    • Participating in peer code review of all PRs related to file storage and filesystem interaction
    • Contributing to technical discussions and technical design documents

Within 6 months:

    • Developing a comprehensive set of low-level benchmarks for I/O functions targeting various local, networked, and cloud storage technologies to enable monitoring for performance regressions
    • Ensuring that all filesystem interactions are compatible and performant across platforms (Linux, macOS, and Windows)
    • Identifying and building reusable software components to ensure a high-quality and maintainable codebase

Within 12 months:

    • Analyzing I/O throughput in a massively parallel and distributed query engine to identify inefficiencies and crafting solutions to tackle those inefficiencies
    • Ensuring that everything related to storage is built to the highest possible quality, balancing performance, usability, and maintainability across the Voltron Data and Apache Arrow ecosystems

Previous experience that could be helpful:

    • Strong experience developing in C++, especially using Modern C++
    • Experience developing and using various data lake storage technologies such as S3, Google Cloud Storage, and Azure Blob Storage
    • Building and using distributed networked file systems such as HDFS or Ceph
    • Experience working with technologies such as io_uring, DMA, RDMA, or GPUDirect Storage
    • Experience with different data storage file formats such as ORC, Parquet, and Avro
    • Experience with data lake table formats such as Iceberg, Delta Lake, and Hudi

Additional Information:

For NYC-based applicants, the expected salary range is $140K to $165K + equity + benefits.

*Note: Disclosure as required by NYC Pay Transparency Law

Actual starting pay will be based on job-related factors, including exact work location, experience, training, and skill level, so may be higher or lower than what is shown on this posting.

Benefits

• Work from Anywhere - Payroll and Benefits in 150+ Countries
• Unlimited PTO
• Medical, Dental, and Vision
• Retirement [USA Only]
• Home Office Budget
• Continuing Education Budget
 
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, sex, gender, gender expression, sexual orientation, age, marital status, veteran status, or disability status. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.
