Voltron Data is an early-stage company creating high-performance data access and in-memory computing tools based on Apache Arrow to accelerate enterprise data analytics. We are a collection of open-source maintainers who have been driving open-source ecosystems over the last 15 years, particularly in the C++, Python, and R programming ecosystems.
We are assembling a global, diverse team to build a new foundation for data analytics with Apache Arrow. This foundation will usher in a wave of innovation in data processing that can take full advantage of the speed and efficiency offered by modern hardware.
We are looking for a highly motivated Senior Query Optimization Engineer to join Voltron Data’s team. On the team, you’ll have the opportunity to help support and grow the Voltron Data and Apache Arrow ecosystems. You will work closely with Voltron Data development teams to build and maintain a SQL parser and query optimizer for large-scale single-node and distributed query execution engines.
What you will be working on:
Below is a rough timeline of where you can expect to be at different points after starting in this position.

Upon joining:

    • Spending time learning about the Apache Arrow compute primitives, compute intermediate representation, compute engine, and other foundational components.
    • Familiarizing yourself with the different partners for compute kernels and the query execution engine on Apache Arrow.
    • Learning and embracing the Apache development process.

Within a month:

    • Becoming familiar with our SQL parser and query optimizer.
    • Benchmarking queries and exploring the effects of different query optimization techniques using our query optimizer.
    • Making changes and improvements to the existing query optimization rules and to how the optimizer creates physical execution plans for the execution engines under development.

Within 6 months:

    • Adding new query optimization rules.
    • Improving decision-making in cost-based optimization based on the metadata available for the tables being queried.
    • Integrating non-SQL operations into the optimization framework.
    • Working with client-facing engineers to understand performance bottlenecks in customer queries.

Within 12 months:

    • Proposing and implementing improvements to the query parsing and optimization framework.
    • Integrating with a stateful inter-query engine context to optimize the reuse of compute stages across queries that access the same data in similar ways.

Previous experience that could be helpful:

    • Building and/or using open source query optimization frameworks like Apache Calcite, Apache Spark Catalyst, Postgres Query Optimizer, and/or others
    • Developing in C++, especially using modern C++
    • Utilizing serialization libraries like FlatBuffers, Protobuf, Thrift, MessagePack, and/or others
    • Working on non-SQL systems and non-SQL computational abstractions

Additional Information:

For NYC-based applicants, the expected salary range is $175K to $210K + equity + benefits.

*Note: Disclosure as required by NYC Pay Transparency Law

Actual starting pay will be based on job-related factors, including exact work location, experience, training, and skill level, so it may be higher or lower than what is shown on this posting.


• Work from Anywhere - Payroll and Benefits in 150+ Countries
• Unlimited PTO
• Medical, Dental, and Vision
• Retirement [USA Only]
• Home Office Budget
• Continuing Education Budget
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, sex, gender, gender expression, sexual orientation, age, marital status, veteran status, or disability status. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.
