Job Application for Member of Technical Staff - Assessment Team at FutureHouse, Inc.

About FutureHouse

FutureHouse is a philanthropically-funded moonshot focused on building an AI Scientist. Our 10-year mission is to build semi-autonomous AIs that can scale scientific research, to accelerate the pace of discovery and to provide world-wide access to cutting-edge scientific, medical, and engineering expertise. At FutureHouse, we're not just envisioning the future; we're building it.

The Assessment Team will be responsible for establishing FutureHouse as the world leader for evaluating the scientific abilities of AI systems, particularly for biology research. The goal of the Assessment Team will be to monitor the capabilities of the AI systems we are building and to tell us how far along we are on the path to an AI Scientist. This work is particularly important because robust methods for evaluating performance are essential to making progress. It is also important because these methods are how we will evaluate and develop mitigations for the risks associated with autonomous or semi-autonomous AI Scientists.

Many aspects of scientific reasoning (like inference, hypothesis generation, etc.) are extremely difficult to assess. State-of-the-art benchmarks today are mostly question-answering benchmarks, which are usually tests of knowledge, rather than reasoning. To get at the core cognitive components that make humans good at science, the Assessment Team will need to develop fundamentally new methods. We are well-positioned to do this because, unlike most AI organizations, we have real, practicing biology expertise in-house, and a wet lab for assessment methods that require lab validation.

The team will be diverse, consisting of both biology researchers with hands-on, practical wet-lab experience and AI researchers with backgrounds in alignment, benchmarks, evals, and assessments. We are particularly excited about candidates who have experience spanning both biology and AI.

This position is full-time in-person in San Francisco.

Core Responsibilities:

Conducting foundational research into assessment methods. The Assessment Team will develop new methods for evaluating the scientific reasoning capabilities of AI systems, such as generating hypotheses and interpreting data.
Developing specific benchmarks and assessment procedures. The Assessment Team will develop benchmarks and assessment procedures to measure specific aspects of a model or agent’s behavior that are pertinent to biology research.
Deploying benchmarks and assessment procedures. The Assessment Team will scale up its benchmarks and assessment procedures and apply them both internally to accelerate progress and evaluate risks, and externally to evaluate the systems being built by our partners. The Assessment Team will also make its work publicly available to the greatest extent possible, so that its methods and benchmarks can benefit the entire community.

Position Requirements:

Independent research experience and a track record of impressive outputs in wet lab biology research, computational biology research, or AI research.
For candidates who only have experience in AI research, a track record of benchmark or eval development is preferable.
Experience in managing large datasets, utilizing benchmarking tools, and employing various data visualization techniques.
Excellent verbal and written communication skills to present findings, articulate insights, and collaborate with cross-functional teams effectively.
Ability to organize and execute benchmarking projects, set timelines, and manage resources efficiently.

What we offer:

Everyone who works here is insanely smart.
We are working on problems that are way more interesting than what you'll find anywhere else. Why sell ads or optimize buttons when you can solve science and make breakthrough discoveries in basic research?
We have a ton of resources, including wet lab, compute, etc.
We publish our work.
Our headquarters in Dogpatch (SF) is super cool.

And also the normal HR stuff:

Annual salary range: $100K - $500K
401(k) plan with company matching
Great medical benefits at low or no cost to you
Flexible time off
Visa sponsorship and relocation stipend to bring you to SF, if possible
The ability to join an ambitious start-up on the ground floor

What can you expect from FutureHouse?

We are a tight-knit and collaborative research culture.
We are fiercely committed to a flat structure, team science, and individual contributions: we believe that the way to enable discoveries is to enable small, integrated teams of outstanding biologists and AI researchers to iterate rapidly towards ‘big-if-true’ ideas.
We work to flexibly hire the best talent in the world and to enable that talent to pursue their best ideas without the normal distractions, like product engineering, grant writing, etc.
