About FutureHouse

We are a new philanthropically funded moonshot focused on building an AI Scientist. Our 10-year mission is to build semi-autonomous AIs that can scale scientific research, to accelerate the pace of discovery and to provide worldwide access to cutting-edge scientific, medical, and engineering expertise. At FutureHouse, AI researchers and wet lab biology researchers work together on building the future of science.

The Assessment Team will be responsible for establishing FutureHouse as the world leader in evaluating the scientific abilities of AI systems, particularly for biology research. The goal of the Assessment Team will be to monitor the capabilities of the AI systems we are building and to tell us how far along we are on the path to an AI Scientist. This work is particularly important because robust methods for evaluating performance are essential to making progress. It is also important because these methods are how we will evaluate, and develop mitigations for, the risks associated with autonomous or semi-autonomous AI Scientists.

Many aspects of scientific reasoning (like inference, hypothesis generation, etc.) are extremely difficult to assess. State-of-the-art benchmarks today are mostly question-answering benchmarks, which are usually tests of knowledge, rather than reasoning. To get at the core cognitive components that make humans good at science, the Assessment Team will need to develop fundamentally new methods. We are well-positioned to do this because, unlike most AI organizations, we have real, practicing biology expertise in-house, and a wet lab for assessment methods that require lab validation.

The team will be diverse, consisting both of biology researchers with hands-on, practical wet-lab experience, and AI researchers with backgrounds in alignment, benchmarks, evals, and assessments. We are particularly excited about candidates who have experience spanning both biology and AI.

This position is full-time and in-person in San Francisco.

Core Responsibilities:

  • Conducting foundational research into assessment methods. The Assessment Team will develop new methods for evaluating the scientific reasoning capabilities of AI systems, such as generating hypotheses and interpreting data.
  • Developing specific benchmarks and assessment procedures. The Assessment Team will develop benchmarks and assessment procedures to measure specific aspects of a model or agent’s behavior that are pertinent to biology research.
  • Deploying benchmarks and assessment procedures. The Assessment Team will scale up its benchmarks and assessment procedures and apply them both internally to accelerate progress and evaluate risks, and externally to evaluate the systems being built by our partners. The Assessment Team will also make its work publicly available to the greatest extent possible, so that its methods and benchmarks can benefit the entire community.

Position Requirements:

  • Independent research experience and a track record of impressive outputs in wet lab biology research, computational biology research, or AI research.
  • For candidates who only have experience in AI research, a track record of benchmark or eval development is preferable.
  • Experience in managing large datasets, utilizing benchmarking tools, and employing various data visualization techniques.
  • Excellent verbal and written communication skills to present findings, articulate insights, and collaborate with cross-functional teams effectively.
  • Ability to organize and execute benchmarking projects, set timelines, and manage resources efficiently.

What can you expect at FutureHouse?

  • We are pioneering a novel approach to scientific research: large-scale efforts by tightly focused teams to produce public goods, with open-ended scope, no set time limit, and multiple projects under one roof.
  • We are fiercely committed to a flat structure, team science, and individual contributions: we believe that the way to enable discoveries is to enable small, integrated teams of outstanding biologists and AI researchers to iterate rapidly towards ‘big-if-true’ ideas.
  • We are independent -- not affiliated with any university -- because our independence allows us to flexibly hire the best talent in the world and to enable that talent to pursue their best ideas without the distractions of teaching, admin, or grant writing.
  • A work environment of connected autonomy, to pursue your novel ideas while influencing the direction and priorities of the organization.
  • A tight-knit and collaborative academic-influenced culture, eager to read and discuss the latest papers, host tech talks, and participate in the broader research community.

What we offer

  • Competitive compensation
  • Great medical benefits at low or no cost to you
  • Flexible Time Away so you have the time you need, available right away
  • Exciting new headquarters in SF in the Dogpatch neighborhood
  • The ability to join an ambitious start-up on the ground floor
