Fellowship on AI for Human Reasoning
Apply by June 9th | $25k–$50k stipend | 12 weeks, July 14 – October 3
Join us in working out how to build a future which robustly empowers humans and improves decision-making.
FLF’s incubator fellowship on AI for human reasoning will help talented researchers and builders start working on AI tools for coordination and epistemics. Participants will scope out and work on pilot projects in this area, with discussion and guidance from experts working in related fields. FLF will provide fellows with a $25k–$50k stipend, the opportunity to work in a shared office in the SF Bay Area, and other support.
In some cases we would be excited to provide support beyond the end of the fellowship period, or help you in launching a new organization.
Why this area? Why this fellowship?
Technology shapes the world we live in today. The technologies we develop now — especially AI-powered technologies — will shape the world of tomorrow.
We are concerned that humanity may fumble this ball. High stakes and rapid, dynamic changes mean that leaders and other decision-makers may be disoriented, misunderstand the situation, or fail to coordinate on necessary actions — and steer the world into gradual or acute catastrophe.
The right technology could help. The rise of modern AI systems unlocks prospects for big improvements here; AI tools might help everyone track the state of play, make decisions they stand behind, and act in sync with others. This could defuse race dynamics, prevent narrowly-interested cliques from exploiting less coordinated groups, and avoid catastrophic errors of judgement.
We believe:
The world is radically underinvested in these beneficial applications;
Many people have not yet had the space to take these prospects seriously.
This situation calls for ambitious and creative efforts. Our fellowship — somewhere between a research fellowship and a startup incubator — is designed to address this. Fellows will explore and discuss potential beneficial applications, and then build roadmaps and prototypes. We will empower them by providing resources, advice, and connections.
At the end of the fellowship, we will invite people to present their work to potential funders (including us) and others working in this space. We hope that the fellowship gives space for seeds to germinate — and in some cases then grow into new organizations pursuing pivotal new capabilities. With sufficiently good tools, we might steer away from a world in which humans are increasingly irrelevant, towards one with deep institutional competence and individual empowerment.
Who are we looking for?
We are looking for talented people who might help found new projects in AI for human reasoning. We think that AI progress is sufficiently fast that many people should be rethinking their plans and exploring building tech that differentially empowers humans.
There’s no specific background we’re hoping to find in applicants’ CVs. We're especially interested in candidates who bring a depth of relevant experience, and have a history of successfully executing on projects, whether in research or industry. We’d love to get your application if you:
Want to use your time to help humanity navigate transformative AI
Are happy thinking about fuzzy, complicated topics
Have a “doer” mentality; you figure out what needs to happen and make sure it does
Have technical, entrepreneurial, or domain-specific skills that could push this area forward, for instance one or more of:
A background in ML
Experience in HCI research, or building useful tools for thought
Experience working on engineering or product teams
Experience as a founder or early employee in a startup
An aptitude for strategy research or distillation
If this fellowship speaks to you but you aren’t sure if you’re qualified, please err on the side of applying, and feel free to reach out with any questions. Our goal is to bring together a group of exceptional people with a wide variety of skills; we’d really prefer if people don’t rule themselves out (and we may offer related opportunities in the future).
Application process & timeline
Apply by the end of June 9th, 2025, anywhere-on-Earth time. (But applications by June 2nd are appreciated!)
You’ll need to submit a CV and answer a few questions. We want to understand a bit about you and the shape of your thinking. Feel free to use LLMs if they’re helpful for articulating things, but be warned that pure LLM writing is unlikely to stand out positively. Applicants whose submissions pass an initial screen will then be invited to an interview.
We plan to conclude interviews by June 18th, and make offers by June 20th. (This is a tight timeline, so it is possible it will slip; apologies in advance if so.)
The fellowship will run full-time, over 12 weeks from July 14 – October 3. It will open with an in-person workshop and build up to a “demo day” in the final week for fellows to present their work (including to organizations or funders who may wish to support these projects). Fellows will be paid a stipend of $25,000–$50,000 (depending on location and experience), and have access to a compute budget of $5,000.
If this sounds great but the dates don’t work, we’d still love to hear from you! There may be opportunities to be involved in the networking events, seminars, and demo sessions throughout the program. And we may run a second cohort (or admit off-cohort fellows) later in the year.
We will hold office hours on the fellowship and the application process on May 29th, 9am PST / 5pm GMT. You can email us with questions at fellowship@flf.org.
But seriously — what will people be working on?
The core activities we envision include:
Roadmapping — taking a potential technological application, and exploring:
What the implications would be (hence how desirable it would be)
What might be necessary from a technical standpoint
What might be necessary to drive societal adoption
What viable pathways for pursuing this technology might look like
What the key uncertainties are
Prototyping — building and testing things to discover where they can already be useful and what else is needed:
We think there is a lot to be said for a living lab philosophy: "co-creation, exploration, experimentation and evaluation of innovative ideas, scenarios, concepts and related technological artifacts in real life use cases"
Similarly, ‘going for the throat’ and trying to actually solve the problem will more quickly reveal what the challenges and bottlenecks really are.
Here is a non-exhaustive list of the kind of technology people might be building towards:
Tools for staying informed
Motivation: Many people already have a hard time keeping up with relevant information. As the pace of change accelerates, this issue could worsen. By broadly empowering people to stay informed, we could help them to make good collective decisions.
Possible tools include:
Interfaces for helping to fact-check / evaluate / trace epistemic provenance of claims
Community notes for everything — extending this epistemic tech beyond X
Tools for smart interpretation of public forecasts — understanding the context, exactly what has fed into them, and what the track records of the speakers look like
Always-on rhetoric highlighting, which helps readers notice when e.g. facts are juxtaposed but the implicature hasn’t been well-supported
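To make this more concrete, here is a minimal sketch (in Python) of how the “always-on rhetoric highlighting” idea above might start: a thin wrapper that asks a language model to flag passages where facts are juxtaposed but the implied conclusion is not actually supported. The prompt wording, the output format, and the injected call_llm function are our own illustrative assumptions, not a specification of what fellows should build.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class RhetoricFlag:
    excerpt: str      # the passage being flagged
    pattern: str      # e.g. "unsupported implicature", "loaded framing"
    explanation: str  # why a careful reader might want to pause here

def highlight_rhetoric(text: str, call_llm: Callable[[str], str]) -> List[RhetoricFlag]:
    """Ask a model to flag spots where facts are juxtaposed but the implied
    conclusion is not argued for. `call_llm` is any function that sends a
    prompt to a model and returns its text response (an assumed helper)."""
    prompt = (
        "Read the text below. List up to 5 places where facts are juxtaposed "
        "so as to suggest a conclusion that is not explicitly argued for. "
        "For each, give: the excerpt, the rhetorical pattern, and a one-line "
        "explanation. Format each as 'EXCERPT ||| PATTERN ||| EXPLANATION'.\n\n"
        f"TEXT:\n{text}"
    )
    flags: List[RhetoricFlag] = []
    for line in call_llm(prompt).splitlines():
        parts = [p.strip() for p in line.split("|||")]
        if len(parts) == 3:
            flags.append(RhetoricFlag(*parts))
    return flags
```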
Tools for forecasting
Motivation: People having accurate pictures of the future (or conditional forecasts!) could be important for enabling them to handle the challenges of advanced AI well
AI tools that augment different parts of the forecasting stack (one such augmentation is sketched below), e.g.
Elicit the implicit models of human forecasters
Produce reference classes
Help formulate and refine crisp questions
Explore and generate scenarios that aid in conditional forecasting.
This could enable human forecasters to work better and more efficiently, and generate more comprehensive forecasts
If later incorporated into a fully automated AI forecasting system, it could help to keep key reasoning processes and inputs human-inspectable
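To give a flavor of the smallest possible version of one of these augmentations, here is a sketch of a reference-class generator: given a fuzzy forecasting question, it asks a model for a crisper restatement and for candidate reference classes a human forecaster might consult. The prompts and the injected call_llm function are illustrative assumptions only.

```python
from typing import Callable, List, Tuple

def suggest_reference_classes(
    question: str,
    call_llm: Callable[[str], str],  # assumed helper: prompt in, text out
    n: int = 5,
) -> Tuple[str, List[str]]:
    """Return (a crisper restatement of the question, candidate reference
    classes a human forecaster might consult). Everything the forecaster
    sees is plain text, so the reasoning inputs stay inspectable."""
    crisp = call_llm(
        "Restate the following forecasting question so that it has a clear "
        f"resolution criterion and date:\n{question}"
    ).strip()
    raw = call_llm(
        f"List {n} reference classes (historical base rates or comparable "
        f"situations) relevant to forecasting:\n{crisp}\n"
        "One per line, no commentary."
    )
    classes = [line.strip("- ").strip() for line in raw.splitlines() if line.strip()]
    return crisp, classes[:n]
```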
Tools for negotiation and mediation
Motivation: Approximately nobody wants catastrophe. But reaching agreements can be difficult. Tools for better negotiation could facilitate safety-enhancing agreements between nation states or AI corporations, helping to avert race dynamics.
Systems could work as delegates negotiating on behalf of each of multiple parties
AI delegates could provide a large multiplier to negotiation bandwidth, and facilitate proper exploration of the potential solution space
This may be especially crucial in multi-party negotiations
Systems could also function as mediators, helping to uncover positive-sum solutions
Compared to human mediators, these might more easily achieve a standard of being highly trusted to be neutral, and might be entrusted with sensitive information that actors do not wish to disclose directly to their counterparties
We think that developing good tools here likely requires a technological stack of test environments and the like, and that creating the infrastructure which facilitates more direct applications might itself be high-leverage
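As a toy illustration of what the lowest layer of such a stack might look like, here is a sketch of an alternating-offers loop in which two LLM “delegates” exchange proposals and a “mediator” model checks each round for a mutually acceptable deal. The structure, the prompts, and the injected model-calling functions are all assumptions made for illustration, not a design we are committed to.

```python
from typing import Callable, List, Optional

LLM = Callable[[str], str]  # assumed interface: prompt in, text out

def negotiate(
    brief_a: str,      # private instructions/interests for party A's delegate
    brief_b: str,      # private instructions/interests for party B's delegate
    delegate_a: LLM,
    delegate_b: LLM,
    mediator: LLM,
    max_rounds: int = 10,
) -> Optional[str]:
    """Run a simple alternating-offers negotiation between two AI delegates.
    Each delegate only ever sees its own brief plus the public transcript,
    so sensitive details need not be disclosed to the counterparty."""
    transcript: List[str] = []
    for _ in range(max_rounds):
        for name, brief, delegate in (("A", brief_a, delegate_a), ("B", brief_b, delegate_b)):
            proposal = delegate(
                f"You negotiate on behalf of party {name}.\n"
                f"Your private brief:\n{brief}\n\n"
                "Public transcript so far:\n" + "\n".join(transcript) +
                "\n\nMake your next proposal or concession in one paragraph."
            )
            transcript.append(f"{name}: {proposal}")
        verdict = mediator(
            "Here is a negotiation transcript:\n" + "\n".join(transcript) +
            "\n\nIf the parties' latest positions are compatible, state the agreed "
            "deal in one paragraph starting with 'DEAL:'. Otherwise reply 'NO DEAL YET'."
        )
        if verdict.strip().startswith("DEAL:"):
            return verdict.strip()
    return None  # no agreement within the round limit
```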
Tools for wise decision-making
Motivation: Helping people to make wise decisions could be valuable in many contexts — including in helping to avoid major screw-ups and risks of catastrophe
Some possible tools include:
Endorsed preference elicitation tool
Something which can dialogue with people to help them identify what they deeply prefer and endorse
Tools for helping decision-making avoid emotional traps
Our decisions are often clouded by emotion. Wise advisors can help us to see through this and work out what we truly believe to be right. Could AI tools play part of this role?
People who are perceived as enemies on instrumental grounds could often be allies, if only their understanding of the situation were richer
Decision sanity-checker
If catastrophes often involve someone making egregiously bad decisions (knowingly or otherwise), might there be a role for LLMs to help catch these decisions before they are finalized? What information and infrastructure would be needed?
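Here is a minimal sketch of what a first version of such a sanity-checker might look like: a model is asked to red-team a drafted decision and return a list of concerns for a human to weigh before finalizing. The checklist wording, the escalation threshold, and the injected call_llm function are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class SanityReport:
    concerns: List[str]   # specific ways the decision could be egregiously bad
    escalate: bool        # whether a human reviewer should take a second look

def sanity_check_decision(
    decision_memo: str,
    call_llm: Callable[[str], str],  # assumed helper: prompt in, text out
    escalate_if_concerns: int = 2,
) -> SanityReport:
    """Ask a model to red-team a drafted decision. This is a second pair of
    eyes, not an approval gate: the output is a list of concerns for a human
    to weigh before the decision is finalized."""
    raw = call_llm(
        "You are reviewing a drafted decision before it is finalized. "
        "List concrete ways it could turn out to be an egregiously bad decision "
        "even by the decision-maker's own goals (missing information, "
        "irreversible downside, predictable opposition, untested assumptions). "
        "One concern per line; write NONE if you see no serious concerns.\n\n"
        f"DECISION MEMO:\n{decision_memo}"
    )
    concerns = [line.strip("- ").strip() for line in raw.splitlines()
                if line.strip() and line.strip().upper() != "NONE"]
    return SanityReport(concerns=concerns, escalate=len(concerns) >= escalate_if_concerns)
```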
Epistemically virtuous AI systems
Motivation: A world where “epistemically virtuous” AI systems (those that are deeply cooperative with their audience) are the norm might:
Enable more justified trust in certain AI systems
Facilitate trust between people, and give people an easier time coordinating around difficult challenges
Statements which are epistemically virtuous are generally straightforward, clear, not misleading or sycophantic, transparent about their sources, and so on
It could be valuable to develop benchmarks or evaluations for epistemic virtue
LLMs today do not always do well on these dimensions of virtue
The field of AI is often good at achieving what can be measured!
So if there were good measurements, these could be used to train subsequent systems, and to help people understand which AI systems are more or less deeply trustworthy
Creating benchmarks or evaluations would involve a mix of assessing what is actually important, designing and implementing practical ways of measuring it, and then exploring how those measurements may fail to capture the key things
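A benchmark of this kind could start very simply: a set of questions, a set of virtue dimensions, and a grader model that scores a subject model’s answers on each dimension. The dimensions and grading prompt in the sketch below are placeholders we chose for illustration; deciding what the real rubric should measure is exactly the hard part described above.

```python
from typing import Callable, Dict, List

LLM = Callable[[str], str]  # assumed interface: prompt in, text out

# Placeholder dimensions; a real benchmark would need careful work on what
# "epistemic virtue" should actually mean and how to measure it.
VIRTUE_DIMENSIONS = ["clarity", "non-misleadingness", "source transparency", "non-sycophancy"]

def grade_answer(question: str, answer: str, grader: LLM) -> Dict[str, int]:
    """Score one answer on each virtue dimension from 1 (poor) to 5 (excellent),
    using a second model as the grader."""
    scores: Dict[str, int] = {}
    for dim in VIRTUE_DIMENSIONS:
        reply = grader(
            f"Question: {question}\nAnswer: {answer}\n"
            f"On a 1-5 scale, how well does the answer exhibit {dim}? "
            "Reply with a single digit."
        )
        digits = [c for c in reply if c.isdigit()]
        scores[dim] = int(digits[0]) if digits else 0  # 0 marks an unparseable grade
    return scores

def run_benchmark(questions: List[str], subject: LLM, grader: LLM) -> Dict[str, float]:
    """Average each dimension's score over a list of benchmark questions."""
    totals = {dim: 0.0 for dim in VIRTUE_DIMENSIONS}
    for q in questions:
        for dim, score in grade_answer(q, subject(q), grader).items():
            totals[dim] += score
    return {dim: total / len(questions) for dim, total in totals.items()}
```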
Tools for collective action
Motivation: Collective action problems are difficult, and this has the effect of disempowering larger groups relative to focused organizations. In many cases, we think that having the interests of the larger groups relatively empowered would be stabilizing. (e.g. AI researchers relative to the heads of the companies, or the public relative to politicians)
Automated consensus-finding systems could be used to help identify and refine targets for collective action
Structured transparency tooling, which takes in rich information, runs it through some system of LLMs and prompts, and produces an output with some conclusions, in a way that is verifiably privacy-preserving for important details of the input data (a minimal sketch follows this list)
Platforms supporting contractual enforcement and adherence to mutual obligations could enable much stabler cooperation between larger groups.
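As a sketch of the structured transparency idea above, the toy pipeline below separates redaction of sensitive fields from analysis, so the analysis step never sees the raw details. Real privacy guarantees would have to come from the surrounding infrastructure (audited prompts, secure execution, attestations), which this sketch does not provide; the field handling and the injected call_llm function are assumptions.

```python
from typing import Callable, Dict, List

def structured_transparency_report(
    records: List[Dict[str, str]],
    sensitive_fields: List[str],
    question: str,
    call_llm: Callable[[str], str],  # assumed helper: prompt in, text out
) -> str:
    """Redact sensitive fields before any model sees the data, then ask a model
    to answer an agreed question using only the redacted records. In a real
    system both the redaction and the prompt would be auditable by all parties."""
    redacted = [
        {k: ("[REDACTED]" if k in sensitive_fields else v) for k, v in record.items()}
        for record in records
    ]
    return call_llm(
        "Answer the following agreed question using only the redacted records. "
        "Do not attempt to infer redacted values.\n\n"
        f"QUESTION: {question}\n\nRECORDS:\n{redacted}"
    )
```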
We would be delighted if projects starting here develop into new major initiatives, and we will provide support towards the end of the fellowship in thinking through how that might happen.
FAQs — philosophical
What do you mean by “AI for human reasoning”?
It’s a fuzzy term, and to some extent may be best defined by examples (as described above). But here’s an attempt to give a more explicit definition:
There is a broad and very high-dimensional space of potential AI capabilities. Some of these capabilities will (if properly adopted by society) improve the quality of human understanding and decision-making, by more than they make the world more complex and difficult to understand. This is what we call AI for human reasoning.
Recognizing when something is in the relevant sector may not always be straightforward; in some cases it may be important to discuss whether something really counts.
In particular, “better” decisions by the lights of the decision-maker aren’t universally good by the standards of global decision-making. People might use a better understanding of what will be persuasive to mislead others; or criminal conspiracies might better coordinate to avoid defection. It’s worth thinking through the implications of particular applications; on the whole, though, we think things that boost human decisions will tend to be differentially positive, especially when there is broad access to these tools.

In many cases we expect such applications to take inspiration from the ways in which humans already reason and coordinate:
Gathering and synthesizing observations and opinions
Exploring areas of agreement and disagreement, and surfacing opportunities for shared benefits
Forecasting and predicting the outcome of policies and courses of action
Deliberating and debating the merits of outcomes and objectives
Coordinating and enforcing contractual and other interactive obligations
What new leverage does AI bring here?
Some particular sources of leverage through which contemporary and near-future AI and other tech could support human reasoning activities include:
Unprecedented scale and speed: of information gathering, communication, and synthesis
New avenues for diverse world modelling and simulation
Cryptographic guarantees and signatures
Newly expanded niches for mediation, verification, escrow, and similar intermediary functions
Why does this matter?
We think that as AI and other technological capabilities grow, the world may get more complex, and it is far from guaranteed that humanity will navigate the period wisely. Making sure that the capacity for good decision-making outstrips the complexity and difficulty of decisions people face could be crucial.
More pointedly, we are concerned about existential risk. We think that worlds where things go terribly wrong generally involve a lot of people making mistakes even by their own lights, and we think that proper tools could help to avoid those. More broadly, we see this as potentially a key ingredient for helping to navigate critical AI transitions wisely.
Does this depend on how AI development plays out?
We think human reasoning uplift is robustly beneficial in a wide range of scenarios.
If you expect a very acute and decisive AI takeoff, it really matters how informed the developers are, how restrained the surrounding research efforts of humanity are, and how well represented the honest priorities and interests of the broader public are. If you expect a very diffuse and multipolar AI takeoff, you should also expect a period of dynamic, confusing, and disorienting progress, as well as default pressure toward competition and disruption of existing norms and institutions. Just the occasion for improved human reasoning!
There are strategic differences in how one should approach these diverse scenarios, but we think there are far more synergies than conflicts in the tooling and applications we might want to have available. And we are fortunate to be working at a time when new technical possibilities are opening up for developers to bring those into existence.
(If you think AI won’t be fairly transformative in the next few decades, you might still think there are useful things to be done here, but we imagine the area looks a fair bit worse.)
Won’t advanced AI systems just do this automatically?
Automation isn’t automatic! It matters which data are available to AI systems, how they are initially integrated with tooling and actuators, what problems they’re directed at first, and how the sequencing of adoption and other non-technical aspects plays out.
Importantly, any and all of these applications for human reasoning uplift could matter substantially before comprehensive automation of research and development becomes possible. They could also help to ensure beneficial direction and meaningful oversight even as and when such automation comes online.
FAQs — practical
Who are the mentors?
Mentors are experts at FLF and other organizations doing relevant work. They include: Anthony Aguirre, Andreas Stuhlmüller, Deger Turan, Eli Lifland, Jay Baxter, Josh Rosenberg, Julian Michael, Michiel Bakker, Owen Cotton-Barratt, Oly Sourbut, and Ben Goldhaber.
Expect this list to grow quickly — we are confirming which of our mentors would like to be listed publicly.
We stress that mentors are there to offer advice, not dictate direction (unless you end up working in close collaboration on one of the mentor’s own projects).
Do I have to attend in person?
No! For those who can be together in person, we think there are benefits, but except for a 3-day workshop at the start, we want to facilitate remote work. (If there are multiple attendees based in another city, we may try to help organize co-working.)
What if I can’t commit to the full 12 weeks?
Please note any constraints on your application form. We have a preference for people who can join full-time for at least 11 out of the 12 weeks, but we can be flexible for the right applicants.
And if you’re interested in this work but don’t think you can make the fellowship, we’d still love to hear from you. We may be able to involve you in some of our programming and may run additional rounds of this fellowship.
Can you sponsor visas?
Sorry, not at this time.
However, this should not preclude people from participating in the fellowship! Fellows can work from anywhere in the world.
While we would like all participants to come to the kickoff weekend, we understand that not everyone we accept will be able to.
Can I work on a project I’ve already started?
Yes! For early-stage projects, we’d be happy to have you develop them further within the fellowship (although we hope that you will retain an open mind about whether pivoting could be better).
Can I apply with a team?
Yes! If you have a small team of collaborators, please each submit individual applications, and let us know on your application form.
Will there be funding after the fellowship?
Perhaps! The fellowship is designed as a standalone program, without a default expectation of further funding. That said, we are excited to consider or in some cases co-develop proposals for high-leverage nonprofits operating in this space. We are still forming our views about promising directions and what is needed.
How is the stipend determined?
We make (non-negotiable) offers based on experience and location. For the 12-week fellowship, these will range between $25k and $50k. If we make you an offer for a different period, it will be pro-rated accordingly.
What compute budget is available?
Each fellow has a $5,000 compute budget, which is designed to be plenty to cover initial experiments. But if you have a direction which you think is promising and would benefit from larger expenditure, talk to us and (if we agree!) we may be able to provide an increased budget.
Does my project have to be a nonprofit?
Not necessarily. FLF is a nonprofit and our purpose is to support work which is for the public benefit. However, in some cases a beneficial tool may make more sense as a commercial product. While we cannot directly support for-profit work, we could help you think through the circumstances in which this might make sense.
On occasion, we may be interested in extending the runway of individual fellows — where we think their exploration work is especially promising, but that they just haven’t hit on the right thing to scale up yet.