Epistemic Virtue Evaluations
In AI, you get what you measure:
- We need AI systems to be working for users, not acting sycophantically or subtly selling them things.
- There are immense commercial pressures on the companies producing AI, and they don’t all incentivize honest and transparent behavior.
- We need independent third-party evaluations that let us tell which systems most help people believe true things for the right reasons — and which just appear to.
AI systems should empower people to think critically and provide complete, accurate results, not just give comfortable, reassuring answers.
We want to see an ecosystem where:
- We have great theoretical work identifying what it actually means for AI systems to genuinely embody — and enable — good, helpful reasoning
- There are public benchmarks and privately withheld evals that all major AI products are assessed on
- Journalists cover these evals and hold companies accountable when they fall short