AI

A General Language Assistant as a Laboratory for Alignment

The paper my team put out just after I arrived, laying out our work on training large language models on human feedback to make a helpful, honest, and harmless AI assistant.

Philosophy

Strange Experience: Why Experience Without Access Makes No Sense

I introduce a challenge to the view that thinking about minds in a first-personal, how-it-feels way is cleanly separable from thinking about minds in a third-personal, how-it-works way. I focus on a set of thought experiments involving phenomenology without function, the ‘converse’ of widely-discussed zombie cases.

The Mental Measurement Problem: The Frictionless Epistemology of Conceptual Dualism

I show how a conceptual gap between first-person data and third-person data leads to insurmountable methodological difficulties for a science of mind.

Making Room for Modest Conceptual Functionalism

Stepping back from first-order arguments about the conceptual connection between the phenomenal and the functional, I map out the logical space of views and demonstrate how the familiar difficulties of ‘immodest’ conceptual functionalism don’t carry over to a ‘modest’ version of the view.

Dissonant Qualia: Why Phenomenal Kinds Must Have Matching Functional Kinds

Dissonant qualia cases ask: are structure-distorting phenomenal quality transformations conceptually compatible with the same underlying functional architecture? At least some of them aren’t, and that shows that some phenomenal kinds (e.g. color experience, emotion, audition) conceptually necessitate matching functional kinds. (Cf. the structure-preserving transformations of inverted qualia cases.)

Dissertation

Philosophers regularly commit themselves to a conceptual distinction between phenomenal experience and functional structure. As the thought goes, you can’t learn anything about an organism’s cognitive architecture just by learning whether or what that organism feels, and vice versa.

But this is a mistake. It generates a series of unsatisfying, intractable debates, and creates insurmountable methodological difficulties for a science of consciousness. We can’t make sense of experiences as experiences unless they meet certain functional constraints. This much can be demonstrated by considering test cases that keep the phenomenal facts fixed while zeroing out or altering the functional facts. Such cases are a kind of inversion of the classic absent qualia and inverted qualia thought experiments. The result is a modest form of conceptual functionalism that reconciles competing dualist and physicalist intuitions.