Instructed activity

[Image: Guy assembling a bookshelf. Image courtesy Kaptain Kobold.]

“Instructed activity” means taking instructions into account while doing stuff.

Someone is watching over your shoulder and giving advice while you play a video game; or you are driving with a GPS app giving you directions; or you are checking the instruction sheet while assembling an IKEA bookshelf; or you are building a lamp from scratch, and you figured out the main steps ahead of time, and you are now carrying out your plan—a series of instructions you made for yourself.

Making sense of the instructions, and making use of them, may involve all the sorts of work we’ve covered earlier in Part Two. So this final chapter of the Part draws together all of its themes.

This chapter also serves as a bridge into Part Three. Performing a formal procedure, such as factoring a polynomial, is also following instructions. Understanding how that is similar to following street directions, and how it is different, is central to understanding the relationship between reasonableness and rationality.

This chapter recapitulates artificial intelligence research from my PhD thesis, which I quote from below. I wrote a program, Sonja, that took instructions while playing a video game; it illustrated most of the points covered here.1

Recipes are not programs

Instructions never completely specify what to do. A recipe says “lay the slices of eggplant on paper towels and sprinkle lightly with salt”; it does not say how you pick up each slice and where to position it on the paper towel; or what to do if you drop one; or whether you scatter the salt from a shaker or from between your fingers.

This is obvious, but bears emphasizing due to a potential rationalist misunderstanding. Often programming is introduced as “giving the computer instructions,” and programs are likened to recipes. This is probably helpful for novices, but potentially misleading in that a program does have to specify what will be done in complete detail. Then the computer does exactly and only what the program says.

This understanding of program execution is then sometimes read back onto human activity. One rationalist theory of action is that you write “plans” for yourself, where “plan” means a program, and then execute them.2 For many different reasons, this can’t work; one is the impossibility of foreseeing all the details of how irregular slices of eggplant will flex or slip, and so exactly how you will have to grip them. Reality is complicated, so you often have to wait to figure out how to cross a river until you get there.

Interpreting instructions in context

“Following” instructions suggests that you just do what they say. You can’t, since they can’t say everything. Also, first you have to figure out what they mean, in terms of the specifics of what you are doing. Reasonable interpretation, relying on context and purpose, is required. As explained earlier, “interpretation” does not usually require explicit reasoning, but it does always take some work, and is sometimes difficult. To make instructions useful, you need to see how they are relevant. If we are playing a collaborative video game and I yell “use a knife!”, you need to figure out what to use it for.

Maybe the point of “use a knife!” is to hurl it at a demon toad; maybe it is to jimmy a lock. Context determines what actions are possible; and purposes—what we are jointly trying to accomplish right now—determine which are relevant. What must my instruction be trying to say, given those? Well, fiddling with the lock while under attack by demons would be dumb.

So it is more accurate to say that you take instructions into account in the course of reasonable activity than that you “follow” them. Instructions are one resource among many for making sense of what’s going on.

Looking ahead to Part Three, a key difference between reasonable instruction use and performing a rational procedure is that in rationality you must not take circumstances or purposes into account. Context-stripping is much of what gives rationality its distinctive power.

Instructed looking

To take an instruction into account, most often you have to look to see what it is talking about. You use task-specific visual routines for that.

An instruction usually includes referring expressions. “Add two eggs”: you will have to go to the refrigerator, look around inside to find the egg carton, carry it to the counter, open it, and look to see how to grasp the eggs as you pick them up and crack them open.

Often an instruction explicitly tells you what to look at, or for, or how. In a video game: “Look out! Specter on the left! In the shadow!”

Locating the bits referred to in the instructions to flat-pack furniture is notoriously difficult.3 “Attach rail (C) to front legs (D) with four bolts (J).” There’s a mess of miscellaneous bits on the floor; which are the bolts? Oh, these. No, wait, there’s long bolts and short ones… Are these type J? You page back through the instructions to the parts list, hold a bolt up next to the picture, and look back and forth between the two.

What counts as doing that?

Instructions rely on a background of meanings shared between the instruction giver and taker. A recipe has to assume that you already know how to cook, and when it says “add two eggs” it doesn’t need to explicitly direct you to remove their shells first. For instructions to be interpretable, the shared meaning must be nearly complete; instructions concern only the tiny bit the user may be missing.

This might be misunderstood, in the action-as-program-execution model, in terms of instructions functioning as high-level function calls. You would already have a procedure for adding eggs; you’d just need to know when to call it (with the argument 2). But you don’t have a procedure for adding eggs; you have to improvise it every time, because real-world circumstances are never exactly the same.
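
To make the rejected picture concrete, here is a minimal Python sketch of what the action-as-program-execution model would have to assume. Everything in it (the `Kitchen` class, the `add_eggs` procedure, the fixed carton location) is hypothetical and invented for illustration; it shows the model being criticized, not a proposal for how anyone cooks.

```python
from dataclasses import dataclass, field


@dataclass
class Kitchen:
    """Toy world state. All fields are illustrative assumptions."""
    eggs_in_carton: int = 6
    carton_location: str = "fridge door"
    bowl: list = field(default_factory=list)


def add_eggs(kitchen: Kitchen, count: int) -> None:
    """The hypothetical stored procedure that a recipe step would 'call'."""
    # The procedure hard-codes where the carton is and assumes enough eggs.
    if kitchen.carton_location != "fridge door":
        raise RuntimeError("carton is not where the procedure expects it")
    if kitchen.eggs_in_carton < count:
        raise RuntimeError("not enough eggs in the carton")
    for _ in range(count):
        kitchen.eggs_in_carton -= 1
        kitchen.bowl.append("egg")


# The recipe step "add two eggs", read as a function call with argument 2.
usual = Kitchen()
add_eggs(usual, 2)

# A slightly different kitchen: the call has nothing to fall back on.
rearranged = Kitchen(carton_location="counter")
try:
    add_eggs(rearranged, 2)
    outcome = "ok"
except RuntimeError as err:
    outcome = f"failed: {err}"
```

A person in the rearranged kitchen just picks the carton up off the counter. The fixed procedure cannot, and patching it one case at a time never finishes, since circumstances are never exactly the same.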

In my video corpus, there are many instances of someone being told “Turn left!” when doing so at that moment would run him into a wall. The player turns left when he gets to the upcoming junction the instruction implicitly refers to. But it is not merely a syntactic ellipsis for “Turn left at the junction coming up.” I have examples of someone deferring the instruction further, because when he gets to the junction an enemy starts shooting at him from the right, and turning left before killing the assailant would be suicidal. “Turn left” here ‘means’ “Turn left when it makes sense to do so.” The same qualification implicitly applies to every instruction. What makes sense depends on all the particulars of the situation. Relevance cannot be determined a priori; the use of “Turn left” is a matter for indefinite amounts and types of situation-specific interpretation.

“Get the amulet” does not tell Sonja what to do in the sense that a program instructs a computer. Getting the amulet can take a lot of work; Sonja may have to navigate about the dungeon and fight off monsters along the way. Sonja is autonomously competent at getting amulets and many other routine game activities; and you can no more tell it to perform the available primitive actions than you can tell me to send a particular sequence of neural impulses to my arm muscles. Instructions do not act like calls on high-level subroutines, either; getting the amulet may have to be interleaved with almost every other activity Sonja is capable of.4
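
The deferral of “Turn left” can be caricatured in a few lines of Python. This is a toy under loud assumptions: the `Situation` fields, the `choose_action` function, and the priority ordering are all invented here, and it is not how Sonja was implemented. It treats the instruction as standing advice that is consulted afresh at every step, not as a command executed on receipt. The two hard-coded conditions stand in for considerations that, as the passage above says, cannot actually be listed in advance.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    """Toy snapshot of what the player can see right now (illustrative)."""
    wall_on_left: bool
    enemy_attacking: bool


def choose_action(situation: Situation, pending_advice: set) -> str:
    """Pick one action, treating advice as one consideration among others."""
    if situation.enemy_attacking:
        return "fight"  # survival outranks the advice for now
    if "turn left" in pending_advice and not situation.wall_on_left:
        pending_advice.discard("turn left")  # advice is used up once acted on
        return "turn left"
    return "go forward"


advice = {"turn left"}  # given while the player is still in the corridor
timeline = [
    Situation(wall_on_left=True, enemy_attacking=False),   # corridor
    Situation(wall_on_left=False, enemy_attacking=True),   # junction, ambush
    Situation(wall_on_left=False, enemy_attacking=False),  # junction, clear
    Situation(wall_on_left=True, enemy_attacking=False),   # next corridor
]
actions = [choose_action(s, advice) for s in timeline]
```

Here the turn happens on the third step, only after the wall and the ambush are out of the way. What the sketch cannot capture is the open-endedness: a real player defers or reinterprets the instruction for reasons nobody wrote down beforehand.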

An instruction is a prospective account of activity. What counts as taking it into account?

“Adjust seat tension with provided hex key (N).” What is “seat tension”? Adjust it how? I mean, presumably I’m going to turn a hex nut, but how do I know when the seat is adjusted? Is it supposed to just feel right, or what?

An instruction may suggest what to do, or what the outcome should be—and then it’s up to you to figure out how to accomplish that. “Ensure lip (S) fits below bracket (Q).” Well, it doesn’t, it’s stuck on the bracket, and I can’t see how to slide it down any further. Do I whack it harder? Or undo the last three assembly steps and try again, keeping an eye on the bracket and making sure it behaves? Oh, or maybe it’s already “below” the bracket in the relevant sense, given that the whole thing is upside-down at this point?

The point here is not that instructions are bad (although they often are) nor that people are bad at following instructions (although we often are). It’s that, no matter how good they and we are, it still takes work to figure out what they mean for what we’re doing, and unenumerable considerations may be relevant to that interpretive process.

What’s distinctive about formal rationality is that none of this is necessary. Considerations are not unenumerable, and contextual interpretation is neither required nor allowed. That makes the job radically simpler—in some ways.

Trouble, negotiation, and repair of instructions

As with routine activity generally, instruction interpretation is normally so easy and trouble-free that we overlook its necessity. That’s why I gave the furniture assembly examples, because instruction use there is notoriously difficult. Often you can’t figure out what they are talking about, and have to just plunge ahead and hope for the best; or you think you are following them, but later realize you were doing the wrong thing.

If instructions are given live during the course of the activity, either the instruction giver or taker may recognize trouble, and initiate repair. My program Sonja, when told “Use a knife,” could complain “I haven’t GOT a knife!” In a real-world example, Helen and Claire were watching television, and the oven timer went off:

Helen: Right, could you switch that off for me.
Claire: Where is it?
Helen: No, the television. I’ll sort the oven out.5

Helen realized Claire misunderstood “that” to refer to the oven, because she couldn’t possibly not know where the television was.

When instructions don’t seem reasonable, you may have to negotiate their meaning, just as with any accounts:

E: First you have to remove the flywheel.
A: How?
E: First, loosen the two allen head setscrews holding it to the shaft, then pull it off.
A: OK. I can only find one screw. Where’s the other one?
E: On the hub of the flywheel.
A: That’s the one I found. Where’s the other one?
E: About ninety degrees around the hub from the first one.
A: I don’t understand. I can only find one. Oh wait, yes I think I was on the wrong wheel.6

A distinctive feature of formal rationality, by contrast, is its non-negotiability.

  1. David Chapman, Vision, Instruction, and Action, MIT Press, 1991.
  2. This planning theory of action dominated cognitive science in the 1950s–1980s. I spent the late 1980s successfully rebutting it, which was one main cause of the 1990s AI Winter. Sorry about that.
  3. For an amusing account, see Harold Garfinkel’s Ethnomethodology’s Program: Working Out Durkheim’s Aphorism, pp. 200–207. This is in a chapter on “Instructions and Instructed Action” that probably counts as the locus classicus for the topic in ethnomethodology.
  4. From my Vision, Instruction, and Action, p. 84; lightly edited for clarity.
  5. This dialog is slightly simplified from a transcript in Saul Albert and J. P. de Ruiter, “Repair: The Interface Between Interaction and Cognition,” Topics in Cognitive Science 10 (2018), pp. 279–313.
  6. Transcript from Barbara J. Grosz and Candace L. Sidner, “Attention, intentions, and the structure of discourse,” Computational Linguistics 12:3 (1986), pp. 175–204.