Comments on “Pop Bayesianism: cruder than I thought?”
Comments are for the page: Pop Bayesianism: cruder than I thought?
I suggest you slightly misconstrue the point of Galef’s video. You summarize her message as “don’t be so sure of your beliefs; be less sure when you see contradictory evidence.” Perhaps you would get more out of it if you viewed it rather as a suggestion to use probabilities AT ALL, rather than yes/no dichotomous beliefs.
For example, if I ask the average person, “Do you believe God exists?,” most people would answer either “Yes” or “No.” A few (agnostics) would answer “Maybe.” How many people would say “I assign a 37% chance to God’s existence”? Probably not many. Well, of course nobody’s going to say that, you’d sound like a freak, but I read Galef’s comments as saying that is how we should train ourselves to think, at least implicitly.
This I think is the meaning behind her segment describing the difference between changing one’s mind - i.e. making a total switch from “I believe X” to “I believe not-X” in one big jump in the face of overwhelming evidence - versus making small changes in one’s probability assignment with each small piece of new evidence. In other words, beliefs in “grayscale” rather than “black and white.”
One benefit of this is that we go from examining our beliefs purely for Correctness and instead also look at how well they are Calibrated. My ability to form factually correct beliefs will always be limited by my available information, my cognitive abilities (“intelligence”) and the time I have available for research and consideration of the topic. However, I can learn to be internally epistemically well-calibrated. That is, if I take all of the propositions to which I assign 20% probability - do 20% of them turn out to be correct? They should be.
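The calibration check described here is easy to make concrete. A minimal sketch in Python, using made-up prediction data (the numbers are purely illustrative):

```python
# Calibration check: of all propositions I assigned probability p,
# roughly a fraction p should have turned out true.
# (probability, actually_true) pairs -- purely illustrative data.
predictions = [
    (0.2, False), (0.2, False), (0.2, True), (0.2, False), (0.2, False),
    (0.8, True), (0.8, True), (0.8, False), (0.8, True), (0.8, True),
]

def calibration(preds, level):
    """Fraction of propositions assigned probability `level` that were true."""
    outcomes = [true for p, true in preds if p == level]
    return sum(outcomes) / len(outcomes)

print(calibration(predictions, 0.2))  # 1 of 5 true -> 0.2: well calibrated
print(calibration(predictions, 0.8))  # 4 of 5 true -> 0.8: well calibrated
```

In real calibration training the probabilities would of course be bucketed into ranges rather than matched exactly, but the principle is the same.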
This is useful for the same reason it is useful to focus on one’s technique and personal performance in sport, not just the final outcome. Winning a match depends on many outside factors. Getting better over time is about internal factors, i.e. my own performance. Nevertheless this will tend to result in winning more often. Similarly if I think about things probabilistically and learn to be better calibrated, then over time my picture of the world will grow more accurate.
Furthermore, since I cannot update on beliefs if I assign them a 100% or 0% probability, I have to admit to less-than-complete certainty in any case, leading hopefully to more intellectual modesty.
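That point can be verified directly from Bayes’ Rule: a prior of exactly 1 or 0 is unmovable by any evidence. A minimal sketch (the likelihood numbers are arbitrary):

```python
def update(prior, p_e_given_h, p_e_given_not_h):
    """Posterior P(H|E) via Bayes' Rule."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / p_e

# A moderate prior moves when evidence arrives...
print(update(0.5, 0.9, 0.1))      # ~0.9
# ...but complete certainty never moves, however strong the evidence:
print(update(1.0, 0.001, 0.999))  # 1.0 -- stuck forever
print(update(0.0, 0.999, 0.001))  # 0.0 -- stuck forever
```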
I hope this perspective is useful to you.
I see your Buddhist blog
I see your Buddhist blog criticizing Bayesianism and raise you a Bayesian blog criticizing Buddhism.
I actually have a point there, which is that I think you and he are making similar errors - attacking an entire philosophy because the popular well-known symbol of the philosophy used in very short introductions for laypeople doesn’t contain the entire philosophy in and of itself.
I think Bayes’ Theorem is pretty neat, but it’s neat for reasons you probably can’t figure out if you just stick to Bayes’ Theorem. I think this essay gives some good examples, if you haven’t already read it: http://yudkowsky.net/rational/technical . And from there you can branch out to all sorts of interesting things like complexity priors and calibration training and cognitive biases, but the basic insight is that reasoning can be mathematical and systematic and you can use knowledge to make it better.
An ex-acupuncturist, nontheist statistics fan agrees!
Excellent write up, David!
This is something I have thought about but not explored - thank you for the summary of important points.
The hyper-rationality (deluding oneself about the role of rationality in everyday life) seems to me to be a 6/10 MTAS threat for atheists; for the general population I probably agree with you that it is a 2/10 threat.
So sending your counter-meme out there more actively could serve atheists superbly. Yet, I could see theists picking up on this and getting a simplistic summary of this post itself to dismiss the value of probabilistic skills too.
Ah, the complexity of human self and other manipulation.
Your last paragraphs on non-theistic eternalism are fantastic – I will have to start sharing them in appropriate circles.
As a former acupuncturist, I largely agree with your evaluation (countering Peter Steel – to whom your comment was perfect). And as you said, homeopathy would be a more dangerous element.
I look forward to reading the folks you responded to. I will be back to read more and think about it. Thank you.
Excellent post, I
Excellent post, I unsurprisingly agree with almost all of it.
I think if you looked at the history of Bayes cultists you would find that many of them are refugees from another cult, that of Ayn Rand and objectivism. If you are caught in that dismal trap, a new belief system that says you don’t have to have absolute certainty about everything would be both a relief and an improvement.
Certainty about uncertainty
Yes, you have captured it perfectly. They are in some sense working the same area as you, but instead of fully coming to terms with nebulosity they are trying to construct a solid layer underneath it. Good luck with that!
This conversation has moved in interesting ways since I was last here!
Your comments suggest you are looking at “pop-Bayesianism” through the lens of your own work on meaningness, which is probably unfair, at least as far as groups like Less Wrong are concerned.
Meaningness is about meaning(lessness), value, purpose, significance. Bayes (actually now I think they are all about Solomonoff Induction as The Answer, which I think has pretty obvious problems that are being overlooked) is about drawing conclusions and making decisions under conditions of uncertainty about FACT, not value.
Now, these groups are doing work on ethics elsewhere (in this area I think many are hindered by the fact that they consider the philosophical tradition wrongheaded and beneath them and so ignore the many contributions other thinkers have made on the same topics they are exploring), but at least at their more basic/fundamental levels, these Rationality groups are trying to formulate methods to come to more correct beliefs and to more effectively achieve one’s ends (they do not necessarily specify what these ends should be).
“What Bayesianism claims to offer is an optimal response to nebulosity.”
I would say this is untrue, or at least only true in the area of factual uncertainty (vs. value or purpose or meaning -type ambiguities).
“That makes it appealing if you can mis-use it to blind yourself to nebulosity in general. ‘Situation is ill-defined, unstable, complex, mostly unknown? No problem! We’ll just apply Bayesianism, and that guarantees we’ll take the best possible action!’”
Do you think it is a priori impossible to develop best-practices for decision making under conditions of uncertainty?
Less Wrong-types think that it is possible, and they are trying to build it. Interpreting them charitably (and I think accurately), I would say they accept that there are problems with human cognition (heuristics & biases), they want to do better, they need a standard for what better means, and they use mathematical probabilities (with Bayes as the poster child) as that ideal standard. Their question is how to get real humans in real-world situations closer to that ideal. Currently there’s a muddle in the middle.
Another way to say it would be that they are trying to apply the theoretical insights of (aspects of) probability theory and cognitive psychology in real life and/or daily life. A very different project than yours.
So is there meaning to life or not?
no not really
no not really
No One Wrote (or is writing) Your Life
I hope to provide a box of tools for atheists to identify and free themselves from such ideologies, by going to their root—the stances, and emotional needs they address—rather than arguing with belief systems.
It is an excellent project. On my site, I flip between ‘arguing with belief systems” so as to point them to their root emotional uses (or pragmatic uses), and then ‘going to <strike>their</strike>our shared roots’ hoping to point how various belief systems can address them. (see my last diagram).
So I agree with your project, but do feel arguing with belief systems can be useful. And that being specific/concrete can be useful.
I love your reply to Jason Clark on the question “Is there meaning to life or not?” And especially your approach that understanding a person’s point-of-view can help us make more meaningful replies – speaking in the abstract can often be a waste of time.
But some additional thoughts:
(1) The mistaken notion that there is some THING called “Life” has been clear to me for a long time. We are so easily fooled by our own invention – language. But to say “life is too complicated” to have meaning seems odd – because complicated things have meaning all the time. Perhaps a person’s life is not a “purpose-intended thing” and thus can’t be evaluated as a whole to have meaning, the way a play or a novel can. Only a mythologizer can accomplish that.
(2) I have no understanding of what you mean by “meanings are not purely subjective” unless by “meaning” you mean “patterned relationships.” Which is a different sense than the normal nuance implied in “The Meaning of Life”, I think. Point: “Meaning” has a lot of senses that get a conversation into knots quickly.
Curiosities about Thomas Bayes
I don’t think it contributes much to the discussion, but I found it interesting that it wasn’t mentioned that Thomas Bayes was a Presbyterian minister.
He is known to have published two works in his lifetime, one theological and one mathematical:
(1) Divine Benevolence, or an Attempt to Prove That the Principal End of the Divine Providence and Government is the Happiness of His Creatures (1731)
(2) An Introduction to the Doctrine of Fluxions, and a Defence of the Mathematicians Against the Objections of the Author of the Analyst (published anonymously in 1736), in which he defended the logical foundation of Isaac Newton's calculus ("fluxions") against the criticism of George Berkeley, author of The Analyst.
By the way, I studied probability (and Bayes’ theorem) in high school, and I liked it, but it was pretty obvious that it was pretty useless in everyday life (as is almost everything taught in high school, but that’s another issue).
Redefining "God": as a method
To fight the naive magical thinking, the blinding patriotic rhetoric and the tribal exclusivism of religion I see several strategies:
(1) Fight the particular idiocies as they come up (frustrating but useful, even if not getting to the root of the problem)
(2) Fight religion in general (not my choice)
(3) Fight the aspects of the “Believing Mind” that generate those and similar secular habits. (my favorite)
(4) Redefine “God” so as to neuter (dis-empower) those aspects (very powerful, but not my path)
I agree with your points on Bayes. I also think many writers centuries ago could not escape their culture enough to allow any of the options above except perhaps #4. Some versions of “God” are far better than others, don’t you think? Perhaps Bayes was, like others, using “God” as a tool to get people to believe what he did. We see the same done in Buddhism - re-definitions, re-workings, re-interpretations – all methods to convince others of another path: harmful or helpful or both.
Value of Bayes' Rule
Also, the formula is almost never useful in everyday life. On the other hand, once you understand the basic principles of probability, Bayes’ Rule is obvious, and not particularly important.
I disagree. Although it’s true that the formula is almost never useful in everyday life in the sense that you’d need to do any calculations with it, it’s very useful to understand the extent to which it’s crucial for all your reasoning processes.
Understanding Bayes’ Rule helps clarify just what kinds of things should count as “strong evidence” - it tells you that Y is strong evidence about X being true to the extent Y is the kind of thing that you would only see if X were true (i.e. only X and nothing else would make you see Y). That’s a very generally applicable rule that’s useful for evaluating the correctness of a lot of different things. Yes, it is certainly possible to understand the above even without knowing the exact formula, but at least I personally found that knowing the exact reason (read: the exact math) for that rule made it feel like I understood it better.
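That reading of “strong evidence” can be made concrete with a small sketch: the same observation moves the same prior a lot or a little depending on how likely it would be if X were false (all the numbers here are arbitrary illustrations):

```python
def posterior(prior, p_e_given_x, p_e_given_not_x):
    """P(X|E) by Bayes' Rule."""
    num = p_e_given_x * prior
    return num / (num + p_e_given_not_x * (1 - prior))

prior = 0.1  # arbitrary starting belief in X

# Weak evidence: E is nearly as likely without X as with it.
print(posterior(prior, 0.8, 0.7))   # ~0.11 -- barely moves the prior

# Strong evidence: E is the kind of thing you'd only see if X were true.
print(posterior(prior, 0.8, 0.01))  # ~0.90 -- moves it a lot
```

What matters is not how probable E is given X, which is 0.8 in both cases, but the ratio between that and how probable E is without X.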
It also helps understand why people tend to hear what they expect to hear instead of what the other speaker is saying: see e.g. http://lesswrong.com/lw/hv9/rationality_quotes_july_2013/9alt . Not only that, it makes me personally more aware of the way in which I might misunderstand the claims or experiences of others, and reminds me to consciously consider alternative explanations to somebody’s words/motives besides just the first explanation that pops to mind, since I know that my priors may or may not be correct.
http://lesswrong.com/lw/2el/applied_bayes_theorem_reading_people/ (not very happy with this post, but it should get the rough ideas across)
And generally it helps me remember that in order for my beliefs to be accurate, I have to update them in a way that actually makes them correspond with reality, and Bayes’ theorem shows many of the necessary preconditions for that.
That said, I do agree that just saying “Bayes’ theorem!” is often unhelpful, and that one would instead need lots of worked examples of its implications to make the theorem really useful. (And maybe you could just give the examples and skip the formula entirely.) I tried to do something like that with the “What is Bayesianism” article as well as the “Applied Bayes’ Theorem” one, but I don’t think I did very well on either.
Formal and informal rules
your offered formulation isn’t quite what I was after, though I do appreciate the attempt to find a sympathetic interpretation. Let me try to rephrase…
If I understood you correctly, you said that “pop Bayesianism” doesn’t seem to make sense, because it claims to use Bayes’ Rule - which is a rule for calculating probabilities with numbers - despite almost never actually invoking any specific numbers. Would any of these examples make more sense to you?
A physicist, after having learned Newton’s laws of motion, knows that his infant child should be tightly secured while in a car, because the child will continue its motion even during a sudden stop, and won’t be easily held in place.
A computer scientist, after having learned the formal definitions for computational complexity classes and had some experience with applying them, finds that knowledge useful in doing informal guesses of how hard some problem might be, despite never running any numbers while doing so.
A philosopher, after having learned formal logic, intuitively recognizes patterns of logic in people’s statements, even without doing a formal analysis on them.
What I’m trying to convey here is the notion that once you have learned a formal rule describing some phenomenon, then you will start recognizing the structure described by that rule even in situations where you don’t have the exact numbers… that even if you can’t do an exact calculation, knowing the formalism behind the exact calculations will make it possible to get a rough hunch of how the thing in question could be expected to behave, given the formalism and your intuitive guesses of the general magnitude of the various numbers involved. The physicist might not bother to calculate the exact speed at which an unsecured child would be moving during a sudden stop, but he knows that it is far too fast.
The examples in my previous post, then, were intended as examples of ways in which knowing Bayes’ Rule and having some experience of some of its applications lets you recognize the “Bayes structure” of things and get a rough handle of how they behave, even when you can’t do the exact numbers.
All of that said, I do admit that I am still not personally sure of exactly how much of the “pop Bayesian” influence on my thought has actually come from an understanding of the actual Bayes’ rule itself, since I’ve only spent a rather limited time actually plugging in numbers to it. It could be that what most benefited me were the various qualitative examples of how to reason in a probabilistic manner, and general discussions about things like priors, the origins of our knowledge, and what counts as valid evidence. I would guess the same to be the case with many other “pop Bayesians”, so you could be right that it’s more of a symbol than anything.
Still, I do think that there is value in also teaching the rule itself, since it can help make the various informal arguments more clear and concrete…
Might or might not work :-)
The claim that "knowing Bayes' Rule is heuristically valuable even in the absence of numbers" is empirical. Is there any evidence for it?
All of my evidence is purely anecdotal, with all the confidence that that implies. :-)
However, I think that teaching the "balls in boxes" (frequentist) formulation for probability would probably be much better. Once you understand that formulation, it's easy to see how to apply it to many different sorts of probabilistic circumstances.
Oh, I’ve been implicitly presuming all along that the target audience already knows the very rudiments of probability, and the basic frequentist formulation of it. Though now that you point it out, that’s probably an artifact of me spending excessive amounts of time in geek circles, and not very representative of the population at large…
But yeah, if I personally were to write an introduction to these kinds of things now, I might not even mention Bayes’ rule until some later “advanced course”.
Well, hmm, it seems to me that LW has built an enormous edifice, or castle in the air, on anecdotal evidence for the efficacy of Bayesianism. Oughtn't that to bother those in the movement? Especially since evidence is the sacred principle of the movement?
When you’re saying “Bayesianism”, here, are you referring specifically to using Bayes’ rule as an instruction tool, or the whole broader emphasis on how to use probabilistic thinking? What I meant to say was that I admit that the specific claim of “knowing Bayes’ Rule is heuristically valuable even in the absence of numbers” doesn’t necessarily have strong support. But “Bayesianism” in the LW sense covers much more than that, and is more broadly about probabilistic thinking and e.g. the nature of evidence in general…
And from what little contact I’ve had with them, CFAR’s agenda is even more broad - one of their instructors was in town and trialed a small workshop that was basically about staying calm in an argument and being able to analyze what each person in the argument actually wanted, so that you could deal with the ultimate issues instead of degenerating into a shouting match. Of course, they probably wouldn’t claim that that falls under “Bayesianism”, but I don’t think they’d claim that Bayes’ Rule is all you need to know about rationality, either. (Though I don’t know CFAR that well.)
Does the average LW reader actually understand classical decision theory and how to apply it in real-world situations? I now think: probably not.
I would guess so, too. Though of course, the average LW reader is also likely to be a lurker, and less likely to be loudly praising Bayesianism; but the statement may quite likely still hold even if you change it to refer to the average LW commenter. Then again, I’m not sure of the extent to which the average LW commenter will be loudly praising Bayesianism, either… actually, now that I think of it, Yudkowsky is the only one whose LW writings I’ve seen explicitly saying that Bayesianism is something great and fantastic and incredible.
LW's definition of Bayesianism
The closest thing that LW has to an introduction to classical probability is probably Eliezer’s Intuitive Explanation of Bayes’ Theorem, though I think that it already assumed some understanding of what probability is; I’m not entirely sure.
One thing that’s worth keeping in mind is that LW’s definition of “Bayesianism” is somewhat idiosyncratic. While Eliezer does often make disparaging comments about frequentist statistics, the main ideas aren’t so much about how to apply formal Bayesian statistics - indeed, formal statistics of either kind haven’t been very much discussed on LW. What LW calls “Bayesian” is more of a general mindset which says that there exist mathematical laws prescribing the optimal way of learning from the information that you encounter, and that although those laws are intractable in practice, you can still try to judge various reasoning processes and arguments by estimating the extent to which they approximate the laws.
“Frequentists” or “non-Bayesians” in LW lingo doesn’t refer so much to people who use frequentist statistical methods, but to people who don’t think of reasoning and probability in terms of laws that you have to obey if you wish to be correct. (Yes, this is a confusing way of using the terms, which I think is harmful.) For example, from Eliezer’s Beautiful Probability:
And yet... should rationality be math? It is by no means a foregone conclusion that probability should be pretty. The real world is messy - so shouldn't you need messy reasoning to handle it? Maybe the non-Bayesian statisticians, with their vast collection of ad-hoc methods and ad-hoc justifications, are strictly more competent because they have a strictly larger toolbox. It's nice when problems are clean, but they usually aren't, and you have to live with that.
After all, it's a well-known fact that you can't use Bayesian methods on many problems because the Bayesian calculation is computationally intractable. So why not let many flowers bloom? Why not have more than one tool in your toolbox?
That's the fundamental difference in mindset. Old School statisticians thought in terms of tools, tricks to throw at particular problems. Bayesians - at least this Bayesian, though I don't think I'm speaking only for myself - we think in terms of laws.
Looking for laws isn't the same as looking for especially neat and pretty tools. The second law of thermodynamics isn't an especially neat and pretty refrigerator.
The Carnot cycle is an ideal engine - in fact, the ideal engine. No engine powered by two heat reservoirs can be more efficient than a Carnot engine. As a corollary, all thermodynamically reversible engines operating between the same heat reservoirs are equally efficient.
But, of course, you can't use a Carnot engine to power a real car. A real car's engine bears the same resemblance to a Carnot engine that the car's tires bear to perfect rolling cylinders.
Clearly, then, a Carnot engine is a useless tool for building a real-world car. The second law of thermodynamics, obviously, is not applicable here. It's too hard to make an engine that obeys it, in the real world. Just ignore thermodynamics - use whatever works.
This is the sort of confusion that I think reigns over they who still cling to the Old Ways.
No, you can't always do the exact Bayesian calculation for a problem. Sometimes you must seek an approximation; often, indeed. This doesn't mean that probability theory has ceased to apply, any more than your inability to calculate the aerodynamics of a 747 on an atom-by-atom basis implies that the 747 is not made out of atoms. Whatever approximation you use, it works to the extent that it approximates the ideal Bayesian calculation - and fails to the extent that it departs.
Bayesianism's coherence and uniqueness proofs cut both ways. Just as any calculation that obeys Cox's coherency axioms (or any of the many reformulations and generalizations) must map onto probabilities, so too, anything that is not Bayesian must fail one of the coherency tests. This, in turn, opens you to punishments like Dutch-booking (accepting combinations of bets that are sure losses, or rejecting combinations of bets that are sure gains).
You may not be able to compute the optimal answer. But whatever approximation you use, both its failures and successes will be explainable in terms of Bayesian probability theory. You may not know the explanation; that does not mean no explanation exists.
So you want to use a linear regression, instead of doing Bayesian updates? But look to the underlying structure of the linear regression, and you see that it corresponds to picking the best point estimate given a Gaussian likelihood function and a uniform prior over the parameters.
You want to use a regularized linear regression, because that works better in practice? Well, that corresponds (says the Bayesian) to having a Gaussian prior over the weights.
Sometimes you can't use Bayesian methods literally; often, indeed. But when you can use the exact Bayesian calculation that uses every scrap of available knowledge, you are done. You will never find a statistical method that yields a better answer. You may find a cheap approximation that works excellently nearly all the time, and it will be cheaper, but it will not be more accurate. Not unless the other method uses knowledge, perhaps in the form of disguised prior information, that you are not allowing into the Bayesian calculation; and then when you feed the prior information into the Bayesian calculation, the Bayesian calculation will again be equal or superior.
When you use an Old Style ad-hoc statistical tool with an ad-hoc (but often quite interesting) justification, you never know if someone else will come up with an even more clever tool tomorrow. But when you can directly use a calculation that mirrors the Bayesian law, you're done - like managing to put a Carnot heat engine into your car. It is, as the saying goes, "Bayes-optimal".
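As an aside on the quoted passage: the claimed correspondence between regularized linear regression and a Gaussian prior over the weights can be checked numerically. A sketch with made-up data, comparing the closed-form ridge solution against a MAP estimate found by gradient descent on the negative log posterior:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))   # made-up design matrix
y = rng.normal(size=20)        # made-up observations
tau2 = 2.0                     # assumed prior variance on each weight
lam = 1.0 / tau2               # the corresponding ridge penalty

# Closed-form ridge solution: argmin ||Xw - y||^2 + lam * ||w||^2
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)

# MAP estimate under a unit-variance Gaussian likelihood and a
# zero-mean Gaussian prior on each weight (variance tau2), found by
# gradient descent on the (scaled) negative log posterior:
w = np.zeros(3)
for _ in range(5000):
    grad = X.T @ (X @ w - y) + lam * w
    w -= 0.01 * grad

print(np.allclose(w, w_ridge, atol=1e-6))  # the two estimates coincide
```

Both procedures solve the same normal equations, which is exactly the point Yudkowsky is making: the “ad-hoc” regularizer turns out to encode a prior.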
The power in that mindset, I would say, is that you can no longer just believe in something or think something and just automatically assume that it’s correct. Instead, you are forced to constantly question and evaluate your thought processes: is this the kind of an inference that would actually cause me to have true beliefs?
As for your question of how you actually use it… in my original comment I gave some examples of ways to check your reasoning processes by checking whether they follow Bayes’ rule. There are a bunch of other LW articles that apply either Bayes’ rule or the more general mindset of lawful reasoning. It feels a little rude to throw a dozen links at someone in a conversation like this, but since you asked for examples, some semi-randomly picked ones: Absence of Evidence is Evidence of Absence, Conservation of Expected Evidence, Update Yourself Incrementally, One Argument Against An Army, What is Evidence?, What Evidence Filtered Evidence?, and Privileging the Hypothesis (I’ll single out “What is Evidence?” as one that I particularly like, and “Privileging the Hypothesis” points out a fallacy that I often started to realize I was guilty of, after I read the article).
I wouldn’t agree with your characterization of Bayesianism - at least in its LW version - offering certainty, however. Yes, it talks about laws that might lead us to the truth if we follow them… but it also makes a strong attack on various ways to rationalize pleasant beliefs to yourself, and undermines the implicit notion of “I can just believe whatever I want” that many people subconsciously have, even if they wouldn’t admit it out loud.
This undermining happened to me - for example, I used to have beliefs that I didn’t want to talk about in public because I knew that I couldn’t defend them, but reading LW for sufficiently long made me actually internalize the notion that if I can’t defend it, I don’t have any reason for believing in it myself. That might sound obvious, but it is a lot easier said than done.
The message I get out of LW-style Bayesianism, and what a lot of people seem to get out of it, is rather one of massive uncertainty - that we cannot know anything for sure, and that even if we make our best effort, nothing requires the universe to play fair and not screw us over anyway… certainly to me, it feels like reading LW made me become far, far less certain in anything.
As for your question of “where did the non-probability parts of rationality go” - well, I’ve only been discussing the probability parts of rationality because those were the topic of your original post. Certainly there’s a lot of other rationality discussion on LW (and CFAR), too. Though LW has traditionally been most focused on epistemic rather than instrumental rationality, and probability is a core part of epistemic rationality. I gather that CFAR is more focused on instrumental rationality than LW is. I would assume that this “Checklist of Rationality Habits” on CFAR’s website would be more representative of the general kind of stuff they do.
Also, here’s the schedule of a May CFAR workshop, which only has Bayes stuff as one part of the curriculum.
The core of LW Bayesianism
Thanks for the CFAR link! I was clearly just wrong about that.
Glad I could be of help!
I can argue this, but it would require extensive steelmanning, because (so far as I can tell) LW doesn't make the claim specific enough (and presents little if any empirical evidence).
If I did that—which would take months of work—would it change many peoples' minds? (Genuine question; I genuinely wonder whether it's worth doing.)
I guess that would depend on what you’d count as changing people’s minds? What exactly is the position that you find harmful and wish to subvert, here?
As you’ve noted, the core technical argument for Bayesianism isn’t really made properly explicit in the Sequences. That being the case, it makes me dubious about whether a detailed refutation of the technical argument would do much to change people’s minds, since the persuasiveness of the argument never came from there in the first place. Rather, I think that the thing that has made the Sequences so attractive to people are the small practical insights. For example, if I were to briefly summarize the insights that made me link to the posts that I did in my earlier comment:
- Absence of Evidence is Evidence of Absence: What the name says.
- Conservation of Expected Evidence: If you would interpret a piece of evidence to be in favor of your hypothesis, you should interpret evidence that’s in the opposite direction as being contrary evidence for the hypothesis; you can only seek evidence to test a theory instead of confirming it.
- Update Yourself Incrementally: Hypotheses aren’t all-or-nothing: it’s fine to be aware of contrary evidence to a hypothesis and still hold that hypothesis, as long as the contrary evidence isn’t too strong.
- One Argument Against An Army: You should always make sure that you aren’t selectively double-counting some of your evidence.
- What Is Evidence?: Evidence is an event entangled by links of cause and effect with something that you want to know about; for an event to be evidence about a target of inquiry, it has to happen differently depending on the state of the target.
- What Evidence Filtered Evidence?: If you’re expecting the possibility of your source filtering away some of the evidence that would go against their conclusion, you shouldn’t trust their conclusion too strongly.
- Privileging the Hypothesis: A hypothesis has to already have considerable evidence in its support before you should even seriously consider it; it’s no use to ask ”but there’s no strong evidence against it, right?” if there’s no strong evidence in favor of it.
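Conservation of Expected Evidence, in particular, is just an identity of probability theory: your expected posterior, averaged over the possible observations, must equal your prior. A quick numeric check (the probabilities are made up):

```python
p_h = 0.3       # prior P(H) -- made-up numbers throughout
p_e_h = 0.9     # P(E | H)
p_e_nh = 0.2    # P(E | not H)

p_e = p_e_h * p_h + p_e_nh * (1 - p_h)           # P(E)
post_if_e = p_e_h * p_h / p_e                    # P(H | E)
post_if_not_e = (1 - p_e_h) * p_h / (1 - p_e)    # P(H | not E)

# Averaging the posterior over whether E occurs recovers the prior:
expected_posterior = post_if_e * p_e + post_if_not_e * (1 - p_e)
print(expected_posterior)  # ~0.3, equal to p_h
```

So if seeing E would raise your confidence in H, not seeing E must lower it; you cannot arrange to gain confidence no matter what you observe.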
Now the argument that you just made against Bayesianism – of the tough part being the selection of the space of evidence and hypotheses and actions – sounds reasonable and probably correct to me. Does that argument shake my confidence in any of the insights I summarized above? Well, since I should update myself incrementally ;-), I need to admit that this is something that I should count as evidence against any insights derived from a Bayesian framework… but it doesn’t feel like very strong evidence against them. The content of these posts still seems generally right, supporting the case for Bayesianism – and it intuitively seems like, even if we cannot yet figure out the answers to all the hard questions that you outlined, the fact that this line of reasoning has provided a coherent unifying framework for all these (obvious in retrospect) ideas suggests that the truth lies at least roughly in the same direction as Bayesianism. I would expect that, to the extent that it effectively arrives at the truth, my brain still works along Bayesian lines - even if I don’t know exactly how it does that.
I’m reminded of Scott’s notion about people having different standards for what counts as a “trivial” problem for a philosophical theory. To many LW users (me included), there exists an immense number of small practical insights in the Sequences, ones which seem obvious in retrospect but which might have taken us a long while to think of ourselves: and a large part of those insights are presented in the context of a unifying Bayesian framework. So once you point out that there are deep unsolved - perhaps even unsolvable in principle - problems with formally applying Bayesian methods… then this particular technical failure of the framework, when that framework has provided plenty of practical successes, feels likely to register as a “trivial” problem.
I would expect that if you wanted to actually de-convince people on LW of Bayesianism, it wouldn’t be enough to just show the problems that it has – you’d also need to answer the question of “if not Bayesianism, then what?”. Even if your argument was successful in reducing people’s confidence in Bayesianism being correct, that still doesn’t do much if they don’t have any plausible alternative hypotheses.
As an aside, I suspect that part of the reason why many people found the Sequences so convincing, and why many other people don’t find them particularly insightful, has to do with the way they are (were) read. The posts were originally written over an extended period, and many of us began reading them as a bit of interesting entertainment that would pop up in our RSS feeds once a day. In order to properly learn a concept, you need to encounter it several times with slight variations, and the message being spread out over many posts was originally helpful in this respect: it was absorbed over several days of reading, and thus internalized better than if there had been just one clear, to-the-point post that you read once and then forgot.
Compare reading the Sequences over several years with reading, over a couple of days, a book that had all of those same insights expressed in a more concise way. One might be quite impressed with the book, but with there being so much information packed into such a short time, people would end up just forgetting most of it within a few days or weeks. The Sequences, in contrast, offered a series of posts examining a particular mindset from slightly different angles, keeping a few ideas at a time active in the reader’s mind as the reader went on with her daily life. That gave the reader a much better opportunity to actually notice it when she encountered something related to those ideas in her life, making her remember it and all the related ideas better.
But now that nobody is reading the posts at a one-per-day rate anymore, the style and format seem harmful to getting the message across. When you’re reading through a (huge) archived sequence of posts, unnecessary fluff just creates a feeling of things having been written in a needlessly wordy way.
Yes; that's a very good thing. But is the LW approach the best way to bring about that sort of questioning? There are many other pedagogical approaches available (e.g. "critical thinking" in the humanities, or just getting a decent general STEM education).
Well, the LW approach has clearly worked for some people. For others, other kinds of approaches are more effective. As far as I can tell, CFAR’s agenda is to experiment with different kinds of approaches and figure out the ones that are the most effective for the largest fraction of the populace.
Empirically, LW seems to lead people into metaphysical speculation and obsession with peculiar unlikely future scenarios.
I would expect that a large part of the metaphysical speculation on LW is due to the LW approach appealing to the kinds of people who already enjoy abstract metaphysical speculation. As for the peculiar unlikely future scenarios… well, as someone who found the AI risk argument a major and serious one even before LW existed, I cannot consider it a bad thing if more people become aware of it and give it serious attention. :-)
Yes... What's interesting is a kind of dual motion: uncertainty at the object level, and certainty at the meta level. LW seems to go out of its way to provoke anxiety about factual uncertainty ("maybe evil computers are about to take over the world!"), which it then relieves at the meta level ("but we know the eternal, ultimate principles and methods that allow us to devise an optimal response strategy!").
This seems to me relevantly similar to the emotional appeal of salvation religions: "you will almost certainly go to hell! except that we have the magic principles and methods that guarantee eternal bliss instead."
That’s an interesting observation. While people have made religious comparisons of LW before, I’m not sure if I’ve seen it phrased in quite this way earlier.
The evolution of LW
Glad to be of help, again. :)
Your tentative model sounds roughly correct. I’m not sure exactly how much of the quasi-religion is even present in practice: while it’s clearly there in the original Sequences, I haven’t observed very much of it in the discussions on the site.
I would say that LW is already evolving in the direction you described. For example, looking at this year’s “Promoted” articles, only 4 out of 43 are by Eliezer, and those four are all either summaries of math papers or, in one case, an advertisement of MetaMed. And like I already mentioned, I don’t get a very strong “magic” vibe from the actual discussions in general. The only exception I can think of are some of the solstice meetup reports.
My impression is also that CFAR is very much what you described as LW 2.0, but I’m again not very familiar with them, as they’re basically focused on doing things in real life and have been rather US-centric so far, while I’m here in Europe.
I thought you might appreciate some additional comments; for background, I’m the guy that wrote LW’s introduction to Ron Howard-style decision analysis. LWers do seem to appreciate discussion of the technical bits, though I don’t think everyone follows them.
As someone who understands all the technical detail, I agree with you that the quasi-religious aspects of LW are troubling. But I think a lot of that is simply that it’s more fun for EY to talk that way, and so he talks that way, and I don’t think that it’s a significant part of the LW culture.
I think the actual quantitative use of Bayes is not that important for most people. I think the qualitative use of Bayes is very important, but is hard to discuss, and I don’t think anyone on LW has found a really good way to do that yet. (CFAR is trying, but hasn’t come up with anything spectacular so far to the best of my knowledge.)
I think that tools are best when they make the hard parts hard and the easy parts easy, which I think Bayes is good at doing and non-Bayesian tools are bad at doing. With Bayes, coming up with a prior is hard (as it should be) and coming up with a decision rule for the posterior is hard (as it should be). With, say, a null hypothesis test with p=.05, the prior and decision rule are both selected by default. Are they appropriate? Maybe, maybe not - but such discussions are entirely sidestepped without Bayes, and someone can go through the motions without realizing that what they’re doing doesn’t make sense. (This xkcd strip comes to mind. Yes, they used the wrong test - but is it obvious that they did so?)
For instance, I don't believe there is any physicist who deduced that he should use a child's car seat (even without numbers). You do that because everyone else does, because it's legally required, and because it is an obviously good idea, based on your own felt experience with seatbelts and cars stopping suddenly.
You might be interested in this anecdote, in which a physicist explains to a mother what deceleration is like. The relevant bit:
“But I’m holding on to him!”, the woman started saying, but by the time she made it to the end of her sentence, she was full-on screaming, completely losing her cool. “Nothing bad is going to happen to him!!” “Right, please hear me out,” Syd got them back on track. “Say that you go from 30 miles per hour to zero in the space of about a foot and a half, right? That is an acceleration of about 20 times the earth’s gravity. That means that your 1-stone baby would go from 1 stone of weight in your arms, to about 20 stone moving away from you.” The woman just stared at Syd, but her eyes showed that she was trying to envision holding on to a 20-stone baby. “I guess what I am asking: Would you be able to hold a 20-stone item the size of a large watermelon in your arms in the middle of a car crash?” “I…”, she said. “Be honest.” he said. “No, you wouldn’t. You wouldn’t have a chance. Which means that if you guys had been in a crash, your baby would have gone flying straight into the windshield. There, he wouldn’t have had the benefit of being slowed down gradually: Instead of being slowed down by the seat belt over a distance of about a foot and a half, he would be brought to a stop in an inch or less. Then…” Syd was on a roll, and I could tell that he was about to launch into a further explanation. He was so caught up in his own discussion, that he hadn’t seen how pale the woman had gotten. I shook my head at him. He spotted me, and stopped his train of thought. “You get the picture; I don’t need to explain. Suffice to say that it’s unlikely that your baby would survive an impact like that.” The woman went even more pale, and was hugging her child closely. She looked, for a moment, as if she might keel over, but her husband stepped in and put an arm around her.
Hey David, you have quite a lot of thought-provoking stuff here on this website. I found a link to this article while browsing LessWrong, and, after seeing you display a high level of sanity in the comments, have read through most of your “book contents”.
Unfortunately, I can’t quite seem to make sense of your core ideas and I’d appreciate you clarifying a few points.
- What is your definition of the word "meaning" that you use throughout the book? I have tried substituting both my own intuitive definition and the dictionary one - both substitutions seem to result in nonsense.
- Are stances (within one specific dimension) discrete categories of thinking that each person falls squarely in, or are the two confused stances bounds of the worldview distribution, with most people being in-between? If it's the continuous case, is the complete stance just a roughly 50% A 50% B mix? You describe the completes like they are transcendent with respect to confuseds, not merely a partway point. It seems to me that the discrete case is what you believe in, but the continuous one is what fits my real-world experiences the best.
- You go in detail on each of the complete stances, but are they the only/the only true compromises between the confuseds? In most dimensions, I partially agree with/compromise between/partially accept both confuseds, and yet my worldviews in those dimensions look nothing like your descriptions of the complete stances.
- Are your views falsifiable? Can you think up and tell us a few specific examples of evidence that, if observed, would cause you to abandon your whole worldview of "meaningness"?
- The whole worldview kind of looks like yet another attempt at the Answer to Life, the Universe, and Everything, but you said you don't like those. What am I missing here that makes it special?
This discussion strikes home for me
As someone who lurks on LW without a background in classical statistics, I thought you might find my experiences helpful.
The “small, practical insights” are great! Unfortunately they are often found within walls of not-small text, intermixed with personal narrative and quasi-religious language. It all seems a bit cultish, which Eliezer obviously realizes and pokes fun at. However, the feeling remains that he hasn’t engaged the wider scientific community and defended his beliefs.
The lack of discussion of non-bayesian methods is especially troubling. I rarely use much statistics in my work (software engineering); when I do I look up specific tools which apply to the task at hand, use them, then largely forget about them. I just don’t use statistics often enough to stay skilled in it. So when I encountered LW I thought “oh neat, they’ve found a simpler, more general theory of probability I can more easily remember”.
So I started fiddling with and thinking about Bayes’ Rule, and it seemed like you could infer some important insights directly from it. Things like “absence of evidence is evidence of absence” (which we already felt, but its nice to know), and how priors influence our beliefs as new evidence is acquired.
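For instance, the “absence of evidence” point can be checked numerically: if D would be likely under H, then failing to observe D must lower the probability of H. A quick sketch with invented numbers:

```python
# "Absence of evidence is evidence of absence", checked with Bayes' Rule.
# All numbers here are invented purely for illustration.
p_h = 0.5          # prior P(H)
p_d_given_h = 0.8  # P(D | H): if H is true, we'd probably see evidence D
p_d_given_not_h = 0.3

# Probabilities of NOT observing D, by complement
p_nd_given_h = 1 - p_d_given_h          # 0.2
p_nd_given_not_h = 1 - p_d_given_not_h  # 0.7

# Bayes' Rule after observing the absence of D
p_nd = p_nd_given_h * p_h + p_nd_given_not_h * (1 - p_h)  # total probability
posterior = p_nd_given_h * p_h / p_nd

print(posterior)  # about 0.22, down from the 0.5 prior
```

The stronger the expected evidence (the higher P(D | H)), the more its absence counts against H.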
Useful stuff, I thought. But without any contrast to classical methods, I don’t know how unique or even valid these insights are.
When I need to Get Something Done that requires I learn a concept in statistics, decision theory or rationality (this has happened all of twice), I will always consult wikipedia and its peer-reviewed sources before LW. LW feels good for provoking thought and discussion, but not actually getting stuff done. In the Hansonian sense it feels “far”, while sources such as wikipedia are “near”.
Analogical inference as a possible alternative to Bayes
For years I’ve been extremely sceptical of Yudkowsky’s claim that Bayesianism is some sort of ultimate foundation of rationality.
I think you got to the heart of the matter when you pointed out that Bayes cannot define the space of hypotheses in the first place; it only works once a set of pre-defined concepts is assumed. As you state:
“The universe doesn’t come pre-parsed with those. Choosing the vocabulary in which to formulate evidence, hypotheses, and actions is most of the work of understanding something.”
What you describe is the task of knowledge representation, or categorization, which is closely related to the generation of analogies, and is PRIOR to any probabilistic calculations. Now it may turn out to be the case that these things can be entirely defined in Bayesian terms, but there is no reason for believing this, and every reason for disbelieving it. Some years ago, on a list called the everything-list, I argued the case against Bayes and suggested that analogical inference may turn out to be a more general framework for science, of which Bayes will only be a special case.
Here’s the link to my arguments:
In my summing up, I listed ‘5 big problems with Bayes’ and pointed out some preliminary evidence that my suggested alternative (analogical inference) might be able to solve these problems. Here was my summary:
(1) Bayes can’t handle mathematical reasoning, and especially, it can’t deal with Godel undecidables.
(2) Bayes has a problem of different priors and models.
(3) Formalizations of Occam’s razor are uncomputable and approximations don’t scale.
(4) Most of the work of science is knowledge representation, not prediction, and knowledge representation is primary to prediction.
(5) The type of pure math that Bayesian inference resembles (functions/relations) is lower down the math hierarchy than that of analogical reasoning.
For each point, there’s some evidence that analogical inference can handle the problem:
(1) Analogical reasoning can engage in mathematical reasoning and bypass Godel (see Hofstadter; Godelian reasoning is analogical).
(2) Analogical reasoning can produce priors, by biasing the mind in the right direction by generating categories which simplify (see analogy as categorization).
(3) Analogical reasoning does not depend on huge amounts of data, thus it does not suffer from uncomputability.
(4) Analogical reasoning naturally deals with knowledge representation (analogies are categories).
(5) The fact that analogical reasoning closely resembles category theory, the deepest form of math, suggests it’s the deepest form of inference.
I just watched Galef’s brief video, and I must say the point of her talk seemed to me to be something that you have apparently totally missed. What she is talking about is a formal but flawed mode of reasoning used frequently by scientists, and something that happens all the time when people use informal reasoning. It’s called the base-rate fallacy, it’s crappy reasoning, and when you know what Bayes’ theorem means, it’s obviously wrong: P(H | D) is not the same as P(D | H). This is what her example about being or not being a good driver was about - it’s not enough that the hypothesis fits the data; you must look at how other hypotheses fit the data also.
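To make the P(H | D) vs. P(D | H) distinction concrete, here is the classic rare-condition version of the fallacy in code (all numbers invented for illustration):

```python
# Base-rate fallacy: P(H | D) can be tiny even when P(D | H) is large.
# Invented numbers: a condition with a 1% base rate, a test with a 90%
# true-positive rate and a 10% false-positive rate.
base_rate = 0.01
p_pos_given_cond = 0.90     # P(D | H)
p_pos_given_healthy = 0.10  # P(D | not-H)

# Total probability of a positive test
p_pos = p_pos_given_cond * base_rate + p_pos_given_healthy * (1 - base_rate)

# Bayes' theorem: P(H | D)
p_cond_given_pos = p_pos_given_cond * base_rate / p_pos

print(p_cond_given_pos)  # about 0.083, nowhere near 0.9
```

The hypothesis “fits the data” with probability 0.9, yet is almost certainly false, because the alternative hypothesis (healthy, false positive) fits the data often enough and is far more common.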
For your convenience, regarding the base-rate fallacy, see my brief article
and linked material.
(By the way, the commenter above me, Marc, complains that Bayes’ theorem doesn’t specify a hypothesis space. This is correct; it’s called theory-ladenness, and it’s just something we have to live with - no other procedure can provide a hypothesis space either. To complain about not having the hypothesis space laid out in detail is to complain about not being omniscient. If there were some unique, true hypothesis space, attainable by some procedure, what would it look like? Why would it contain any hypotheses beyond the one true proposition? Where would this miraculous information come from?)
Unconvinced of the harm
I agree with the pedagogical issues with teaching Bayes, and the issue of the worship of the all mighty Bayes rule.
However, you mention that pop Bayesianism might do more harm than good (beyond the religiosity), and I am not convinced. The only evidence you have given is that you have observed some Bayesians being very confident in beliefs that you think they shouldn’t be confident in, though you don’t tell us which beliefs those are - I would like to hear some examples. And are these beliefs associated with the meme cluster that Bayesianism tends to be attached to (LW things like FAI, cryonics, etc.)?
By the way, here is a brief summary of the Bayes unit at CFAR. The basic tool is to use how surprised you would be by certain odds to approximate numerical probabilities based on your beliefs. You use this to generate an approximate numerical prior in the form of odds, and a likelihood ratio, then multiply to get an order-of-magnitude approximation of the posterior. I find the most useful part to be the calculation of the likelihood ratio.
Would I be surprised if he got in a life-threatening car accident 1 time for every 1,000,000 times he goes out with friends? Yes? How about 5:1M? Yes? 20:1M? Not really. Let’s go with 10:1M (you could probably use actual stats for this). Now, would I be surprised if he got in an accident 1 time for every 10,000 times? Yes. [etc.] Ok, let’s go with 30:1M. So a good prior would be about 20:1M.
Now this probably isn’t the greatest example since the numbers are so large, but I think that the likelihood ratios are the interesting part:
Now let’s say he got in a car accident. How often would he fail to call by 11PM? Essentially 1/1. Now let’s say he didn't get in an accident. How often would he fail to call by 11PM? [...uses above method for finding a probability...] About 3/10; he is a little disorganized and forgetful. So our likelihood ratio is 10:3, or about 3:1. So he is only 3 times more likely to have gotten in a fatal accident if he fails to call, for a 60:1M chance, or 0.006%. I should stop my unreasonable worrying.
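In code, the odds-form update described in the quote is just one multiplication; a sketch using the quote’s numbers:

```python
# Odds-form Bayes update: posterior odds = prior odds * likelihood ratio.
# Numbers taken from the worked example above.
prior_odds = 20 / 1_000_000  # ~20:1M prior odds of a serious accident
likelihood_ratio = 3         # failing to call is ~3x likelier given an accident

posterior_odds = prior_odds * likelihood_ratio      # 60:1M
posterior_prob = posterior_odds / (1 + posterior_odds)

print(round(posterior_odds * 1_000_000))  # 60, i.e. 60:1M
print(round(posterior_prob * 100, 4))     # roughly 0.006 (percent)
```

This is why the likelihood ratio is the interesting part: once you have it, updating on a new observation requires no further reference to the evidence.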
Anyways, as some other people have mentioned, I find that the most useful part of Bayes is the qualitative stuff. It also tells you why you should update in the ways that seem intuitively reasonable.
The rest of the video goes on to say that Bayesianism boils down to “don’t be so sure of your beliefs; be less sure when you see contradictory evidence.” Now that is just common sense. Why does anyone need to be told this? And how does the formula help?
What do you mean by common sense? And I think you are being a little bit optimistic. These things (and the other such things mentioned in your article) seem like common sense upon reflection, but you probably won’t notice your brain violating those rules anyways.
I'm actually more confident about FAI and the Singularity than LW folks. I think their probability in the next 40 years is < 0.1, whereas the average LW person might say 0.5.
The 2011 Less Wrong Survey had a question for what people thought was the most likely date of the Singularity. “The mean for the Singularity question is useless because of the very high numbers some people put in, but the median was 2080 (quartiles 2050, 2080, 2150).” The results page of the 2012 survey doesn’t mention the Singularity dates, but downloading the raw data, it looks like that year’s quartiles were 2045, 2060, 2080.
It might also be relevant to mention the results for the “Which disaster do you think is most likely to wipe out greater than 90% of humanity before the year 2100?” question. In 2011, “Unfriendly AI” got 16.5% of the votes and came out second (behind bioengineered pandemics at 17.8%). In 2012, it got 13.5% of the votes and came out third (behind bioengineered pandemics at 23% and environmental collapse at 14.5%).
What I find particularly weird
What I find particularly weird about this LW obsession with the singularity is that I haven’t seen on LW any evidence whatsoever that people are getting results that bring them closer and closer to a singularity.
This is markedly different
This is markedly different from the LessWrong survey results you mention. I find that striking; but I've no idea what it implies about LessWrong (or AI). Any thoughts about that?
No, not really. I could try to come up with a hypothesis, but I’d just be making stuff up. :-)
Incidentally, the reason why I personally find AI in 50 years to be quite plausible is because of the progress we’re making in neuroscience. We’ve already successfully created brain prostheses that replicate hippocampal and cerebellar function in rats, and there are claims that we wouldn’t need any big conceptual breakthroughs to implement whole brain emulation, just gradual improvement of already existing techniques. One distinguished computational neuroscientist was also willing to go on record about us not needing any huge conceptual breakthroughs for creating neuroprostheses that would mimic human cortical function, and co-authored a paper about that and its consequences with me.
If we’re already this far now, it wouldn’t seem at all implausible that we’d have reverse-engineered the building blocks of intelligence within the next 50 years. Though obviously that’s also what the AI gurus of 50 years ago thought, so I wouldn’t say that “AI within 50 years” is certain, just plausible.
I agree that often
I agree that often Bayesianism is not about Bayes. Still, I think you underestimate the value of the framework of thought. Personally, I found Conservation of Expected Evidence to be a useful idea that was not obvious to me in advance. Similarly, I never realized how important priors were in making predictions. Finally, I think the Bayesian approach lends itself very easily to considering several different possibilities at once. These skills/ideas can be taught outside a Bayesian framework, but I don’t see any compelling reason to avoid it. While the ideas might not be exclusive to Bayes, they still deserve to be promoted. And since these ideas are implicit in the theorem, I don’t mind that the theorem is the focus of promoting these ideas.
I do think that Bayes is often used as an intimidation tactic or shibboleth. But that isn’t the theorem’s fault; no matter what idea was used in its place, similar events would occur. Criticizing the way people deploy Bayesianism is fine, but as we both agree, much use of Bayesianism is not about Bayes - so even if there are bad ideas in the area, “true” Bayesianism still seems like a good thing to support.
Above, you claimed that a
Above, you claimed that a major problem with Bayesianism is that the universe does not come prepackaged with hypotheses. However, human brains are born with intuitions about causality and inference, which I think suffices as a starting point for Bayesianism or something closely akin to it, at least once badly conflicting intuitions are recognized and repaired through reflective equilibrium. I do not think any epistemic framework can avoid relying on human inferences, so I do not see why you think it is a problem with Bayesianism that the universe does not hand us hypotheses or reference classes or objective priors.
Sorry to dig up this old post - again there’s a topic here that’s been in my head a lot recently, on and off. I’ve tried not to make this too ranty; I really like the LW people but have a similar fascination/frustration dynamic to you.
In the comments here you write:
Part of LessWrong 2.0 could be a new presentation of the small practical insights that didn’t involve the big, wrong claims about Bayesianism.
I think LW has already made some nice steps towards this under the heading of ‘instrumental rationality’. The bit that interests me most is that that side of LW is often very insightful on the kinds of preverbal/emotional content that ideas have associated with them: there’s a pretty sophisticated vocabulary of ‘ugh fields’ and ‘affective death spirals’, and Alicorn wrote a whole sequence of posts on introspection and trying to pick apart the background affective tone you associate with beliefs, with the intention of maybe changing them (or in Silicon Valley terms ‘hacking yourself to liking different stuff’).
This is very close to the territory I’m going over with my mathematical intuition obsession, so it’s helpful for me to read. And they write in a language I can understand! Normally I’m worried that I’m going to have to go and read reams of obscure text by continental phenomenologists or something, so I definitely appreciate this.
CfAR seems to do even more of this, but I think they’re also increasingly tightly coupled to MIRI/x-risk… not really sure what’s going on there.
I really don’t understand how this side of LW fits with the side that’s obsessed with formalising everything in some improbably narrow mathematical framework. They seem completely at odds. And maybe I’m being unfair but it always seems to have the feeling on LW of being a kind of sideshow, like it’s being treated as a set of interesting soft questions for the less technically minded, while the grown-ups get on with the serious business of trying to make ethical statements take values in the real numbers or whatever. I’m convinced it’s the good bit though!
LW and CfAR
Thanks for the reply! It’s true that I’m missing a lot of context. I’ve hung round the edges of ‘rationalist-adjacent tumblr’ quite a bit, but that’s really its own separate thing, and geographically the interesting stuff is mostly going on thousands of miles from me. I’m fascinated by what I can glean online, though - it’s one of those things where some parts of the community really appeal to me and some really don’t, so I just keep coming back and poking around some more.
That’s interesting that CfAR stopped the Bayesian part of their curriculum - I was still under the impression that that was important to them. Some details of their new content have been published on the LW discussion pages, though, and ‘mostly psychological techniques for getting past personal limiting assumptions’ is probably accurate. It looks good to me!
I guess what I was trying to get at is that the new CfAR-type-stuff seems to point to quite a complex, messy account of how we learn new things and change our minds, whereas old-school Less Wrong seemed to be obsessed with manipulating clean formal ‘representations in the head’. I’m pretty sure that in at least some cases these are the same people (CfAR and MIRI are still closely linked, and the MIRI forum still looks like ‘old school Less Wrong’ to me), so I’m just interested in how they resolve the tension!
I really don't understand how this side of LW fits with the side that's obsessed with formalising everything in some improbably narrow mathematical framework. They seem completely at odds.
Not sure exactly which parts of LW you are referring to when you’re talking about “formalizing everything in math”, but for some parts (e.g. anything to do with decision theory) at least, the answer is that it’s the LW/MIRI flavor of AI research. It’s meant to be something that you use for thinking about how to build an AI; it’s not meant to be a practical guide to life any more than a theoretical physicist’s work on string theory is meant to help him with his more mundane concerns.
I’m sure you realize that if you’re curious about CFAR, they do run workshops you could attend, right? ;) If the cost is the deciding factor, it’s negotiable. (It would probably be gauche to elaborate on the exact deal that I got, but let’s just say that I was pretty penniless when I contacted them about attending but we came to a mutually satisfactory arrangement anyway.)
Thanks for replying, and sorry, I know I was being vague. Yes, I’m talking about the general probability/logic/decision theory cluster that MIRI work within. This is still rather vague, but as I said in the comments to the recent SSC post I haven’t read any of their recent stuff and don’t know exactly what e.g. the logical induction paper is about. (I’d link to the specific comment but the spam filter here didn’t like that post; it’s under the same username.)
It’s meant to be something that you use for thinking about how to build an AI; it’s not meant to be a practical guide to life any more than a theoretical physicist’s work on string theory is meant to help him with his more mundane concerns.
My question is the other way round (I fleshed it out a bit more in that thread comment). Given the kind of high-level understanding of cognition we build up from things like instrumental rationality, what makes MIRI think that their strategy is a promising one for explaining it at an underlying level?
This is a genuine question where I’d like to know the answer; maybe I’d be convinced by it. FeepingCreature on the SSC thread said, essentially, ‘It’s something where we know how to do calculations, so we can come up with toy models’. That’s fine, but I’m interested in if there are any deeper reasons.
Sorry, my answer was not quite right. It’s not that MIRI is using logical approaches to figuring out how to build an AI. Rather, they are using logical approaches to figure out what we would want our AI to do.
A slightly analogous, established form of logic use can be found in the design of concurrent systems. As you may know, it’s surprisingly difficult to design software that has multiple concurrent processes manipulating the same data. You typically either screw up by letting the processes edit the same data at the same time or in the wrong order, or by having them wait for each other forever. (If not previously familiar, Google “dining philosophers problem” for a simple illustration.)
So to help reason more clearly about this kind of thing, people developed different forms of temporal logic that let them express in a maximally unambiguous form different desiderata that they have for the system. Temporal logic lets you express statements that say things like “if a process wants to have access to some resource, it will eventually enter a state where it has access to that resource”. You can then use temporal logic to figure out how exactly you want your system to behave, in order for it to do the things you want it to do and not run into any problems.
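For example, the “eventually gets access” property just described is a standard liveness formula in linear temporal logic (the predicate names here are illustrative, not from any particular tool):

```latex
\Box \left( \mathit{wants}_i \rightarrow \Diamond\, \mathit{has}_i \right)
```

Read $\Box$ as “always” and $\Diamond$ as “eventually”: at every point in time, if process $i$ wants the resource, then at some later point it has it. Model checkers can verify such formulas against a design before any implementation code is written.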
Building a logical model of how you want your system to behave is not the same thing as building the system. The logic only addresses one set of desiderata: there are many others it doesn’t address at all, like what you want the UI to be like and how to make the system efficient in terms of memory and processor use. It’s a model that you can use for a specific subset of your constraints, both for checking whether the finished system meets those constraints, and for building a system so that it’s maximally easy for it to meet those constraints. Although the model is not a whole solution, having the model at hand before you start writing all the concurrency code is going to make things a lot easier for you than if you didn’t have any clear idea of how you wanted the concurrent parts to work and were just winging it as you went.
So getting back to MIRI, they are basically trying to do something similar. Their work on decision theory, for instance, is not so much asking “how do we want the AI to make its decisions”, but rather “what do we think that ‘making the right decisions’ means, and what kinds of constraints do we think that includes”. Not asking “how do we make an AI generally intelligent”, but rather “if we did have a generally intelligent AI, what would we want it to do in the first place”, and then pinning those desiderata down in sufficiently precise terms so as to make them unambiguous.
As David correctly points out, mathematical logic is an unusable basis for building an intelligent system. But “how do we build an intelligent system” is a different question from what MIRI is asking - they are asking “what would it even mean for an AI system to be aligned with our values”. And they are arguing - I think convincingly - that if you start building an AI system without any clue of what constraints its design should fulfill, you are very unlikely to get them right, much as someone who starts coding up a concurrent system without any formal understanding of the properties it should have is going to end up with a horribly dysfunctional mess on their hands.
Or to take the climate change analogy. Rather than being the guys who revive phlogiston theory and then try to apply it to global warming, MIRI is more like the economists who apply economic theory to the question of “given that we’re going to have global warming, how will it affect our economy, and which of the proposed methods for dealing with it would be best in economic terms” (only to be dismissed by climate researchers who say that economic models are completely unsuitable for making predictions about the global weather system, which is completely correct but also completely beside the point).
Kaj Sotala: Thanks, yes that is a useful distinction! I’ll have to think more about how much it helps answer my question, but it definitely makes things a bit clearer for me.
More on this subject, please
Excellent summary. Another perspective on the potential harm of this meme: religious Bayesians are easy to manipulate by massaging their first impression of a subject, since the practice enshrines initial bias by allowing only a simplistic and predictable family of recalibrations.
As someone who is part of this “community”, there is another term that is sometimes used, I believe coined by Scott Alexander: “x-rationality.” I like this better. You are focusing on the Bayesian part, which is admittedly a particular obsession of Yudkowsky. Yudkowsky is eccentric, and the cult issues have been thoroughly discussed. LW has changed over time as more diverse thinkers have become more influential. While LW has been incredibly influential, it is not the whole of “x-rationalism”; it is a sort of intellectual sub-culture that includes things like Scott Alexander’s blogs.
I would say the main unifying principle is the desire to be “less wrong.”
I distinguish this from “vulgar rationality” in the sense that the idea is to improve one’s own rationality, not to “shoot fish in a barrel”, which is common to many online groups that define themselves as rationalist. There are common interests, but there need not be universal agreement. Good-faith conversations, understanding heuristics; there tends to be a preference for choice utilitarianism, but I don’t share it. Specific interests include AI, existential risk, and how to update and calibrate one’s beliefs.
I feel like you are taking a very superficial view of this, literally looking at Bayes’ equation and seeing that as the whole of what it’s about.
I am a member of two x-rationalist groups, and I literally can’t remember the last time someone discussed Bayes or Bayesian probability. For EY, it was a key insight that did produce a “religious” transformational experience.
What I think you also miss is that you look through the lens of your own model and see these things in primarily philosophical terms, and perhaps don’t see the “what’s it like inside the equation” subjective power of what you would call “eternalism” - which doesn’t necessarily have the philosophical content you ascribe to it.
I would recommend looking into the phenomenon of temporal lobe epileptic and sub-epileptic experiences and how they can induce “religious conversions” - that is, “religion” which fits neatly into neither the “choiceless” nor the “choice” mode. Its nature is more nebulous.
I think you have a lot of great insights, but you should perhaps apply the concept of “nebulosity” to your own models. They are useful modalities, but as you point out, neither absolute truth nor devoid of meaning. :)
I am sorry to see that your opinion about acupuncture is plain wrong. There is a very large research base supporting Traditional Chinese Medicine nowadays, and while some of it is obviously of low quality, some of what is available is of the highest order.
See for example
(many more articles on PubMed available of course)
The World Health Organisation:
And even Harvard Medical School teaches a course in acupuncture nowadays: