
> assigning probabilities to singular events is only meaningful and admissible at all if there is a good analytic explanation for the respective propensity.

Wait a minute, you are making a type error here: probabilities are not propensities. They're degrees of belief. (And even if you disagree in general, this is a Bayesian context you're talking about.)

If I put a die on a table and hide it with a cup, you could still estimate your probability distribution about which face is up. My probability distribution would obviously be very different, since I put the die in there myself. (Replace "probability" by "betting ratio" or "degrees of belief" if it makes more sense to you.)
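The hidden-die point can be made concrete with a toy sketch (the specific face is of course a hypothetical value): one physical fact, two coherent but different credence distributions, because the two observers have different information.

```python
# Two observers, one physical fact: the die under the cup shows some face.
# Their distributions differ because their knowledge differs, not the die.

outsider = {face: 1 / 6 for face in range(1, 7)}  # knows nothing: uniform

placed_face = 4  # the person who set the die knows the face (hypothetical)
placer = {face: 1.0 if face == placed_face else 0.0 for face in range(1, 7)}

# Both are coherent degrees of belief: each distribution sums to 1.
assert abs(sum(outsider.values()) - 1.0) < 1e-12
assert abs(sum(placer.values()) - 1.0) < 1e-12
```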

> The [probabilism] view does not have very strong foundations.

Read the first 2 chapters of Probability Theory: the Logic of Science, by E. T. Jaynes: "Plausible reasoning" and "The quantitative rules". It's very accessible, and you shall see how strong the foundations really are.

http://www.med.mcgill.ca/epidemiology/hanley/bios601/Gaussia...



No, I was not speaking from a Bayesian perspective; I was laying out the propensity-theoretic explanation of probability. The propensity explanation is one of several attempts to explain why singular events might be said to give rise to probabilities, alongside frequentism and Bayesianism. Another perspective worth mentioning is the logical approach, which is in the end purely combinatorial.

Some people think that you need to explain why a die can be fair, rather than just assuming it or only looking at it from a frequentist perspective. Of course, die-hard Bayesians don't think so, but that would be begging the question in the context of discussing criticisms of Bayesianism.

> Read the first 2 chapters of Probability Theory: the Logic of Science, by E. T. Jaynes: "Plausible reasoning" and "The quantitative rules". It's very accessible, and you shall see how strong the foundations really are.

I'm an expert on this topic. The only arguments for probabilism are Dutch book arguments, and there are a large number of arguments against these. See for example various articles by Hájek. Alternative representations of graded belief include, among others:

- plausibility theory (Halpern et al.)

- possibility theory (Dubois & Prade)

- Haas-Spohn ranking theory and variants thereof

- various notions of epistemic entrenchment

- Dempster-Shafer belief theory

- almost any quantitative or qualitative representation of belief in belief revision theory not covered by one of the above theories (e.g. belief update by Katsuno & Mendelzon)

- by a general logical connection, nonmonotonic logics and AAFs (abstract argumentation frameworks) can generally represent notions of belief update, such that the underlying qualitative ordering of states is a representation of graded belief

What you probably mean is that the above generalizations (or qualitative theories, in some cases) could be simulated with probabilities, e.g. by using convex sets of probabilities or what Jøsang is doing in his "subjective logic". That's true, but then we're no longer talking about probabilism in the sense I've used the word.

Of course, you can also try arguing for probabilism like Savage did: Lay out a set of postulates for your subjective plausibility that happen to allow you to prove that this notion of subjective plausibility is in the end probability. Despite the merits of such work, it is in the end a form of cheating (or "reverse engineering"), because you could just as well come up with plausible postulates that yield the weaker axioms of possibility theory.


> No, I was not speaking from a Bayesian perspective, I was laying out the propensity-theoretic explanation of probability.

Unless you can explain this "propensity" in terms of actual physical properties, propensity by itself is… unjustified. The only domain I know of so far where we could possibly argue propensities are a thing is quantum mechanics. And even then it seems to rest on an anthropic argument: which universe am I living in?

> Some people think that you need to explain why a die can be fair,

A die by itself is not fair, right? A die might be balanced, and the way it is thrown it might have enough unpredictable variability to cause everyone in the room to think "uniform distribution over [1..6]".

Likewise, a cryptographic pseudo random generator is unpredictable (and thus "fair") to anyone who doesn't know its internal state. Even though the process itself is deterministic, it's just not computationally feasible to guess its next output from observation of its past outputs. (Though for this one I'm relying on the fact we're not logically omniscient.)
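A toy counter-mode hash stream (an illustration only, not a vetted cryptographic construction) makes the determinism-versus-unpredictability point tangible:

```python
import hashlib

def toy_stream(seed: bytes, nbytes: int) -> bytes:
    """Counter-mode hash stream: fully deterministic given the seed."""
    out = b""
    counter = 0
    while len(out) < nbytes:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:nbytes]

# Same seed, same stream: the process is completely deterministic...
assert toy_stream(b"secret", 64) == toy_stream(b"secret", 64)
# ...yet to anyone without the seed, predicting the next block from past
# output would require breaking SHA-256 preimage resistance.
assert toy_stream(b"secret", 64) != toy_stream(b"other", 64)
```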

> I'm an expert on this topic.

Good. Then you know that any inference strategy that falls prey to Dutch Books is not rational. Right?
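For concreteness, here is a minimal Dutch book sketch with hypothetical numbers: an agent whose credences in A and not-A sum to more than 1 will accept, at its own fair prices, a pair of bets that guarantees a loss in every outcome.

```python
# Incoherent credences: P(A) = 0.6 and P(not-A) = 0.6 (sum is 1.2).
# A bookie sells two $1 bets at the agent's own stated fair prices.
p_a, p_not_a = 0.6, 0.6
cost = p_a + p_not_a  # the agent pays $1.20 up front

for a_is_true in (True, False):
    # Exactly one of the two bets pays out $1, whatever happens.
    payout = 1
    net = payout - cost
    assert abs(net - (-0.2)) < 1e-9  # sure loss of $0.20 in every outcome
```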

To be fair, probability theory is not computationally tractable. I did not verify, but I guess any feasible approximation is vulnerable to some more or less subtle Dutch Books.

Now the way you talk about Dutch Books sounds as if all the other strategies you mention are vulnerable, not just in practice but in theory as well. They are thus not perfectly rational. Do their authors at least have the grace to admit this is a flaw that should be corrected?

But then I suspect that correcting the flaw inevitably leads to probability theory itself: if you accept Jaynes's three "desiderata" as required for any kind of rational reasoning, then, as he shows, the result is necessarily equivalent to probability theory as we know it (where probabilities are subjective assessments of plausibility, otherwise known as "degrees of belief").

I can only conclude that you do not accept Jaynes's desiderata as necessary for correct inference. And this is the point where I look at you like you're not quite sane.

For reference, Jaynes's desiderata:

  (1) Degrees of plausibility are represented by real
      numbers. (And a continuity assumption.)

  (2) Qualitative correspondence with common sense.
      (explained in more detailed in the book)

  (3a) If a conclusion can be reasoned out in more than
       one way, then every possible way must lead to the
       same result.

  (3b) The robot always takes into account all of the
       evidence it has relevant to a question. It does
       not arbitrarily ignore some of the information,
       basing  its conclusions only on what remains. In
       other words, the robot is completely non
       ideological.

  (3c) The robot always represents equivalent states of
       knowledge by equivalent plausibility assignments.
       That is, if in two problems the robot’s state of
       knowledge is the same (except perhaps for the
       labeling of the propositions), then it must assign
       the same plausibilities in both.

Good luck convincing me (and I suspect, the majority of people, including frequentist statisticians) that we should reject any of these desiderata.
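Desideratum (3a), consistency, can be illustrated with hypothetical numbers: computing a joint plausibility along two different routes must yield the same answer.

```python
# A coherent joint assignment, with the joint computed two ways.
p_a = 0.3
p_b_given_a = 0.5
p_joint = p_a * p_b_given_a        # route 1: P(A) * P(B|A) = 0.15

p_b = 0.4
p_a_given_b = p_joint / p_b        # = 0.375, forced on us by coherence

# Route 2, P(B) * P(A|B), must agree with route 1 (desideratum 3a).
assert abs(p_a * p_b_given_a - p_b * p_a_given_b) < 1e-12
```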

I don't care that it's reverse engineering; those desiderata match the way I think. I accept the conclusion that probability theory is the correct (albeit intractable) way to think, because I ultimately agree with the postulates it rests on. Vehemently so. They're not just true, they're obvious.

If you don't accept them, then I can only give up, and remember what Yudkowsky once wrote: "How do you argue a rock into becoming a mind?"


> Good. Then you know that any inference strategy that falls prey to Dutch Books is not rational. Right?

Do you even have an idea what "rational" means? There are people who argue that having cyclic preferences is not only rational, but even sometimes the only rational representation of evaluations. I'm not one of these, but just wanted to mention that things are not as simple as you lay them out.

If by "rational" you mean "fine for decision making", then I need to disappoint you. Dutch Books are not a working criterion for that. It is perfectly possible to make rational decisions with cyclic preferences. Your preferences need to be weakly eligible, and weak eligibility needs to be top-transitive (Hansson).

Weak eligibility: There are one or more alternatives such that there is no preferred alternative to them.

Top transitivity of weak eligibility: If a is weakly eligible and a~b, then b is also weakly eligible.

These are conditions on preferences. You can have similar conditions on subjective plausibility, of course, once you combine preferences and subjective plausibilities.
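Hansson's conditions can be sketched with a hypothetical preference relation: even with a strict-preference cycle among some alternatives, a weakly eligible option can exist, so a rational choice remains possible.

```python
# Strict preferences with a cycle among b, c, d; a is dominated by nothing.
alternatives = {"a", "b", "c", "d"}
strictly_preferred = {("b", "c"), ("c", "d"), ("d", "b")}  # hypothetical

# Weakly eligible: alternatives to which nothing is strictly preferred.
weakly_eligible = {
    x for x in alternatives
    if not any((y, x) in strictly_preferred for y in alternatives)
}

assert weakly_eligible == {"a"}  # a sound choice exists despite the cycle
```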

By the way, Expected Utility falls prey to Dutch Books: there is a money pump against every risk-averse or risk-seeking agent. Check out Wakker's book Prospect Theory: For Risk and Ambiguity, which is much better than Jaynes. Anyway, EU is often considered rational and widely used, but according to your criterion it would be irrational. (In finance, this kind of Dutch Book is called "arbitrage" and exploited immediately, so the market prunes them away, but in other areas EU is used extensively. Are you maybe a finance guy?)
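For readers unfamiliar with the terminology: risk aversion under EU just means a concave utility function, as in this sketch with hypothetical numbers (the money-pump construction against such agents is in Wakker's book, not reproduced here).

```python
import math

# A risk-averse EU agent: concave utility u(x) = sqrt(x).
def u(x):
    return math.sqrt(x)

# A 50/50 gamble between $0 and $100 has expected value $50...
eu = 0.5 * u(0) + 0.5 * u(100)   # expected utility = 5.0
certainty_equivalent = eu ** 2    # u(CE) = eu  =>  CE = $25

# ...but the agent would trade it for any sure amount above $25.
assert certainty_equivalent < 50  # CE below expected value: risk aversion
```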

> For reference, Jaynes's desiderata:

Of course you can just claim "here is my list of postulates, and that's what 'rational' means", but that's not really an argument. The other theories I am talking about are also axiomatized. Take for example Fishburn's seminal work. According to your theory, Fishburn spent most of his life and efforts in decision making on irrational theories. I'm not convinced, and would rather talk about different kinds of rationality if I were pressed to make a decision on that.

> (1) Degrees of plausibility are represented by real numbers. (And a continuity assumption.)

There is a vast array of literature on qualitative decision making for which this assumption does not hold. Lexicographic decision making also does not fulfill that requirement, and there is a whole French-Belgian school on it, including axiomatizations and practical methods (tools like ELECTRE). For lexicographic decision making, hyperreal numbers are usually used.
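The lexicographic idea can be sketched with plain tuples (hypothetical criteria values): any advantage on the first criterion outranks an arbitrarily large advantage on later criteria, which is exactly what fails the Archimedean/continuity assumption.

```python
# Tuples in Python compare lexicographically: the first criterion is
# compared first, and later criteria only break ties.
option_x = (1, 0)        # slightly better on the primary criterion
option_y = (0, 10**9)    # enormously better on the secondary criterion

# x wins regardless of how large y's secondary advantage is made: no
# single real-valued weighted sum with fixed finite weights can
# reproduce this ordering for all magnitudes of the second coordinate.
assert option_x > option_y
```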

Qualitative decision making comes with a host of problems and limitations due to Arrow's Theorem, but lexicographic models can be very reasonable and even required if some of the authors in the field are right about some examples of seemingly irrational preferences. In any case, just declaring these axiomatized theories irrational because "here are my axioms" is unacceptable. I'm sure not even Jaynes does that.

As for the continuity assumption: There is a whole field of measurement theory that would tell you when you need it and when you don't, and I really don't see any non-measurement-theoretic way of defending such technical assumptions as rationality postulates independently. Again, just assuming these kinds of things is a bit too simple. After all, I can take any postulate and call it "rational"; that's not a meaningful discussion of rationality, though.

> (3b) The robot always takes into account all of the evidence it has relevant to a question. It does not arbitrarily ignore some of the information, basing its conclusions only on what remains. In other words, the robot is completely non ideological.

This is an interesting principle, because even in probabilistic settings it is completely controversial how to deal with conflicting evidence, and how and when to revise beliefs in the face of evidence that directly conflicts with your existing beliefs.

It's a very vexing and complicated problem with many different solutions. It is definitely an underdetermined problem. One of the best discussions of it has evolved from criticisms of the corresponding update rule in the Dempster-Shafer theory of evidence, so it's worth taking a look at if you're really interested in this topic. But you seem to be hell-bent on taking Jaynes's book as some sort of bible, which is weird. It's not as if any of the other approaches I've mentioned in my previous post are unknown or were proposed by outsiders - it's almost impossible not to stumble across possibility theory (Dubois & Prade) or Halpern's work if you're doing AI research, for example.

> They're not just true, they're obvious.

Maybe for people who do not know the literature very well, but certainly not to me. Sorry. :(


> There are people who argue that having cyclic preferences is not only rational, but even sometimes the only rational representation of evaluations.

It wouldn't be the first time otherwise serious people have defended nonsense. Noted nonetheless.

> It is perfectly possible to make rational decisions with cyclic preferences.

It is perfectly possible to make rational decisions while being insane; just not all decisions will be rational. Cyclic preferences are not insane with respect to all decisions, but they do mean the decision system as a whole is not flawless.

While the absence of cyclic preferences is of course not sufficient for perfect rationality, it's obviously required.

> By the way, Expected Utility falls prey to Dutch Books. There is a money pump against every risk-averse or risk-seeking agent.

Well, if you're not evaluating risks correctly to begin with, of course you're gonna get ripped off (I'm not saying that's a good thing). Being either risk seeking or risk averse looks like a flaw too, though perhaps less severe than cyclic preferences.

> There is a vast array of literature on qualitative decision making for which this assumption does not hold.

Wait a minute, this one is only talking about epistemology. Jaynes does not mention utility functions at all, and for all I know those may still be allowed to be discontinuous. (That would be perhaps a bit surprising, but I have yet to have an opinion on that particular point.)

Discontinuous probabilities, that would be more surprising. Though I reckon this continuity business is the weak link here. It would be nice if we didn't have to assume it.

> even in probabilistic settings it is completely controversial how to deal with conflicting evidence and how and when to revise beliefs in the face of evidence that directly conflicts with your existing beliefs.

There are lots of reasons why a piece of data might not change one's mind, even if that piece of data seems to contradict their beliefs directly. For instance, that piece of evidence might have been cherry-picked from a mass of otherwise normal data.

Not even acknowledging the piece of data might be a good approximation in some cases, but in general it seems quite foolish. You don't just ignore a piece of evidence, you explain why it doesn't change your mind. (I believe Jaynes gives examples of beliefs diverging when exposed to the same piece of evidence.)
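A simplified sketch of that divergence (my own toy model with hypothetical numbers, not Jaynes's actual examples, which are subtler): the same report can leave two agents' credences far apart, depending only on how much each trusts the source.

```python
# Evidence E: a source asserts H. Toy model: an honest source reports
# the truth; a dishonest one asserts H regardless of the facts.
def posterior(p_h, p_honest):
    # P(E|H) = 1, since both source types assert H when H is true.
    # P(E|not-H) = 1 - p_honest: only a dishonest source asserts H then.
    return p_h / (p_h + (1 - p_h) * (1 - p_honest))

trusting = posterior(0.5, 0.90)  # -> about 0.91
skeptic = posterior(0.5, 0.01)   # -> about 0.50, barely moved

# Same prior on H, same evidence, yet the credences end up far apart.
assert trusting - skeptic > 0.4
```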

> But you seem to be hell-bent on taking Jaynes's book as some sort of bible, which is weird.

Call it confirmation bias, but when I read that book, I already subscribed to probability theory as the correct way to think. I had for a long time. The intuition of probabilities being degrees of beliefs, I had for as long as I can remember.

Then this book comes along and provides justifications for my intuitions that are even stronger than I anticipated. It's like suspecting there's a giant bearded man behind that cloud, then actually seeing it, taking a photo, and showing it to your friends. Perhaps not foolproof, but pretty damn close.

---

Now we still have a problem: Jaynes's robot cannot exist. I mean, it would be something like AIXI, which is not tractable. Probability theory is not tractable (we wouldn't have Monte Carlo methods if it were; we'd just compute the probabilities directly). Any inference engine that runs in the real world (like humans) has to be imperfect. We have to take shortcuts, and from them, flawed reasoning will arise.

There's also the problem that thinking has a cost. It takes time and energy, and with those, utility. So not only will a real engine have flaws, it also needs to evaluate whether minimising those flaws is worth the trouble (and that evaluation itself costs some thinking).

To take a concrete example, the first AlphaGo program lost one of its games in part because it failed to take more time on a particularly hard-to-evaluate game state. It was obvious to top human players that this particular move required more thought than usual, but the machine wasn't programmed that way.

As certain as I am that probability theory is the correct ideal to attain, I also have to admit that it is just that: an impossible ideal. How to instantiate that ideal into a good enough working implementation, I have no freaking clue.


Well, I agree mostly with you, except that I'm not a probabilist. There is an extensive discussion about what and what not Dutch books show, see Alan Hájek's work on that, which is really worth reading.

> Being either risk seeking or risk averse looks like a flaw too, though perhaps less severe than cyclic preferences.

Yes, I don't want to deny that this view is appealing. However, even if you use probabilistic representations of degrees of belief, you need to deal with ignorance and conflicting evidence in one way or another. Convex sets of probabilities can be shown to be able to represent many of the alternative approaches I've mentioned. There is also this "subjective logic" by Jøsang that is surprisingly nice despite its silly name. Check it out; maybe the only quirk I have with it is that he mostly seems to re-brand many prior ideas, but the framework is interesting.

> Not even acknowledging the piece of data might be a good approximation in some cases, but in general it seems quite foolish. You don't just ignore a piece of evidence, you explain why it doesn't change your mind.

I agree with you, but at the same time we know from qualitative belief revision theory that there are many, many ways of dealing with conflicting evidence. Okay, we can rule out some of them, e.g. discarding all previous beliefs to learn the new evidence, but among the many less obviously flawed methods a choice needs to be made. The probabilistic setting doesn't help too much in that area, it actually makes it harder to see what's going on. As I've said, the problem is underdetermined.

> Then this book comes along and provides justifications for my intuitions that are even stronger than I anticipated.

I'm definitely going to read it! However, I might already be tainted by other books on the subject and philosophical discussions. I really do think a belief representation ought not to be closed under negation - i.e., I have strong Dempster-Shafer intuitions - and that some way of distinguishing ignorance from doubt is needed.

> Call it confirmation bias, but when I read that book, I already subscribed to probability theory as the correct way to think. I had for a long time. The intuition of probabilities being degrees of beliefs, I had for as long as I can remember.

Kudos to you for having such strong intuitions. It makes life easier. Maybe I'd be willing to buy into them for probabilities, but that wouldn't help me because of similar problems on the evaluative side, on which most of my work focuses.

On the evaluative side we have thought experiments like Spectrum Cases (Temkin, Rachels): Suppose A gives you extremely high pleasure for a month, B gives you a little bit less pleasure than A (barely noticeable) for 3 months, C gives you a little bit less pleasure than B (barely noticeable) for 9 months, and so on. Some people (not all) have the intuition that B is better than A, C is better than B, and so forth, until at some point, say Z, they would judge that A is better than Z. These thought experiments come in all varieties, can also be made about well-being and other notions of goodness, and can be made as realistic as one wishes.

Most people who want to keep "better than" transitive introduce some notion of significance, which is lexicographic decision making in disguise (significant value attributes always outrank insignificant value attributes). But okay, you were talking about probabilities only and already acknowledged the evaluative component could be discontinuous. (To be more precise, in this case the Archimedean axiom fails.) It's just that even if graded belief is purely probabilistic, these kinds of preferences will complicate making decisions on the basis of your beliefs.
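The arithmetic behind the spectrum case can be sketched under a purely additive model (hypothetical numbers: intensity drops by one barely noticeable unit per step, duration triples):

```python
# Option k: pleasure intensity (100 - k) for 3**k months.
# Options A (k=0) through Z (k=25).
totals = [(100 - k) * 3 ** k for k in range(26)]

# Under pure addition, every option beats its predecessor all the
# way to Z, since duration triples while intensity barely drops...
assert all(later > earlier for earlier, later in zip(totals, totals[1:]))

# ...so the widespread intuition "A is better than Z" cannot come from
# additive totals - hence significance thresholds / lexicographic models.
```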

I agree about the tractability, too. Since you are a probabilist about graded belief, that already makes your life much easier than mine, though. Couldn't you just say that any heuristics are permissible in certain circumstances as shortcuts that - under these circumstances - are conducive to adequate probability approximations?

I didn't want to insinuate that there is anything wrong with being a probabilist, it's in the end a matter of intuitions, I merely wanted to point out that there are some fairly well-known authors who are not probabilists about graded belief in the narrow sense, e.g. Bouyssou, Fishburn, Vincke, Pirlot in decision making, people like Halpern, Dubois, Prade, Spohn and their scholars in A.I., and of course almost everybody in mathematical psychology such as Luce and Tversky. But as I've said, most of their generalizations can be represented by more complicated probability representations such as sets of probabilities.

Anyway, it was nice chatting with you!


> Well, I agree mostly with you, except that I'm not a probabilist.

Good enough for me. :-) (Aumann's agreement theorem notwithstanding, I have to recognise the capacity for actual humans to agree is limited.)

> I'm definitely going to read [Jaynes's book]!

I have yet to read it all, but the foundations are laid out early. The preface mostly explains where the author is coming from, chapter 1 and 2 do most of the justifications. The rest focuses more on applications of probability theory. My general impression was like:

  Matches my intuitions,
  solid theoretical foundations,
  works in practice...
  ...case closed I guess.

> On the evaluative side we have thought experiments like Spectrum Cases (Temkin, Rachels): Suppose A gives you extremely high pleasure for a month, B gives you a little bit less pleasure than A (barely noticeable) for 3 months, C gives you a little bit less pleasure than B (barely noticeable) for 9 months, and so on.

Hmm, that's a hard one. Depending on the value I attach to pleasure, there should be 3 possibilities: A is best, Z (or whatever the last iteration is) is best, or there's a sweet spot in between. I get the circular preference, and would likely fall prey to it if the circle were hidden from me. But stuff like that is a big warning sign that the problem most probably requires more thought than a quick intuitive judgement.

> Couldn't you just say that any heuristics are permissible in certain circumstances as shortcuts that - under these circumstances - are conducive to adequate probability approximations?

I could. I didn't, because once we start taking shortcuts, evaluating the impact of a shortcut on the final assessment is very difficult. Probabilities are non-linear; it's very easy for a seemingly innocuous approximation to snowball into a huge error. But with care, it often can (and does) work.

It's a bit like floating point approximations. The ideal math on real numbers is correct, floating points only introduce small errors, but some operations (like a division close to zero) can magnify those errors from "approximate" to "utterly wrong". Special care is typically taken to ensure that does not happen.
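A tiny example of the floating-point analogy (hypothetical values): one innocuous-looking add-then-subtract already introduces a roughly 10% relative error, which any later operation then carries forward.

```python
x = 1e-15
y = (1.0 + x) - 1.0       # looks harmless: should recover x exactly

rel_err = abs(y - x) / x  # about 0.11: an 11% error out of nowhere,
                          # because 1e-15 sits near the spacing of
                          # representable doubles around 1.0
assert rel_err > 0.05

# Any subsequent use of y in place of x - say, dividing by it - carries
# that ~10% error straight into the result.
```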

> Anyway, it was nice chatting with you!

For me as well. I'll keep your references in mind, and thank you for your patience.



