Monday 5 December 2022

The conditional query fallacy: Applying Bayesian inference from discrete mathematics perspective

Preamble

    The Tilled Field,
Joan Miró
(Wikipedia)
One of the core concepts in data sciences is conditional probabilities, $p(x|y)$ appear as logical description of many of the tasks, such as formulating regression or as a core concept in Bayesian Inference. However, there is operationally no special meaning of a conditional or joint probabilities as their arguments are no more than a compositional event statements. This raise a question: Is there any fundamental relationship between Bayesian Inference and discrete mathematics that is practically relevant to us as practitioners? Since, both topics are based on discrete statements returning a Boolean variables. Unfortunately, the answer to this question is a rabbit hole and probably even an open research.  There is no clearly established connections between discrete mathematics fundamentals and Bayesian Inference.  

Statement mappings as definition of probability 

Statement is a logical description of some events, or set of events. Let's have a semi-formal description of such statements.

Definition: A mathematical or logical statement formed with boolean relationships $\mathscr{R}$ (conjunctions) among set of events $\mathscr{E}$,  so a statement $\mathbb{S}$ is formed with at least a tuple of $\langle \mathscr{R},  \mathscr{E} \rangle$. 

Relationships can be any binary operator and events could explain anything perceptional, i.e., a discretised existence. This is the core discrete mathematics and almost all problems in this domain formed in this setting from defining functions to graph theory. A probability is no exception and definition naturally follows, as so called statement mapping.

Definition: A probability $\mathbb{P}$ is a statement mapping, $\mathbb{P}: \mathbb{S} \rightarrow [0,1]$. 

The interpretation of this definitions that a logical statement is always True if probability is 1 and always False if it is 0. However, having conditionals based on this is not that clear cut.

Conditional Query Fallacy 

A non-commutative statement may imply, reversing the order of statements should not yield to the same filtered set on the data for Bayesian Inference. However, Bayes' theorem would have a fallacy for statement mappings for conditionals in this sense. 

Definition: The conditional query fallacy is defined as one can not update belief in probability, because reversing order of statements in conditional probabilities halts Bayes' update,  i.e., back to back query results into the same dataset for inference.

At first glance, this appears as a Bayes' rule does not support commutative property, practically posterior being equal to likelihood.  However, this fallacy appears to be a notational misdirection. Inference on the filtered dataset back to back constituting a conditional fallacy i.e., when a query language is used to filter data to get A|B and B|A yielding to the same dataset regardless of filtering order. 

Although, in inference with data, likelihood is actually not a conditional probability, strictly speaking and not a filtering operation. It is merely a measure of update rule. We compute likelihood by multiplying values obtained by i.i.d. samples inserted into conjugate prior, a distribution is involved. Hence, the likelihood computationally is not really a reversal of conditional as in $P(A|B)$ written as reversed, $P(B|A)$.  

Outlook

In computing conditional probabilities for Bayesian Inference, our primary assumption is that conditional probabilities; likelihood and posterior are not identical. Discrete mathematics only allows Bayesian updates, if time evolution is explicitly stated with non-commutative statements for conditionals.

Going back to our initial question, indeed there is a deep connection between the fundamentals of discrete mathematics and Bayesian belief update on events as logical statements. The fallacy sounds a trivial error in judgement but (un)fortunately goes into philosophical definitions of probability that simultaneous tracking of time and sample space is not encoded in any of the notations explicitly, making statement filtering definition of probability a bit shaky.

Glossary of concepts

Statement Mapping A given set of mathematical statements mapped into a domain of numbers.

Probability A statement mapping, where domain is $\mathscr{D} = [0,1]$.

Conditional query fallacy Differently put than the above definition. Thinking that two conditional probabilities as reversed statements of each other in Bayesian inference, yields to the same dataset regardless of time-ordering of the queries.

Notes and further reading

  • Fallacy is one computes $P(A|B)=P(B|A)$, while filtering results into identical datasets. Correction would be that, one needs to use different sample sizes for reverse statement or compute joints and marginals separately on their own filtered datasets. Use the first filtering sample size in computing the probability not the total.
  • Here, discrete mathematics we refer to appears within arguments of probability. The discussion of discrete parameter estimations are a different topic. Gelman discusses this, here.
  • Conjunction Fallacy
  • Probability Interpretations
  • Bayesian rabbit holes: Decoding conditional probability with non-commutative algebra M. Süzen (2022)
  • Holes in Bayesian Statistics Gelman-Yao (2021) : This is a beautifully written article. Specially, proposal that context dependence should be used instead of subjective

No comments:

Post a Comment

(c) Copyright 2008-2024 Mehmet Suzen (suzen at acm dot org)

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.