Tuesday, 5 July 2022

Bayesian rabbit holes: Decoding conditional probability with non-commutative algebra


    The White Rabbit

A novice analyst, or even an experienced (data) scientist, might think that the bar notation $|$ in conditional probability carries some distinct operational mathematics, particularly when it is written with explicit distribution functions, $p(x|y)$. The same impression arises for joint probabilities such as $p(x, y)$, and for mixtures of the two, such as $p(x, y | z)$. In this short exposition, we clarify that none of these notations within the arguments of a probability carries a distinct operational meaning.

Arguments in probabilities: Boolean statement and filtering 

Arguments in any probability are mathematical statements of discrete mathematics that correspond to events in the experimental setting. These statements declare a fact with a boolean outcome, and they act as queries to a dataset: for example, whether the temperature is above $30$ degrees, $T > 30$. Temperature $T$ is a random variable. Unfortunately, the term random variable is often used differently in many textbooks, where it is defined as a mapping rather than as a single variable. The bar $|$ in conditional probability $p(x|y)$ reads as statement $x$ given that statement $y$ has already occurred, i.e., an if. This interpretation implies that $y$ occurred before $x$, but it does not imply that they are causally linked. The condition acts as a filter, like a where clause in query languages. $p(x|y)$ boils down to $p_{y}(x)$: the statement $y$ is first applied to the dataset, and the probability of the remaining statement $x$ is then computed on what is left.
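The filtering reading of $p(x|y)$ can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the records, the `rain` field and the helper names `prob` and `cond_prob` are all hypothetical and made up for this example.

```python
# Hypothetical toy dataset: each record is one observation (values are made up).
records = [
    {"T": 32, "rain": False},
    {"T": 35, "rain": False},
    {"T": 31, "rain": True},
    {"T": 28, "rain": True},
    {"T": 25, "rain": True},
    {"T": 22, "rain": False},
    {"T": 33, "rain": False},
    {"T": 19, "rain": True},
]


def prob(statement, data):
    """Fraction of records in `data` for which the boolean `statement` holds."""
    return sum(1 for r in data if statement(r)) / len(data)


def cond_prob(x, y, data):
    """p(x|y): apply y as a filter (a where clause), then evaluate x on the rest.

    Assumes at least one record satisfies y; otherwise the division fails.
    """
    filtered = [r for r in data if y(r)]
    return prob(x, filtered)


# Statements are boolean queries against a record.
hot = lambda r: r["T"] > 30
rain = lambda r: r["rain"]

p_rain = prob(rain, records)                      # p(rain)
p_rain_given_hot = cond_prob(rain, hot, records)  # p(rain | T > 30)
print(p_rain, p_rain_given_hot)
```

Here `cond_prob` is nothing more than `prob` evaluated on the filtered records, which is the sense in which $p(x|y)$ reduces to $p_{y}(x)$.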

In the case of joint probabilities $p(x, y)$, the events co-occur, i.e., an AND statement. In summary, anything in the argument of $p$ is written as a mathematical statement. When assigning a distribution or a functional form to $p$, conditionals and joints play no particular role; the modelling approach sets the appropriate structure.
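As a worked illustration in terms of counts, assume a dataset with $N$ records and write $N(s)$ for the number of records satisfying a statement $s$. This counting notation is introduced here for illustration and assumes $N(y) > 0$:

$$
p(x, y) = \frac{N(x \wedge y)}{N}, \qquad
p(y) = \frac{N(y)}{N}, \qquad
p(x|y) = p_{y}(x) = \frac{N(x \wedge y)}{N(y)}
$$

$$
p(x|y)\, p(y) = \frac{N(x \wedge y)}{N(y)} \cdot \frac{N(y)}{N} = \frac{N(x \wedge y)}{N} = p(x, y)
$$

Under this reading, the joint is an AND query on the full dataset, while the conditional is the same AND count normalised by the size of the filtered dataset.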

Conditioning does not imply causal direction: do-Calculus does

A filtering interpretation of the conditional $p(x|y)$ does not imply a causal direction, but the $do$ operator does: $p(x|do(y))$.
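The difference between filtering on $y$ and intervening on $y$ can be sketched with a small simulation. The model below is entirely hypothetical: a hidden variable `z` influences both `x` and `y`, while `y` has no causal effect on `x`. The probabilities 0.5, 0.9, 0.1, 0.8 and 0.2 are arbitrary choices for the illustration.

```python
import random


def simulate(n, do_y=None, seed=0):
    """Draw n samples of (x, y) from a hypothetical model with a confounder z.

    z influences both x and y; y never influences x.
    If do_y is given, y is set by intervention and no longer depends on z.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        z = rng.random() < 0.5
        if do_y is None:
            y = rng.random() < (0.9 if z else 0.1)  # observational: y follows z
        else:
            y = do_y                                # intervention: do(y)
        x = rng.random() < (0.8 if z else 0.2)      # x depends on z only
        samples.append((x, y))
    return samples


def p_x_given_y(samples):
    """Filter on y being true, then take the fraction of samples where x is true."""
    kept = [x for x, y in samples if y]
    return sum(kept) / len(kept)


observational = p_x_given_y(simulate(100_000))              # estimates p(x|y)
interventional = p_x_given_y(simulate(100_000, do_y=True))  # estimates p(x|do(y))
print(observational, interventional)
```

In this assumed model, filtering gives a value near 0.74 because seeing $y$ is informative about `z`, whereas forcing $y$ leaves the value near $p(x) = 0.5$, since $y$ has no causal influence on $x$.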

Non-commutative algebra: When frequentist is equivalent to Bayesian

Some simple filtering operations give identical results when reversed, $p(x|y) = p(y|x)$, the prior being equal to the posterior. By Bayes' rule, $p(x|y) = p(y|x)\,p(x)/p(y)$, so this happens exactly when $p(x) = p(y)$. This remark implies that we can't apply Bayesian learning with commutative statements. We need non-commutative statements; with these, one can do Bayesian learning on newly arriving data, i.e., the arrival of new subjective evidence. The reason seems to lie in the frequentist nature of filtering.
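The role of order can be checked on a small table of counts. The sketch below uses made-up counts and a hypothetical helper `p`; it shows that the AND statement is symmetric in its arguments, while the reversed conditional coincides with the original only when $p(x) = p(y)$.

```python
from fractions import Fraction

# Hypothetical 2x2 table of counts, keyed by the truth values (x, y).
counts = {
    (True, True): 10,
    (True, False): 30,
    (False, True): 20,
    (False, False): 40,
}
total = sum(counts.values())


def p(statement):
    """Exact probability of a boolean statement over the table of counts."""
    hits = sum(c for (x, y), c in counts.items() if statement(x, y))
    return Fraction(hits, total)


p_x = p(lambda x, y: x)
p_y = p(lambda x, y: y)
p_xy = p(lambda x, y: x and y)  # joint: x AND y
p_yx = p(lambda x, y: y and x)  # same statement with the order swapped

p_x_given_y = p_xy / p_y  # filter on y, then ask x
p_y_given_x = p_xy / p_x  # filter on x, then ask y

print(p_xy == p_yx)                                # the joint commutes
print(p_x_given_y, p_y_given_x)                    # the conditionals differ here
print(p_x_given_y == p_y_given_x * p_x / p_y)      # Bayes' rule links them
```

With these counts $p(x) \neq p(y)$, so the two conditionals differ and the factor $p(x)/p(y)$ carries the update from one to the other.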


We have decoded the operational meaning of conditional probabilities and suggested that a conditional, a joint, or any combination of these within the argument of a probability has no operational purpose other than as a pre-processing step. However, the philosophical and practical implications of probabilistic reasoning are often counterintuitive, and probabilistic reasoning is a computationally hard problem. From a causal inference perspective, we are better equipped to tackle these issues with do-Bayesian analysis.



(c) Copyright 2008-2020 Mehmet Suzen (suzen at acm dot org)

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.