CHAPTER 7
AXIOMS
7.1. Here I formalize the informational approach through a propaedeutic system of logical axioms. The reasons why I speak of a propaedeutic system will be explained in §8.7.1. Obviously the variables "h", "h_{1}" et cetera range over pieces of information.
The axioms are
AX1 IDENTITY | if h, then h |
AX2 REPLACEMENT | if h_{1}, and if h_{1}= h_{2}, then h_{2} |
AX3 ASSOCIABILITY | if h_{1}, and if h_{2}, then h_{1}&h_{2} |
AX4 COMMUTABILITY | if h_{1}&h_{2}, then h_{2}&h_{1} |
AX5 RESTRICTION | if h_{1}&h_{2}, then h_{1} |
AX6 COHERENCE | if h_{1}&~h_{1}, then h_{2} |
AX7 | if h_{1}&h_{2}= h_{1}&~h_{1}, then h_{1}&~h_{2}= h_{1} |
and could be re-proposed in the well known fractional notation (premises above the line, conclusion below it), where for instance AX2 becomes
(h_{1}, h_{1}= h_{2}) / h_{2}
et cetera. I preferred "if then" only for minute typographical convenience.
7.1.1. The above axioms draw an idempotent logic (§1.3). In fact
Theor1. If h, then h&h and if h&h, then h.
Proof. By AX1 and AX3, if h then h&h; by AX5, if h&h, then h.
Obviously the theorem of idempotence concerns pieces of information. For instance it states that, if we know that 2<3, we can infer that 2<3 and 2<3; therefore, since reciprocally (AX5) if we know that 2<3 and 2<3, we can infer that 2<3, to know that 2<3 and 2<3 is nothing more and nothing less than to know that 2<3 (Theor4 below).
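The axioms and Theor1 can be checked mechanically on a small model. The following Python sketch is only an illustration under an assumption that is not part of the text: each piece of information is represented by the set of possible worlds it leaves open, "&" by set intersection, "~" by complement, and "if h_{1}, then h_{2}" by set inclusion. The names WORLDS, PIECES, conj, neg and infer are hypothetical.

```python
from itertools import combinations, product

# Assumption: a piece of information is modelled by the set of possible
# worlds it leaves open (this representation is not taken from the source).
WORLDS = frozenset(range(3))
PIECES = [frozenset(c) for r in range(4) for c in combinations(sorted(WORLDS), r)]


def conj(a, b):
    """"&": conjoining two pieces keeps only the worlds both leave open."""
    return a & b


def neg(a):
    """"~": the opposite piece leaves open exactly the other worlds."""
    return WORLDS - a


def infer(a, b):
    """"if a, then b": every world left open by a is left open by b."""
    return a <= b


def check_axioms():
    for h1, h2 in product(PIECES, repeat=2):
        assert infer(h1, h1)                          # AX1
        assert infer(conj(h1, h2), conj(h2, h1))      # AX4
        assert infer(conj(h1, h2), h1)                # AX5
        assert infer(conj(h1, neg(h1)), h2)           # AX6
        if conj(h1, h2) == conj(h1, neg(h1)):         # AX7
            assert conj(h1, neg(h2)) == h1
    for h in PIECES:
        assert conj(h, h) == h                        # Theor1 (idempotence)
    return True


print(check_axioms())
```

AX2 and AX3 hold trivially in such a model, so they are not tested separately. A finite check of this kind does not replace the proofs; it only shows that the axioms have at least one simple model.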
7.2. Some comments are opportune in order to show that the usual interpretation of symbols satisfies the admissibility criterion.
7.2.1. AX1 is obvious: any piece of information can be inferred from its assumption.
7.2.2. AX2 rules the substitution of identity. Indeed "substitution of identity" is a patent oxymoron, because in the meaning of "substitution" there is a component of diversity quite incompatible with the very meaning of "identity" (scholasticism taught: si duo idem faciunt, non est idem); this notwithstanding I respect the current terminology. Anyhow the topic will be better analyzed in Chapter 8.
7.2.2.1. Besides the substitution of identity, current theorizations list Modus Ponens as a further inference rule. Here, by the definition (6.vi), it is a theorem. In fact
Theor2. If h_{1} and if h_{1}= h_{1}&h_{2}, then h_{2}.
Proof. By AX2 we get h_{1}&h_{2}; then, by AX4 and AX5, h_{2}.
Of course if "h_{1}" and "h_{2}" were variables over sentences,
h_{1}= h_{1}& h_{2}
would be an absurdity.
7.2.3. The admissibility of AX3 follows immediately from the same meaning of "and", that is from the same *conjunction* (when referred to pieces of information). The repetition of "if" says just that the two acquirements are singularly considered.
7.2.4. At first sight a superficial objection might suggest some perplexity about the admissibility of AX4, that is about the commutative property of conjunction. For instance
(7.i) Ava took a lover and Ava's husband abandoned her
and
(7.ii) Ava's husband abandoned her and Ava took a lover
are obtained by commutating the same two atomic statements, yet (7.i) and (7.ii) adduce two different pieces of information, otherwise it would be unexplainable why the respective lawyers are quarrelling about (7.i) and (7.ii).
The reply is immediate: (7.i) and (7.ii) are elliptic formulations suggesting that the consequentiality of the facts corresponds to the consequentiality of the atomic statements. Indeed, making (7.i) explicit leads to
(7.iii) Ava took a lover at t_{1} and Ava's husband abandoned her at t_{2}
exactly as making (7.ii) explicit leads to
(7.iv) Ava's husband abandoned her at t_{2} and Ava took a lover at t_{3}
therefore the example, far from confuting AX4, corroborates it (both (7.iii) and (7.iv) are perfectly commutative).
7.2.5. While AX2, AX3 and AX4 do satisfy the admissibility criterion also under the dual interpretation which reads "&" as a symbol of inclusive disjunction, AX5 does not, so rejecting this dual interpretation. On the contrary AX5 legitimates in the most obvious way the usual interpretation of "&": in fact while a conjunction increases the information adduced by each conjunct, the inclusive disjunction decreases the information adduced by each disjunct.
7.2.6. AX6 is nothing but the formalization of the classical Ex absurdo quodlibet. In other words it states that the conjunction of two opposite pieces of information precludes every possible alternative.
7.2.7. AX7 states that two opposite pieces of information do not intersect. Therefore AX6 and AX7 in conjunction define exactly the complementary import of *~* which finds in ® its immediate visualization: the shaded fields of two opposite pieces of information do not leave any virgin sector and do not overlap.
7.3. I recall the definition
(7.v) (h_{1}⊃h_{2}) = (h_{1}& h_{2}= h_{1})
(§6.6.1); I also recall that under (7.v) h_{1} is an expansion of h_{2} and h_{2} is a restriction of h_{1}.
7.4. Here I list some theorems involving only conjunctions, that is only AX1, AX2, AX3, AX4 and AX5.
Theor3 h_{1}&h_{2} ⊃ h_{1}
Proof. By AX4 h_{1}&h_{2}&h_{1}= h_{1}&h_{1}&h_{2}
By Theor1 h_{1}&h_{1}&h_{2}= h_{1}&h_{2}
By (7.v) h_{1}&h_{2} ⊃ h_{1}
7.4.1. For the sake of concision the proofs of the theorems below are simply sketched or even omitted (when quite elementary).
Theor4 If h_{1}⊃ h_{2} and h_{2}⊃ h_{1}, then h_{1}= h_{2}
Proof: h_{1}=h_{1}&h_{2}=h_{2}&h_{1}= h_{2}.
7.4.1.1. Since (Theor1) if h, then h&h and if h&h, then h, it follows from Theor4
Corollary 4 h&h = h
that is the identity between a piece of information and, so to say, its iteration. Yet we must avoid interpreting such a conclusion in an abusive way. A little example. Bob looks at the outcome of this die, and sees a six; yet he is astigmatic, and as such a little doubt remains: a six or a four? He puts on his spectacles and verifies: surely a six. In this sense someone could object that, since the ‘spectacles-assisted' acquirement strengthens the first one, Corollary 4 is violated.
No violation indeed, since the little doubt concerning Bob's first acquirement forbids its acceptation as an acquirement. When we accept a piece of information, we assume it as an unobjectionable datum which no further acquirement can strengthen. That is: here we are theorizing an idempotent logic where no intermediate degree of knowledge is admitted between *known* and *unknown*.
7.4.2. Some other theorems involving only conjunctions.
Theor5 If h_{1}= h_{2}, then h_{1}⊃ h_{2} and h_{2}⊃ h_{1}
Proof: h_{1}= h_{1}&h_{1}= h_{1}&h_{2}, ergo h_{1}⊃ h_{2};
h_{2}= h_{2}&h_{2}= h_{2}&h_{1}, ergo h_{2}⊃ h_{1}.
Theor6 If h_{1}⊃ h_{2} and h_{2}⊃ h_{3}, then h_{1}⊃ h_{3} (Transitivity)
Proof: h_{1}= h_{1}&h_{2}, h_{2}= h_{2}&h_{3};
h_{1}&h_{2} &h_{3}= h_{1}&h_{3}= h_{1}&h_{2} = h_{1}
Theor7 If h_{1}⊃ h_{2} and h_{1}⊃ h_{3}, then h_{1}⊃(h_{2}&h_{3})
Theor8 If h_{1}&h_{2}= h_{3}, then h_{3}&h_{1}= h_{3}
Proof: h_{1}&h_{2}&h_{3}= h_{3}&h_{3}= h_{3}=h_{1}& h_{1}& h_{2}&h_{3}= h_{1}&h_{3}&h_{3}= h_{1}&h_{3}.
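The conjunction-only theorems above can be verified exhaustively on a toy model. The sketch below assumes, beyond what the text states, that a piece of information is the set of worlds it leaves open and that "&" is set intersection; implies encodes definition (7.v) literally. All names are hypothetical.

```python
from itertools import combinations, product

# Assumption: pieces of information are sets of worlds left open; "&" is
# intersection. This model is an illustration, not the author's construction.
WORLDS = range(3)
PIECES = [frozenset(c) for r in range(4) for c in combinations(WORLDS, r)]


def implies(h1, h2):
    """(7.v): h1 ⊃ h2 holds exactly when h1 & h2 = h1."""
    return (h1 & h2) == h1


def check_conjunction_theorems():
    for h1, h2, h3 in product(PIECES, repeat=3):
        assert implies(h1 & h2, h1)                        # Theor3
        if implies(h1, h2) and implies(h2, h1):            # Theor4
            assert h1 == h2
        if h1 == h2:                                       # Theor5
            assert implies(h1, h2) and implies(h2, h1)
        if implies(h1, h2) and implies(h2, h3):            # Theor6
            assert implies(h1, h3)
        if implies(h1, h2) and implies(h1, h3):            # Theor7
            assert implies(h1, h2 & h3)
        if (h1 & h2) == h3:                                # Theor8
            assert (h3 & h1) == h3
    return True


print(check_conjunction_theorems())
```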
7.4.3. While an implication where
(h_{1}⊃ h_{2}) & ~(h_{2}⊃ h_{1})
is called "simple implication", an implication where
(h_{1}⊃ h_{2}) & (h_{2}⊃ h_{1})
is called "reciprocal implication". Then Theor4 and Theor5 say that identity is nothing but a reciprocal implication.
Though identity is a relation linking every category of referents, Theor4 and Theor5 are not so ambitious as to state that any identity can always be interpreted as a reciprocal implication. Since such theorems concern pieces of information, they simply state that, when we are dealing with pieces of information, identity can be conceived as a reciprocal implication and vice versa.
7.5. Some theorems involving also negations, that is also AX6 and AX7.
Theor9. (h_{1}&~h_{1}) = ( h_{2}&~h_{2})
Proof. AX6: (h_{1}&~h_{1}) ⊃ h_{2}. AX6: (h_{1}&~h_{1}) ⊃ ~ h_{2}. Theor7: (h_{1}&~h_{1}) ⊃ (h_{2}&~h_{2}).
Reciprocally (h_{2}&~h_{2}) ⊃ (h_{1}&~h_{1}). Ergo, by Theor4, (h_{1}&~h_{1}) = (h_{2}&~h_{2}).
Theor10 If (h_{1}&h_{2}= h_{1}) then (h_{1}&~h_{2}=⊥)
Proof. h_{2}&~h_{2}=⊥; h_{1}&h_{2}&~h_{2}=⊥ by AX6; ergo, if (h_{1}&h_{2}= h_{1}), then by substitution h_{1}&~h_{2}=⊥.
Theor11 If (h_{1}&h_{2}=⊥) then (h_{1}&~h_{2}= h_{1})
Proof. AX7 and definition of incoherence.
Theor12 If h_{1}⊃ h_{2} then ~h_{2} ⊃ ~h_{1}
(Modus Tollens).
Proof Theor10: h_{1}&~h_{2}=⊥. AX4: ~h_{2}&h_{1}=⊥. Theor11: ~h_{2}&~h_{1}=~h_{2}.
Theor13 h&~~h = h
Proof By definition h &~h =⊥. Theor11: h &~~h = h.
Theor14 h &~~h = ~~h
Proof Let h=~h_{1}. Theor13: h_{1} ⊃ ~~h_{1}; Modus Tollens: ~~~h_{1} ⊃ ~h_{1}; that is, ~~h ⊃ h.
Theor15 h =~~h
Theor16 If h_{1}⊃ h_{2} and h_{3}⊃~ h_{2}, then h_{3}⊃~ h_{1}
Corollary 16 If h_{1}⊃ h_{2} and h_{3}⊃~ h_{2}, then h_{1}⊃~ h_{3}
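The theorems involving negation can be checked in the same exhaustive way. The sketch below rests on an assumption external to the text: pieces of information are sets of worlds left open, "~" is the set complement, and ⊥ is the empty set. The names are hypothetical.

```python
from itertools import combinations, product

# Assumption: set-of-open-worlds model; ⊥ leaves no world open.
WORLDS = frozenset(range(3))
PIECES = [frozenset(c) for r in range(4) for c in combinations(sorted(WORLDS), r)]
BOTTOM = frozenset()


def neg(h):
    return WORLDS - h


def implies(h1, h2):
    """(7.v): h1 ⊃ h2 holds exactly when h1 & h2 = h1."""
    return (h1 & h2) == h1


def check_negation_theorems():
    for h1, h2, h3 in product(PIECES, repeat=3):
        assert (h1 & neg(h1)) == (h2 & neg(h2)) == BOTTOM      # Theor9
        if (h1 & h2) == h1:                                    # Theor10
            assert (h1 & neg(h2)) == BOTTOM
        if (h1 & h2) == BOTTOM:                                # Theor11
            assert (h1 & neg(h2)) == h1
        if implies(h1, h2):                                    # Theor12
            assert implies(neg(h2), neg(h1))
        assert neg(neg(h1)) == h1                              # Theor15
        if implies(h1, h2) and implies(h3, neg(h2)):
            assert implies(h3, neg(h1))                        # Theor16
            assert implies(h1, neg(h3))                        # Corollary 16
    return True


print(check_negation_theorems())
```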
7.6. The theorems of the propositional calculus can be easily proved. For instance (Kleene 1974, §23)
Implication (introduction): ((h_{1}&h_{2})⊃ h_{3}) ⊃ (h_{1}⊃ ~(h_{2}&~h_{3}))
Proof. (h_{1}&h_{2})&h_{3}= h_{1}&h_{2}; (h_{1}&h_{2})&~h_{3}= h_{1}&(h_{2}&~h_{3})=⊥. Theor11: h_{1}&~(h_{2}&~h_{3})= h_{1}.
Modus tollendo ponens: (~(~h_{1}&~h_{2}) & ~h_{1}) ⊃ h_{2}.
Proof. (~(~h_{1}&~h_{2}) & ~h_{1}) & ~h_{2} = ~(~h_{1}&~h_{2}) & (~h_{1}&~h_{2}) = ⊥.
and so on.
7.6.1. The reductio ad absurdum is nothing but a technique derivable from the axioms. In fact, given a statute k, if h is such that k&h=⊥, by Theor11 k&~h=k, therefore ~h is implied by the statute.
7.7. The definition
(7.vi) (h_{1}-h_{2} = h_{3}) = (h_{1}= h_{2}&h_{3})
introduces the symbol "-" for the operation I call "ablation".
The second member of (7.vi) tells us that an ablation is defined only where the minuend h_{1} implies the subtrahend h_{2}: in fact if h_{1}= h_{2}&h_{3}, since h_{2}&h_{3}= h_{2}&h_{2}&h_{3}, then h_{1}&h_{2}= h_{1}. And actually we cannot cancel from a statute a piece of information which does not belong to the same statute.
7.7.1. In spite of the manifest similarities connecting *sum* with *conjunction* and *subtraction* with *ablation*, while in mathematics (n_{1}-n_{2}) = (n_{1}+(-n_{2})) and therefore a subtraction can be conceived as the sum of a negative quantity, in the informational logic, evidently, (h_{1}-h_{2}) = (h_{1}&(~h_{2})) does not hold, since cancelling a piece of information is far from stating its opposite (forgetting that Bob loves Ava is far from believing that Bob does not love Ava). Therefore the ablation cannot be conceived as the conjunction of a negative piece of information.
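The two claims about ablation (it is defined only where h_{1} implies h_{2}, and it does not coincide with conjoining ~h_{2}) can be illustrated on a toy model. The sketch assumes, as an illustration only, that pieces of information are sets of worlds left open; is_ablation transcribes (7.vi), and every name is hypothetical.

```python
from itertools import combinations, product

# Assumption: set-of-open-worlds model, not taken from the source.
WORLDS = frozenset(range(3))
PIECES = [frozenset(c) for r in range(4) for c in combinations(sorted(WORLDS), r)]
BOTTOM = frozenset()


def neg(h):
    return WORLDS - h


def is_ablation(h1, h2, h3):
    """(7.vi): h1 - h2 = h3 holds exactly when h1 = h2 & h3."""
    return h1 == (h2 & h3)


def check_ablation():
    for h1, h2 in product(PIECES, repeat=2):
        candidates = [h3 for h3 in PIECES if is_ablation(h1, h2, h3)]
        # An ablation exists only where the minuend implies the subtrahend.
        assert bool(candidates) == ((h1 & h2) == h1)
        # For a coherent h1 implying h2, conjoining ~h2 yields ⊥,
        # which is never a result of the ablation.
        if (h1 & h2) == h1 and h1 != BOTTOM:
            assert (h1 & neg(h2)) == BOTTOM
            assert not is_ablation(h1, h2, h1 & neg(h2))
    return True


print(check_ablation())
```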
In our minute practice ablations are mainly involved in situations where an absolutely trustworthy new acquirement incompatible with our previous statute constrains us to correct the same statute, therefore to reject (to ablate) some previously accepted pieces of information.
This topic will be deepened in due course. Here I only emphasize that conjoining pieces of information continues being an increasing operation (Theor3).
7.8. The three axioms for μ (§6.4) are
AX8 POSITIVITY
If (~ (k&h =⊥)) then (μ_{k}(h)>0)
AX9 OPPOSITION
μ_{k}(h) + μ_{k}(~h) = μ_{k}(k)
AX10 CONDITIONALIZATION
μ_{k}(h_{1}&h_{2}) = μ_{k&h1}(h_{2})
(AX10 is so called for it is the father of the Principle of Conditionalization).
Since a measure is a number, the relations among measures are relations among numbers. This entails the presence of mathematical symbols such as ">", "0", "+", and also of other commonplace ones such as "/" for the division et cetera. As far as I know, the theories of probability omit the axiomatization of mathematics, and I follow this procedure.
7.8.1. The symbol "-" occurring below does not mean ablation; it does mean subtraction. This notwithstanding such a symbol is not a homonymy bearer because the context overcomes any ambiguity (where "-" connects numbers, it expresses subtraction, where "-" connects pieces of information, it expresses ablation).
7.8.2. Since μ_{k}(h) is represented in ® by the k&h-virgin field, the admissibility of the three axioms for μ is diagrammatically immediate. For instance AX8 states that if the k&h-shaded field is not the whole circle, the respective virgin field is not null. Anyhow the diagrammatic interpretation of some theorems is proposed below.
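One concrete reading of the "virgin field" makes the three axioms for μ directly testable. The sketch below assumes, purely for illustration, that a piece of information is the set of worlds it leaves open and that μ_{k}(h) is the number of worlds left open by k&h. The counting measure and all names are assumptions, not the author's definition.

```python
from itertools import combinations, product

# Assumption: mu_k(h) counts the worlds left open by k & h (the "virgin field").
WORLDS = frozenset(range(3))
PIECES = [frozenset(c) for r in range(4) for c in combinations(sorted(WORLDS), r)]
BOTTOM = frozenset()


def neg(h):
    return WORLDS - h


def mu(k, h):
    """Assumed toy measure: size of the k&h-virgin field."""
    return len(k & h)


def check_mu_axioms():
    for k, h1, h2 in product(PIECES, repeat=3):
        if (k & h1) != BOTTOM:                          # AX8 POSITIVITY
            assert mu(k, h1) > 0
        assert mu(k, h1) + mu(k, neg(h1)) == mu(k, k)   # AX9 OPPOSITION
        assert mu(k, h1 & h2) == mu(k & h1, h2)         # AX10
    return True


print(check_mu_axioms())
```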
7.9. Contrary to the already listed theorems, indicated by "Theor", the μ-theorems (that is the theorems depending also on the non-logical axioms AX8, AX9 and AX10) will be indicated by "THEOR" in order to facilitate immediate references. The formal discriminating factor between Theors and THEORS, obviously, is the absence or presence of "μ".
THEOR1 μ_{k}(k&h) = μ_{k}(h) = μ_{k&h}(k&h)= μ_{k&h}(h)
Proof AX10: μ_{k}(k&h) = μ_{k&k}(h)= μ_{k}(h)
and so on.
Of course a derivation like
μ_{k}(h)=μ_{k&h}(h)=μ_{k&h}(k&h)= μ_{h}(k&h)= μ_{h}(k)
is illegitimate because presupposing that the sufficiency condition (§6.4.2) is satisfied by μ_{k}(h) does not imply that such a condition is satisfied by μ_{h}(k) too (and actually, in general, it is not).
THEOR2 μ_{k}(~k) = 0.
Proof. AX9: μ_{k}(k) + μ_{k}(~k) = μ_{k}(k).
THEOR3 μ_{k&h1}(h_{2})= μ_{k&h2}(h_{1}).
THEOR4
μ_{k}(h_{1}&h_{2}) + μ_{k}(h_{1}&~h_{2}) = μ_{k}(h_{1}).
Proof AX9 and THEOR1:
μ_{k&h1}(h_{2}) + μ_{k&h1}(~h_{2}) = μ_{k&h1}(k&h_{1}) = μ_{k}(h_{1}).
Equivalent formulation: THEOR4'
μ_{k}(h_{1}&h_{2}) = μ_{k}(h_{1}) - μ_{k&h1}(~h_{2})
THEOR4 shows that, given a k, increasing a hypothesis decreases its k-measure.
THEOR5 μ_{k}(h&~h) = 0
Proof THEOR4 and Theor15:
μ_{k}(h&~h) = μ_{k}(h) - μ_{k&h}(~~h) =μ_{k}(h) - μ_{k}(h).
Equivalent formulations: THEOR5' μ_{k}(⊥)=0
THEOR5" μ_{k&h}(~h) = 0.
Diagrammatic interpretation. Since the representation of two opposite pieces of information shades the whole circle, its virgin field, quite independently of the statute, is null.
THEOR6 μ_{k}(Ø)=μ_{k}(k).
Proof Ø=~⊥. AX9: μ_{k}(Ø)+μ_{k}(⊥) = μ_{k}(k). THEOR5': μ_{k}(⊥)=0.
THEOR7 If (μ_{k}(h)=0) then (k&h=⊥).
Proof Modus Tollens on AX8.
THEOR8 If (k&h=⊥) then (μ_{k}(h)=0).
Until now k=⊥ has only been a particular case, since THEOR8 shows that an incoherent statute entails a null measure for any hypothesis. In §7.13 this point will be deepened.
THEOR9 If (k⊃h) then (μ_{k}(h)=μ_{k}(k)).
Proof Protasis: k&h=k. Substitution: μ_{k}(k&h) = μ_{k}(k). THEOR1: μ_{k}(k&h) = μ_{k}(h)
Diagrammatic interpretation: if the shaded field of k includes the whole shaded field of h, the virgin field of k&h is the virgin field of k
THEOR10 0 ≤ μ_{k}(h) ≤ μ_{k}(k).
Proof AX8 and THEOR5' as for 0 ≤ μ_{k}(h); therefore (AX9) μ_{k}(h) ≤ μ_{k}(k)
Diagrammatic interpretation: the k&h-virgin field cannot be greater than the k–virgin field.
THEOR11 μ_{k}(h_{1}&h_{2}) ≤ μ_{k}(h_{2}).
THEOR12 If μ_{k}(h_{1}&h_{2})= μ_{k}(h_{1}) then μ_{k}(k&~ (h_{1}&~h_{2})) = μ_{k}(k).
Proof If μ_{k}(h_{1}&h_{2})=μ_{k}(h_{1}) then μ_{k}(h_{1}&~h_{2})=0 by THEOR4, ergo μ_{k}(~(h_{1}&~h_{2}))=μ_{k}(k) by AX9.
THEOR12 is the μ–correspondent of the Deduction Theorem.
THEOR13 If (μ_{k}(h) = μ_{k}(k)) then (k⊃ h)
Proof AX9: μ_{k}(~h)=0. THEOR7: k&~h =⊥. Theor11: k&h =k.
THEOR14 (~(k⊃h)) ⊃ (μ_{k}(h) < μ_{k}(k)).
Proof Modus Tollens on THEOR13, and THEOR9.
Diagrammatical interpretation. If the representation of h shades some k-virgin sector, the k&h-virgin field is less than the k-virgin one.
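Several of the THEORs above can be confirmed by brute force on a finite model. The sketch assumes, for illustration only, that pieces of information are sets of worlds left open and that μ_{k}(h) counts the worlds left open by k&h; all names are hypothetical.

```python
from itertools import combinations, product

# Assumption: counting measure on a set-of-open-worlds model.
WORLDS = frozenset(range(3))
PIECES = [frozenset(c) for r in range(4) for c in combinations(sorted(WORLDS), r)]
BOTTOM = frozenset()


def neg(h):
    return WORLDS - h


def mu(k, h):
    return len(k & h)


def implies(h1, h2):
    return (h1 & h2) == h1


def check_mu_theorems():
    for k, h1, h2 in product(PIECES, repeat=3):
        assert mu(k, k & h1) == mu(k, h1) == mu(k & h1, k & h1) == mu(k & h1, h1)  # THEOR1
        assert mu(k, neg(k)) == 0                                   # THEOR2
        assert mu(k & h1, h2) == mu(k & h2, h1)                     # THEOR3
        assert mu(k, h1 & h2) + mu(k, h1 & neg(h2)) == mu(k, h1)    # THEOR4
        assert mu(k, h1 & neg(h1)) == 0                             # THEOR5
        assert (mu(k, h1) == 0) == ((k & h1) == BOTTOM)             # THEOR7, THEOR8
        assert implies(k, h1) == (mu(k, h1) == mu(k, k))            # THEOR9, THEOR13
        assert 0 <= mu(k, h1) <= mu(k, k)                           # THEOR10
        assert mu(k, h1 & h2) <= mu(k, h2)                          # THEOR11
    return True


print(check_mu_theorems())
```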
7.10. The argument proposed in §6.10 can be re-proposed here. Since (AX9)
(7.vii) μ_{k}(~h) = μ_{k}(k) - μ_{k}(h)
and (THEOR4')
(7.viii) μ_{k}(h_{1}&h_{2}) = μ_{k}(h_{1}) - μ_{k&h1}(~h_{2})
give us the measures of negations and conjunctions, and since any propositional connective can be formulated in terms of negations and conjunctions, we can derive from (7.vii) and (7.viii) the measure of any piece of information adduced by a proper formula of the propositional calculus. Let me enter into some (useful) detail.
7.10.1. The three disjunctions informally introduced in §6.10 can be formally introduced by
(7.ix) (h_{1}∨h_{2}∨…∨h_{n}) = (~(~h_{1}&~h_{2}&…&~h_{n}))
for the inclusive disjunction OR, by
(h_{1}| h_{2}|...| h_{n}) = …
= (~(h_{1}&h_{2}) & ~(h_{1}&h_{3}) & ... & ~(h_{1}&h_{n}) & ~(h_{2}&h_{3}) &…& ~(h_{2}&h_{n}) & ... & ~(h_{n-1}&h_{n}))
for the exclusive disjunction NAND, and by
(h_{1}↓ h_{2}↓…↓ h_{n})=((h_{1}∨h_{2}∨...∨h_{n}) & (h_{1}| h_{2}|...| h_{n}))
for the partitive disjunction XOR. The last disjunction is called "partitive" because an n-uple h_{1}.... h_{n} constitutes a proper partition of the k-compatible eventualities iff the two conditions
(k&~h_{1}&~h_{2}&....&~h_{n}) =⊥
(stating that h_{1}.... h_{n} are inclusively disjoined) and
(~ (h_{i}= h_{j})) ⊃ (k&h_{i}&h_{j}=⊥)
(stating that h_{1}.... h_{n} are exclusively disjoined) are satisfied.
Accordingly "↓" could almost be read as a synthesis of "∨" and "|".
7.10.2. As for the measures of the disjunctions I limit myself to the following theorems.
THEOR15 μ_{k}(h_{1}∨h_{2}) = μ_{k}(h_{1}) + μ_{k}(h_{2}) - μ_{k}(h_{1}&h_{2})
Proof By (7.vii), (7.viii) and THEOR4,
μ_{k}(h_{1}∨h_{2}) = μ_{k}(~(~h_{1}&~h_{2})) = μ_{k}(k) - μ_{k}(~h_{1}&~h_{2})
= μ_{k}(k) - μ_{k}(~h_{1}) + μ_{k&~h1}(~~h_{2})
= μ_{k}(h_{1}) + μ_{k}(h_{2}&~h_{1})
= μ_{k}(h_{1}) + μ_{k}(h_{2}) - μ_{k}(h_{1}&h_{2})
THEOR16 μ_{k}(h_{1}| h_{2})= μ_{k}(~h_{1})+ μ_{k&h1}(~h_{2})
Proof μ_{k}(h_{1}| h_{2}) = μ_{k}(~(h_{1}&h_{2})) = μ_{k}(k) - μ_{k}(h_{1}&h_{2}). Then THEOR4 and AX9.
THEOR17 μ_{k}(h_{1}↓ h_{2}) = μ_{k}(h_{1}&~h_{2}) + μ_{k}(h_{2}&~h_{1})
Proof μ_{k}(h_{1}↓ h_{2}) = μ_{k}((h_{1}∨h_{2}) & (h_{1}|h_{2})) = μ_{k}(h_{1}∨h_{2}) - μ_{k&(h1∨h2)}(~(h_{1}|h_{2}))
= μ_{k}(h_{1}∨h_{2}) - μ_{k&(h1∨h2)}(h_{1}&h_{2})
= μ_{k}(h_{1}) + μ_{k&h2}(~h_{1}) - μ_{k&(h1∨h2)}(h_{1}) + μ_{k&(h1∨h2)&h1}(~h_{2})
= μ_{k}(h_{1}) + μ_{k&h2}(~h_{1}) - μ_{k}(h_{1}&(h_{1}∨h_{2})) + μ_{k&(h1∨h2)&h1}(~h_{2})
But, since h_{1}⊃(h_{1}∨h_{2}), μ_{k}(h_{1}&(h_{1}∨h_{2})) = μ_{k}(h_{1}) and μ_{k&(h1∨h2)&h1}(~h_{2}) = μ_{k&h1}(~h_{2}),
ergo μ_{k}(h_{1}↓ h_{2}) = μ_{k&h2}(~h_{1}) + μ_{k&h1}(~h_{2}) = μ_{k}(h_{1}&~h_{2}) + μ_{k}(h_{2}&~h_{1})
Equivalent formulation:
THEOR17' μ_{k}(h_{1}↓ h_{2}) = μ_{k}(h_{1})-μ_{k}(h_{1}&h_{2})+μ_{k}(h_{2})- μ_{k}(h_{1}&h_{2})
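The measures of the three binary disjunctions can be checked numerically. The sketch assumes, only as an illustration, that pieces of information are sets of worlds left open and that μ_{k}(h) counts the worlds left open by k&h; disj_or, disj_nand and disj_xor are hypothetical names for "∨", "|" and "↓" restricted to n = 2.

```python
from itertools import combinations, product

# Assumption: counting measure on a set-of-open-worlds model.
WORLDS = frozenset(range(3))
PIECES = [frozenset(c) for r in range(4) for c in combinations(sorted(WORLDS), r)]


def neg(h):
    return WORLDS - h


def mu(k, h):
    return len(k & h)


def disj_or(h1, h2):
    """Inclusive disjunction: ~(~h1 & ~h2)."""
    return neg(neg(h1) & neg(h2))


def disj_nand(h1, h2):
    """Exclusive disjunction in the author's sense: ~(h1 & h2)."""
    return neg(h1 & h2)


def disj_xor(h1, h2):
    """Partitive disjunction: (h1 ∨ h2) & (h1 | h2)."""
    return disj_or(h1, h2) & disj_nand(h1, h2)


def check_disjunction_measures():
    for k, h1, h2 in product(PIECES, repeat=3):
        assert mu(k, disj_or(h1, h2)) == mu(k, h1) + mu(k, h2) - mu(k, h1 & h2)        # THEOR15
        assert mu(k, disj_nand(h1, h2)) == mu(k, neg(h1)) + mu(k & h1, neg(h2))        # THEOR16
        assert mu(k, disj_xor(h1, h2)) == mu(k, h1 & neg(h2)) + mu(k, h2 & neg(h1))    # THEOR17
        assert mu(k, disj_xor(h1, h2)) == mu(k, h1) + mu(k, h2) - 2 * mu(k, h1 & h2)   # THEOR17'
    return True


print(check_disjunction_measures())
```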
THEOR18 μ_{k}(h_{1}↓ h_{2}↓…↓ h_{n}) = μ_{k}(h_{1}&~h_{2}&...&~h_{n}) +
+ μ_{k}(h_{2}&~h_{1}&...&~h_{n}) + ...+
+ μ_{k}(h_{n}&~h_{1}&~h_{2}&...&~h_{n-1})
THEOR19 If (k=(h_{1}↓ h_{2}↓…↓ h_{n})), then
μ_{k}(h_{1}↓ h_{2}↓…↓ h_{n}) = μ_{k}(h_{1}) + μ_{k}(h_{2}) + ... + μ_{k}(h_{n}) (Additivity)
Proof. If h_{1}... h_{n} are a proper partition of k
h_{1} = (~h_{2}&~h_{3}&...&~h_{n})
h_{2} = (~h_{1}&~h_{3}&...&~h_{n})
and so on, therefore
μ_{k}(h_{1}&~h_{2}&...&~h_{n}) = μ_{k}(h_{1})
μ_{k}(h_{2}&~h_{1}&...&~h_{n}) = μ_{k}(h_{2})
and so on; the additivity follows from THEOR18.
7.10.3. A dilemma is a partitive disjunction between two (opposite) alternatives. Therefore, with reference to a dilemma (n=2, h_{2}=~h_{1})
(7.x) μ_{k}(h_{1}∨h_{2}) = μ_{k}(h_{1}| h_{2}) = μ_{k}(h_{1}↓ h_{2}) = μ_{k}(h_{1}) + μ_{k}(h_{2}) = μ_{k}(k)
follows from THEOR15, THEOR16 and THEOR17. Indeed (7.x) can shed light upon some poor uses of the various disjunctions. An example:
if h_{1}∨ h_{2}∨…∨ h_{n} and h_{i} entails ~h_{j} for all i≠j, then...
is a habitual expression meaning what can be advantageously meant by
If h_{1}↓ h_{2}↓…↓ h_{n}, then...
since the condition that h_{i} entails ~h_{j} for all i≠j is exactly the condition that the n inclusively disjoined pieces of information are also exclusively disjoined (an example in the formulation of THEOR33 below).
7.11. The definition
(7.xi) (h_{1}→h_{2}) = ~( h_{1}&~ h_{2})
introduces the (obviously non-primitive) symbol "→" for the connections I call "pseudo-hypothetics" (pseudo-hypothetics will be exhaustively analyzed in Chapter 14, specifically destined to conditionals). And
THEOR20 μ_{k}(h_{1}→h_{2}) = μ_{k}(~h_{1}) + μ_{k}(h_{1}& h_{2})
(whose proof follows plainly from (7.xi)) gives us the measure of pseudo-hypothetics.
7.12. In §6.3.3 I affirmed that μ is strictly linked to the notion Waismann, Carnap and followers call "measure". In fact by
(7.xii) P(h|k) = μ_{k}(h)/ μ_{k}(k)
I define the probability of h given k. The main difference is that, contrary to them, conceiving the measure as an absolute or unconditional quantity, in my opinion, is an unsustainable thesis (§15.1.2).
Of course the real number that μ assigns to a h as to a k° depends also on the unit we choose for μ; I will deal with this marginal aspect in §7.15.
7.13. On the grounds of (7.xii) ~(k=⊥) becomes a strict condition in order to avoid μ_{k}(k)=0, that is a null denominator. Informally I remark that actually the same notion of probability would be senseless when referred to an incoherent statute.
Although the main stream of the contemporary orthodoxy starts from a monadic (absolute, unconditional) probability and (either by a definition or by an axiom) introduces the dyadic (relative, conditional) probability as a ratio between two monadic ones, in my opinion no monadic probability can exist, since the probability of a hypothesis depends intrinsically on the statute the same hypothesis is referred to. And the same (7.xii) makes evident such a claim.
Yet I am far from criticizing the distinction between prior and posterior probabilities. Such a distinction is a correct and essential achievement focusing on the variations determined upon the respective probabilistic values by increments of information. I am claiming that the same notion of an absolute probability is even more insensate than the already criticized notion of an absolute measure, because P would result dyadic even if μ were monadic. The trap, generally speaking, is that while an ‘unconditional' probability deals with only one statute (the prior), a ‘conditional' probability deals with two (the prior and the posterior); nevertheless this evident difference must not be mistaken for an untenable difference of valences. In both cases the function concerns two variables (hypothesis and statute), and only in the latter must two different values of the second variable be accounted for. Thus a sound formalization must always deal with a dyadic function P(h|k) and recognize that there are probability problems concerning single values of such a function, and probability problems concerning the two values corresponding to the prior and to the posterior statute (prior and posterior as for the increment of information, obviously). Symbolically: there are probability problems concerning
(7.xiii) P(h|k°)
and probability problems concerning the relation between
P(h|k°&k')
and (7.xiii).
7.13.1. Moreover, a purely dimensional consideration suggests that to define the conditional probability as a ratio between two absolute probabilities is a rather arrogant procedure. Its arrogance does not regard the legitimacy of defining the values of a dyadic function as a ratio between the values of two monadic functions; it does regard the legitimacy of identifying the thus defined dyadic function with the definiens one. I intend that while a definition like
B(x,y) = A(x)/A(y)
is a formally unexceptionable procedure, a definition like
A(x,y) = A(x)/A(y)
is at the very least an insidious one. A minute example. If A(x) is a monadic function assigning to every person x his/her wealth (computed, say, in US-dollars), the ratio A(x)/A(y) defines a new dyadic function whose values are no longer US-dollars, but pure numbers (for instance: since x is three times richer than y, the ratio between the wealth of x and the wealth of y is 3); therefore it would not be correct to use the same "A" to indicate this new function too. Of course the values of both conditional and unconditional probabilities are pure numbers, but this particularity might be only the mask of a logical abuse: in fact a probability value and a ratio between two probability values continue being two heterogeneous entities, and this heterogeneity cannot at all mirror the real situation, as "probability" keeps a common meaning both in its ‘unconditional' and in its ‘conditional' applications.
7.13.2. Perhaps someone might be tempted to object that the notion of monadic probability is not at all insensate since, for instance, when we roll a just purchased die, 1/6 is the absolute value of each outcome.
The reply is easy. This value is not absolute; it actually depends on the cognitive endowment implicitly transmitted by "(purchased) die", since statistical elements suggest that we interpret the word as "perfectly balanced cube". In fact, almost in their totality, the dice on sale are perfectly balanced cubes. But if the die has been purchased on line at www.Falsaria.com, a renowned firm specialized in the construction of deceitful dice, the probability of the various outcomes can be assigned only on the ground of a statute partitioning the space of possibilities in some non-uniform way. So 1/6 is only the conditional probability relative to a statistically privileged statute.
Dissembling the intrinsic relational nature of a quantity by implicit assumptions is an unsound yet frequent habit. For instance I am reading that the distance of Proxima Centauri is 4.2 light-years, but for sure my perfect understanding does not entail that the distance is a monadic notion, it simply means that it is a dyadic notion whose second term (our planet) is tacitly understood. And actually such elliptic ways of speaking are only fully legitimated where the context clearly privileges the second term.
7.14. The above considerations show that a Kolmogorov-style axiomatization is not acceptable; in fact
- it proposes the probability as a primitive notion
- it starts from a monadic absolute probability
- it grounds upon a set-theoretic approach whose exasperated extensionality and homonymy-blindness are quite unfit to account for our very gnosiology, often led by intensional processes.
7.15. As for the unit of measure,
(7.xiv) μ_{k }(k)=1
is the explicit choice complying with the implicit assumptions of the canonical approaches to the measure function. But I am afraid that (7.xiv) also represents the worst choice: in fact, under it, the same number expresses both μ_{k}(h) and P(h|k), a coincidence that seems to me an awkward help to possible mistakes between two quite distinct quantities.
Hajek (2003, §1) claims that the non-negativity and normalization axioms here resumed in
(7.xv) 0 ≤ P(h|k) ≤ 1
are largely a matter of convention. I disagree: what is largely a matter of convention is the choice of the μ-unit, but (7.xii) shows that this choice is irrelevant on the P-range, which always will respect (7.xv). In fact, while both μ_{k}(h)<0 and μ_{k}(h)>μ_{k}(k) are incoherent values, both μ_{k}(h)=0 and μ_{k}(h)=μ_{k}(k) are exactly the coherent values accounting for the two border cases (of a h respectively k-incompatible and k-implied).
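The irrelevance of the μ-unit for the range of P can be illustrated with a few lines of Python. The sketch assumes, as an illustration only, a counting measure multiplied by an arbitrary positive integer unit; the function names, the unit parameter and the die example are hypothetical choices, not part of the text.

```python
from fractions import Fraction


def mu(k, h, unit=1):
    """Assumed toy measure: `unit` times the number of worlds left open by k & h."""
    return unit * len(k & h)


def prob(h, k, unit=1):
    """(7.xii): P(h|k) = mu_k(h) / mu_k(k); undefined for an incoherent statute."""
    if not k:
        raise ValueError("incoherent statute: mu_k(k) = 0")
    return Fraction(mu(k, h, unit), mu(k, k, unit))


die = frozenset(range(1, 7))    # statute: the outcome of a six-faced die
even = frozenset({2, 4, 6})     # hypothesis: an even outcome

# Changing the unit rescales mu but leaves P untouched.
print(mu(die, even, unit=1), mu(die, even, unit=10))      # 3 30
print(prob(even, die, unit=1), prob(even, die, unit=10))  # 1/2 1/2
```

The ValueError branch mirrors the condition ~(k=⊥) of §7.13: with an incoherent statute the denominator of (7.xii) would be null.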
7.16. The well known probability theorems are derivable by the simple application of (7.xii) to the respective axioms and theorems for μ. In particular
THEOR21 P(h|k) + P(~h|k) = 1
follows directly from AX9,
THEOR22 0≤P(h|k)≤1
follows directly from THEOR10,
THEOR23 If h_{1}&h_{2}=⊥ then P(h_{1}∨ h_{2}|k)=P(h_{1}|k)+P(h_{2}|k)
follows directly from THEOR15.
Finally, in order to adequate my formulae to the current ones, let me use "e" as a new variable ranging over ‘evidences' (that is, over acquirements increasing a basic statute k°):
THEOR24 P(h_{1}|k°&e) = P(h_{1}&e|k°) / P(e|k°)
Proof. To substitute k with k°&e in (7.xii), to divide numerator and denominator by μ_{k°}(k°) and to simplify.
These four theorems express in dyadic notation the four axioms upon which the usual theories are normally based (Howson and Urbach 2006, §2a); thence the usual theorems could be considered as already proved. Yet I carried out the task of deriving them again not only because of the new dyadic notation, but also because of the formal compromises affecting some current proofs. For example
AXIOM P(t)=1 if t is a logical truth
THEOREM P(⊥)=0
Proof ~⊥ is a logical truth, hence...
(ibidem, §2.b, (6)) seems to me a rather rough argument. No doubt that ~⊥ is a logical truth, but this means only that a theorization entailing such a conclusion is admissible. A formal proof must start from some axiomatic formula and must transform it into the theoremic one with only the help of the inference rule(s); as such, maintaining the example, no proof can appeal to the notion of logical truth before its formal definition within the system, and no expression can be assumed as a logical truth before its formal derivation.
7.16.1. Other useful probability theorems are
THEOR25 P(h_{1}&h_{2}|k°&e) = P(h_{1}|k°&e)P(h_{2}|k°&e&h_{1})
Proof P(h_{1}& h_{2}|k°&e) = μ_{k°&e}(h_{1}&h_{2}) / μ_{k°&e}(k°&e)
= (μ_{k°&e}(h_{1}&h_{2})/μ_{k°&e}(h_{1}))(μ_{k°&e}(h_{1})/ μ_{k°&e}(k°&e))
= (μ_{k°&e&h1}(h_{2})/ μ_{k°&e&h1}(k°&e&h_{1}))(μ_{k°&e}(h_{1})/ μ_{k°&e}(k°&e))
= P(h_{1}|k°&e)P(h_{2}|k°&e&h_{1})
COROLLARY 25 P(h_{1}&h_{2}|k) = P(h_{1}|k)P(h_{2}|k& h_{1})
THEOR26 P(h_{1}∨ h_{2}|k) = P(h_{1}|k) + P(h_{2}|k) – P((h_{1}&h_{2})|k)
Proof THEOR15 and (7.xii)
THEOR27 If (k=(h_{1}↓ h_{2}↓...↓ h_{n})) then P(h_{1}↓ h_{2}↓...↓ h_{n}|k) = P(h_{1}|k) + P(h_{2}|k) +...+ P(h_{n}|k) = 1
Proof THEOR19 and (7.xii).
THEOR28
If (~ (k&h_{1}=⊥) & (h_{1}⊃ h_{2}) & μ_{k}(h_{2})< μ_{k}(k)),
then P(h_{1}|k&h_{2})>P(h_{1}|k)
Proof AX8: μ_{k}(h_{1})>0; P(h_{1}|k&h_{2})= μ_{k&h2}(h_{1}) / μ_{k&h2}(k&h_{2}) = μ_{k}(h_{1}&h_{2})/ μ_{k}(h_{2}) =
= μ_{k}(h_{1})/μ_{k}(h_{2}) > μ_{k}(h_{1})/μ_{k}(k).
THEOR28 is a milestone along the way to inductive inference, since it states that if a coherent h_{1} entails a consequence h_{2} not k-entailed then the acquirement of h_{2} increases the probability of (validates) h_{1} given k.
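A minimal numerical instance of THEOR28 may help. The sketch assumes a counting measure over the six outcomes of a die, which is an illustrative choice and not part of the text; k, h1 and h2 are hypothetical example values.

```python
from fractions import Fraction


def prob(h, k):
    """(7.xii) with an assumed counting measure: P(h|k) = |k & h| / |k|."""
    if not k:
        raise ValueError("incoherent statute")
    return Fraction(len(k & h), len(k))


k = frozenset(range(1, 7))      # statute: the outcome of a six-faced die
h1 = frozenset({6})             # hypothesis: a six
h2 = frozenset({2, 4, 6})       # consequence of h1: an even outcome

assert (k & h1) != frozenset()  # h1 is k-coherent
assert (h1 & h2) == h1          # h1 implies h2, per (7.v)
assert len(k & h2) < len(k)     # h2 is not k-entailed

prior = prob(h1, k)             # probability of a six before learning h2
posterior = prob(h1, k & h2)    # probability of a six once h2 is acquired
print(prior, posterior)         # 1/6 1/3
```

Learning that the outcome is even raises the probability of a six from 1/6 to 1/3, as the theorem requires.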
THEOR29
If (P(h_{1}&h_{2}|k)=1), then (P(h_{1}|k)=1)
Proof From THEOR26 and THEOR22.
THEOR30
If (P(h_{1}|k)=0), then (P(h_{1}&h_{2}|k)=0)
THEOR31
((μ_{k°}(k°)= Σ_{j} μ_{k°}(h_{j})) ⊃ (μ_{k°&e}(k°&e)= Σ_{j} μ_{k°&e}(h_{j})))
(a proper partition of k° is also a proper partition of k°&e)
Proof From THEOR19 (k° = h_{1}↓ h_{2}↓…↓ h_{n}) ⊃ (μ_{k°}(k°)= Σ_{j} μ_{k°}(h_{j}))
THEOR32 (k°= h_{1}↓ h_{2}↓…↓ h_{n}) ⊃ (Σ_{j} P(h_{j}|k°&e)=1)
THEOR33
(k° = h_{1}↓ h_{2}↓…↓ h_{n}) ⊃
⊃ (P(h_{j}|k°&e) = (P(h_{j}|k°)P(e|k°&h_{j})) / (Σ_{i} P(h_{i}|k°)P(e|k°&h_{i})))
(THEOR33 is Bayes's Theorem in its complete formulation).
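THEOR33 can be compared with the direct definition (7.xii) on a small example. The sketch assumes a counting measure over die outcomes, and a partition and an evidence chosen only for illustration; prob, bayes, k0, partition and e are hypothetical names and values.

```python
from fractions import Fraction


def prob(h, k):
    """(7.xii) with an assumed counting measure: P(h|k) = |k & h| / |k|."""
    if not k:
        raise ValueError("incoherent statute")
    return Fraction(len(k & h), len(k))


def bayes(j, partition, k0, e):
    """Right-hand side of THEOR33 for the j-th member of the partition."""
    numer = prob(partition[j], k0) * prob(e, k0 & partition[j])
    denom = sum(prob(h, k0) * prob(e, k0 & h) for h in partition)
    return numer / denom


k0 = frozenset(range(1, 7))                                  # basic statute
partition = [frozenset({1, 2}), frozenset({3, 4, 5}), frozenset({6})]
e = frozenset({2, 3, 6})                                     # hypothetical evidence

for j, h in enumerate(partition):
    # Bayes's formula agrees with conditioning directly on k0 & e.
    assert bayes(j, partition, k0, e) == prob(h, k0 & e)

posteriors = [bayes(j, partition, k0, e) for j in range(len(partition))]
print(posteriors, sum(posteriors))
```

The final sum equals 1, in agreement with THEOR32.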