Add Topic: Probability (#13)

Add the file Topics.Probability with some simple exercises about probability theory, independence of sets, definition of conditional probability and Bayes' theorem.
PatrickMassot · Mar 7, 2025 · 54f331f · 54f331f
1 parent 5d4279d
commit 54f331f
Show file tree

Hide file tree

Showing 11 changed files with 371 additions and 6 deletions.
diff --git a/GlimpseOfLean.lean b/GlimpseOfLean.lean
@@ -9,3 +9,4 @@ import GlimpseOfLean.Solutions.Topics.ChineseRemainder
 import GlimpseOfLean.Solutions.Topics.GaloisAdjunctions
 import GlimpseOfLean.Solutions.Topics.ClassicalPropositionalLogic
 import GlimpseOfLean.Solutions.Topics.IntuitionisticPropositionalLogic
+import GlimpseOfLean.Solutions.Topics.Probability
diff --git a/GlimpseOfLean/Exercises/01Rewriting.lean b/GlimpseOfLean/Exercises/01Rewriting.lean
@@ -210,4 +210,3 @@ a Lean proof looks like and have learned about the following tactics:
 * `exact`
 * `calc`
 -/
-
diff --git a/GlimpseOfLean/Exercises/02Iff.lean b/GlimpseOfLean/Exercises/02Iff.lean
@@ -190,4 +190,3 @@ equivalences. You learned about tactics:
 * `have`
 * `constructor`
 -/
-
diff --git a/GlimpseOfLean/Exercises/03Forall.lean b/GlimpseOfLean/Exercises/03Forall.lean
@@ -209,9 +209,10 @@ You can start with specialized files in the `Topics` folder. You have choice bet
   and do intuitionistic propositional logic.
 * `SequenceLimit` (easier, math) if you want to do some elementary calculus.
   For this file it is recommended to do `04Exists` first.
+* `Probability` (easier, math) if you want to work with probability measures,
+  independent sets, and conditional probability, including Bayes' Theorem.
 * `GaloisAdjunctions` (harder, math) if you want some more abstraction
   and learn how to prove things about adjunctions between complete lattices.
   It ends with a constructor of the product topology and its universal property
   manipulating as few open sets as possible.
 -/
-
diff --git a/GlimpseOfLean/Exercises/Topics/GaloisAdjunctions.lean b/GlimpseOfLean/Exercises/Topics/GaloisAdjunctions.lean
@@ -519,4 +519,3 @@ lemma push_generate (f : G →* G') : push f ∘ generate = generate ∘ (Set.im
 
 end Subgroups
 end Tutorial
-
diff --git a/GlimpseOfLean/Exercises/Topics/Probability.lean b/GlimpseOfLean/Exercises/Topics/Probability.lean
@@ -0,0 +1,142 @@
+import Mathlib.Probability.Notation
+import GlimpseOfLean.Library.Probability
+set_option linter.unusedSectionVars false
+set_option autoImplicit false
+set_option linter.unusedTactic false
+noncomputable section
+open scoped ENNReal
+/-
+
+# Probability measures, independent sets
+
+We introduce a probability space and events (measurable sets) on that space. We then define
+independence of events and conditional probability, and prove results relating those two notions.
+
+Mathlib has a (different) definition for independence of sets and also has a conditional measure
+given a set. Here we practice instead on simple new definitions to apply the tactics introduced in
+the previous files.
+-/
+
+/- We open namespaces. The effect is that after that command, we can call lemmas in those namespaces
+without their namespace prefix: for example, we can write `inter_comm` instead of `Set.inter_comm`.
+Hover over `open` if you want to learn more. -/
+open MeasureTheory ProbabilityTheory Set
+
+/- We define a measure space `Ω`: the `MeasureSpace Ω` variable states that `Ω` is a measurable
+space on which there is a canonical measure `volume`, with notation `ℙ`.
+We then state that `ℙ` is a probability measure. That is, `ℙ univ = 1`, where `univ : Set Ω` is the
+universal set in `Ω` (the set that contains all `x : Ω`). -/
+variable {Ω : Type} [MeasureSpace Ω] [IsProbabilityMeasure (ℙ : Measure Ω)]
+
+-- `A, B` will denote sets in `Ω`.
+variable {A B : Set Ω}
+
+/- One can take the measure of a set `A`: `ℙ A : ℝ≥0∞`.
+`ℝ≥0∞`, or `ENNReal`, is the type of extended non-negative real numbers, which contain `∞`.
+Measures can in general take infinite values, but since our `ℙ` is a probability measure,
+it actually takes only values up to 1.
+`simp` knows that a probability measure is finite and will use the lemmas `measure_ne_top`
+or `measure_lt_top` to prove that `ℙ A ≠ ∞` or `ℙ A < ∞`.
+
+Hint: use `#check measure_ne_top` to see what that lemma does.
+
+The operations on `ENNReal` are not as nicely behaved as on `ℝ`: `ENNReal` is not a ring and
+subtraction truncates to zero for example. If you find that lemma `lemma_name` used to transform
+an equation does not apply to `ENNReal`, try to find a lemma named something like
+`ENNReal.lemma_name_of_something` and use that instead. -/
+
+/-- Two sets `A, B` are independent for the ambient probability measure `ℙ` if
+`ℙ (A ∩ B) = ℙ A * ℙ B`. -/
+def IndepSet (A B : Set Ω) : Prop := ℙ (A ∩ B) = ℙ A * ℙ B
+
+/-- If `A` is independent of `B`, then `B` is independent of `A`. -/
+lemma IndepSet.symm : IndepSet A B → IndepSet B A := by {
+  sorry
+}
+
+/- Many lemmas in measure theory require sets to be measurable (`MeasurableSet A`).
+If you are presented with a goal like `⊢ MeasurableSet (A ∩ B)`, try the `measurability` tactic.
+That tactic produces measurability proofs. -/
+
+-- Hints: `compl_eq_univ_diff`, `measure_diff`, `inter_univ`, `measure_compl`, `ENNReal.mul_sub`
+lemma IndepSet.compl_right (hA : MeasurableSet A) (hB : MeasurableSet B) :
+    IndepSet A B → IndepSet A Bᶜ := by {
+  sorry
+}
+
+/- Apply `IndepSet.compl_right` to prove this generalization. It is good practice to add the iff
+version of some frequently used lemmas, this allows us to use them inside `rw` tactics. -/
+lemma IndepSet.compl_right_iff (hA : MeasurableSet A) (hB : MeasurableSet B) :
+    IndepSet A Bᶜ ↔ IndepSet A B := by {
+  sorry
+}
+
+-- Use what you have proved so far
+lemma IndepSet.compl_left (hA : MeasurableSet A) (hB : MeasurableSet B) (h : IndepSet A B) :
+    IndepSet Aᶜ B := by{
+  sorry
+}
+
+/- Can you write and prove a lemma `IndepSet.compl_left_iff`, following the examples above?-/
+
+-- Your lemma here
+
+-- Hint: `ENNReal.mul_self_eq_self_iff`
+lemma indep_self (h : IndepSet A A) : ℙ A = 0 ∨ ℙ A = 1 := by {
+  sorry
+}
+
+/-
+
+### Conditional probability
+
+-/
+
+/-- The probability of set `A` conditioned on `B`. -/
+def condProb (A B : Set Ω) : ENNReal := ℙ (A ∩ B) / ℙ B
+
+/- We define a notation for `condProb A B` that makes it look more like paper math. -/
+notation3 "ℙ("A"|"B")" => condProb A B
+
+/- Now that we have defined `condProb`, we want to use it, but Lean knows nothing about it.
+We could start every proof with `rw [condProb]`, but it is more convenient to write lemmas about the
+properties of `condProb` first and then use those. -/
+
+-- Hint : `measure_inter_null_of_null_left`
+@[simp] -- this makes the lemma usable by `simp`
+lemma condProb_zero_left (A B : Set Ω) (hA : ℙ A = 0) : ℙ(A|B) = 0 := by {
+  sorry
+}
+
+@[simp]
+lemma condProb_zero_right (A B : Set Ω) (hB : ℙ B = 0) : ℙ(A|B) = 0 := by {
+  sorry
+}
+
+/- What other basic lemmas could be useful? Are there other "special" sets for which `condProb`
+takes known values? -/
+
+-- Your lemma(s) here
+
+/- The next statement is a `theorem` and not a `lemma`, because we think it is important.
+There is no functional difference between those two keywords. -/
+
+/-- **Bayes Theorem** -/
+theorem bayesTheorem (hA : ℙ A ≠ 0) (hB : ℙ B ≠ 0) : ℙ(A|B) = ℙ A * ℙ(B|A) / ℙ B := by {
+  sorry
+}
+
+/- Did you really need all those hypotheses?
+In Lean, division by zero follows the convention that `a/0 = 0` for all a. This means we can prove
+Bayes' Theorem without requiring `ℙ A ≠ 0` and `ℙ B ≠ 0`. However, this is a quirk of the
+formalization rather than the standard mathematical statement.
+If you want to know more about how division works in Lean, try to hover over `/` with your mouse. -/
+
+theorem bayesTheorem' (A B : Set Ω) : ℙ(A|B) = ℙ A * ℙ(B|A) / ℙ B := by {
+  sorry
+}
+
+lemma condProb_of_indepSet (h : IndepSet B A) (hB : ℙ B ≠ 0) : ℙ(A|B) = ℙ A := by {
+  sorry
+}
+
diff --git a/GlimpseOfLean/Library/Basic.lean b/GlimpseOfLean/Library/Basic.lean
@@ -56,4 +56,3 @@ def my : Delab := do
   let stx ← delab args[0]!
   let e := mkIdent `exp
   `(term|$e $stx)
-
diff --git a/GlimpseOfLean/Library/Probability.lean b/GlimpseOfLean/Library/Probability.lean
@@ -0,0 +1,14 @@
+import Mathlib.Data.ENNReal.Operations
+
+lemma ENNReal.mul_self_eq_self_iff (a : ENNReal) : a * a = a ↔ a = 0 ∨ a = 1 ∨ a = ⊤ := by
+  by_cases h : a = 0
+  · simp [h]
+  by_cases h' : a = ⊤
+  · simp [h']
+  simp only [h, h', or_false, false_or]
+  constructor
+  · intro h''
+    nth_rw 3 [← one_mul a] at h''
+    exact (ENNReal.mul_left_inj h h').mp h''
+  · intro h''
+    simp [h'']
diff --git a/GlimpseOfLean/Solutions/03Forall.lean b/GlimpseOfLean/Solutions/03Forall.lean
@@ -224,6 +224,8 @@ You can start with specialized files in the `Topics` folder. You have choice bet
   and do intuitionistic propositional logic.
 * `SequenceLimit` (easier, math) if you want to do some elementary calculus.
   For this file it is recommended to do `04Exists` first.
+* `Probability` (easier, math) if you want to work with probability measures,
+  independent sets, and conditional probability, including Bayes' Theorem.
 * `GaloisAdjunctions` (harder, math) if you want some more abstraction
   and learn how to prove things about adjunctions between complete lattices.
   It ends with a constructor of the product topology and its universal property
Original file line number	Diff line number	Diff line change
Expand Up		@@ -210,4 +210,3 @@ a Lean proof looks like and have learned about the following tactics:
		* `exact`
		* `calc`
		-/
Original file line number	Diff line number	Diff line change
Expand Up		@@ -190,4 +190,3 @@ equivalences. You learned about tactics:
		* `have`
		* `constructor`
		-/
Original file line number	Diff line number	Diff line change
Expand Up		@@ -519,4 +519,3 @@ lemma push_generate (f : G →* G') : push f ∘ generate = generate ∘ (Set.im

		end Subgroups
		end Tutorial
Original file line number	Diff line number	Diff line change
Expand Up		@@ -56,4 +56,3 @@ def my : Delab := do
		let stx ← delab args[0]!
		let e := mkIdent `exp
		`(term\|$e $stx)