Commit 87ecc06

update
1 parent 4c18eb8 commit 87ecc06

8 files changed (+120 -12 lines changed)

wiki/tiddlers/$__StoryList.tid

+3-3
@@ -1,5 +1,5 @@
-created: 20200909030202412
-list: [[Adversarial Examples]] [[MSR Adversarial Machine Learning]] [[Waymo Open Dataset]] [[Welcome Page]] TableOfContents
-modified: 20200929025750948
+created: 20201105073859039
+list: [[Statistical Rethinking]] [[Welcome Page]] TableOfContents
+modified: 20201106044927092
 title: $:/StoryList
 type: text/vnd.tiddlywiki

wiki/tiddlers/Bayesian.tid

+2-1
@@ -1,5 +1,5 @@
 created: 20161102054547779
-modified: 20200417025214337
+modified: 20201104063335825
 title: Bayesian
 type: text/vnd.tiddlywiki
 
@@ -10,6 +10,7 @@ type: text/vnd.tiddlywiki
 !! Textbooks
 
 * [[Bayesian Choice]]
+* [[Statistical Rethinking]]
 
 ! Applications
 * [[Bayesian information criterion]]

wiki/tiddlers/Mirrors.tid

+11
@@ -0,0 +1,11 @@
+created: 20201102072022959
+modified: 20201104080627011
+tags: Tools
+title: Mirrors
+type: text/vnd.tiddlywiki
+
+https://mirrors.bfsu.edu.cn/
+
+!! Pip
+
+`pip install -i https://mirrors.bfsu.edu.cn/pypi/web/simple some-package`
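To make the mirror the default so a plain `pip install` uses it (a sketch using pip's own `config` subcommand; the mirror URL is the one from the tiddler above):

```shell
# persist the index URL so every `pip install` goes through the mirror
pip config set global.index-url https://mirrors.bfsu.edu.cn/pypi/web/simple
# inspect the resulting configuration
pip config list
```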

wiki/tiddlers/Statistical Learning Theory.tid

+81-5
@@ -1,5 +1,5 @@
 created: 20200508052436114
-modified: 20200512115152389
+modified: 20201106030038265
 tags: Tutorials NIPS19
 title: Statistical Learning Theory
 type: text/vnd.tiddlywiki
@@ -8,7 +8,7 @@ type: text/vnd.tiddlywiki
 
 !! First Generation SLT
 
-For one fixed (non data-dependent) $$h$$:
+Empirical Risk: For one fixed (non data-dependent) $$h$$:
 
 $$
 \mathbb E[R_{in}(h)] = \mathbb E[\frac1m\sum_{i=1}^m l(h(X_i), Y_i)] = R_{out}(h)
@@ -21,10 +21,86 @@ $$
 \mathbf P^m[\Delta(h)>\epsilon]\le \exp(-2m\epsilon^2) = \delta
 $$
 
-$$\delta$$ is the confidence. With probability $$\ge 1-\delta$$
+$$\delta$$ is the confidence. With probability $$\ge 1-\delta$$.
+
+Theoretical Risk:
 
 $$
-R_{out}(h)\le\R_{in}(h) +\sqrt{\frac{1}{2m}\log(\frac{1}{\delta})}
+R_{out}(h)\le R_{in}(h) +\sqrt{\frac{1}{2m}\log(\frac{1}{\delta})}
$$
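The bound in the added lines is easy to sanity-check by simulation; a minimal sketch, assuming a single fixed hypothesis with 0-1 loss and an illustrative true risk of 0.3 (all numbers here are hypothetical, not from the notes):

```python
import numpy as np

def hoeffding_slack(m, delta):
    """Deviation term of the bound: sqrt(log(1/delta) / (2m))."""
    return np.sqrt(np.log(1.0 / delta) / (2.0 * m))

rng = np.random.default_rng(0)
m, delta = 1000, 0.05
r_out = 0.3          # assumed true risk of one fixed hypothesis h
trials = 2000        # number of independent samples S of size m

# For a 0-1 loss, R_in(h) is a mean of m Bernoulli(r_out) losses
r_in = rng.binomial(m, r_out, size=trials) / m

# How often does R_out exceed R_in + slack? Should be (well) below delta.
violations = float(np.mean(r_out > r_in + hoeffding_slack(m, delta)))
print(violations)
```

Note the bound is one-sided and holds for a hypothesis fixed before seeing the data; the rest of the notes deal with making it uniform over a class.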
 
-!! Finite function class
+!! Finite function class
+
+* Structural Risk Minimization
+* VC dimension, Rademacher complexity
+
+[[IAS seminar|https://www.bilibili.com/video/BV14541187Ln]]
+
+!! PAC-Bayes framework (Generalised Bayes)
+
+* Before data, fix a distribution $$P\in M_1(\mathcal H)$$ "prior"
+* Based on data, learn a distribution $$Q\in M_1(\mathcal H)$$ "posterior"
+* Predictions:
+** draw $$h\sim Q$$ and predict with the chosen $$h$$.
+** each prediction uses a fresh random draw.
+
+The @@color:red;risk measures@@ $$R_{in}(h)$$ and $$R_{out}(h)$$ are @@color:red;extended by averaging@@:
+
+$$
+R(Q) \equiv \int_{\mathcal H}R(h)dQ(h)
+$$
+
+!!! PAC-Bayes vs Bayesian learning
+
+* Prior
+** PAC-Bayes: bounds hold for any distribution
+** Bayes: prior choice impacts inference
+* Posterior
+** PAC-Bayes: bounds hold for any distribution
+** Bayes: posterior uniquely defined by prior and statistical model
+* Data distribution
+** PAC-Bayes: bounds hold for any distribution
+** Bayes: randomness lies in the noise model generating the output
+
+!!! A General PAC-Bayesian Theorem
+$$\Delta$$-function: "distance" between $$R_{in}(Q)$$ and $$R_{out}(Q)$$
+
+Convex function $$\Delta: [0,1] \times [0,1]\rightarrow \mathbb R$$
+
+For any distribution $$D$$ on $$\mathcal X\times \mathcal Y$$, for any set $$\mathcal H$$ of voters, for any distribution $$P$$ on $$\mathcal H$$, for any $$\delta\in[0, 1]$$, and for any $$\Delta$$-function, we have, with probability at least $$1-\delta$$ over the choice of $$S\sim D^m$$,
+
+$$
+\forall Q \text{ on } \mathcal H: \Delta(R_{in}(Q), R_{out}(Q))\le\frac1m[KL(Q\|P)+\ln\frac{\mathcal J_\Delta(m)}{\delta}]
+$$
+
+Proof: Change of measure inequality and Markov's inequality
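The right-hand side of the theorem is straightforward to evaluate for a finite voter set; a sketch, assuming a hypothetical 4-voter class and taking $$\mathcal J_\Delta(m) = 2\sqrt m$$ (a common choice for the binary-KL $$\Delta$$; an assumption here, not stated in the notes):

```python
import numpy as np

def kl_div(q, p):
    """KL(Q || P) between discrete distributions over the same finite H."""
    q, p = np.asarray(q, float), np.asarray(p, float)
    mask = q > 0                     # 0 * log(0) treated as 0
    return float(np.sum(q[mask] * np.log(q[mask] / p[mask])))

def pac_bayes_rhs(q, p, m, delta, j_delta):
    """Right-hand side (1/m) [KL(Q||P) + ln(J_Delta(m) / delta)]."""
    return (kl_div(q, p) + np.log(j_delta(m) / delta)) / m

# hypothetical finite class of 4 voters
p = [0.25, 0.25, 0.25, 0.25]   # prior fixed before seeing the data
q = [0.70, 0.10, 0.10, 0.10]   # posterior concentrated on one voter
m, delta = 5000, 0.05
rhs = pac_bayes_rhs(q, p, m, delta, lambda m: 2.0 * np.sqrt(m))
print(rhs)
```

The KL term grows as the posterior moves away from the prior, while the $$\ln(\mathcal J_\Delta(m)/\delta)$$ term is independent of $$Q$$, which is why the bound holds simultaneously for all posteriors.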
+
+!! Linear classifiers
+
+* choose prior and posterior to be Gaussians
+* $$P$$ centered at the origin
+* $$Q\sim \mathcal N(\mathbf w, \mu)$$
+
+Linear classifiers' performance may be bounded by:
+$$
+KL(\hat Q_S(\mathbf w, \mu)\|Q_D(\mathbf w, \mu))\le\frac1m ( KL(P\|Q(\mathbf w, \mu)) +\ln\frac{m+1}{\delta})
+$$
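With this Gaussian choice the $$KL(P\|Q)$$ term has a closed form; a minimal sketch, assuming unit-variance isotropic Gaussians so the KL reduces to half the squared distance between the means (the weight vector and scale below are illustrative, not from the talk):

```python
import numpy as np

def kl_gauss_isotropic(mean_q, mean_p):
    """KL(N(mean_q, I) || N(mean_p, I)) = ||mean_q - mean_p||^2 / 2."""
    d = np.asarray(mean_q, float) - np.asarray(mean_p, float)
    return 0.5 * float(d @ d)

w = np.array([1.0, -2.0, 0.5])        # hypothetical learned weight vector
mu = 3.0                              # posterior mean scale along w
mean_q = mu * w / np.linalg.norm(w)   # posterior centred on the w direction
kl = kl_gauss_isotropic(mean_q, np.zeros(3))
print(kl)   # = mu**2 / 2
```

Since only the distance of the posterior mean from the origin enters, the bound tightens for solutions that need less displacement from the prior.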
+
+!!! Data- or distribution-dependent priors
+
+* using part of the data to learn the prior for SVMs
+* defining the prior in terms of the data-generating distribution (aka localised PAC-Bayes)
+
+$$\eta$$-Prior SVM for the case where the VC dimension argument does not work
+
+* @@color:green;Bounds are tight@@
+* @@color:green;Model selection from the bounds is as good as 10FCV@@
+* @@color:red;The better bounds do not appear to give better model selection@@
+
+!! Performance of deep NNs
+
+* For SVMs we can think of the margin as capturing an accuracy with which we need to estimate the weights
+* If we have a deep network solution with a wide basin of good performance we can take a similar approach using PAC-Bayes with a broad posterior around the solution
+* (Dziugaite and Roy + Neyshabur) have derived some of the tightest deep learning bounds in this way
+** by training to expand the basin of attraction
+** hence not measuring good generalisation of normal training
wiki/tiddlers/Statistical Rethinking.tid

+8
@@ -0,0 +1,8 @@
+created: 20201104063347569
+modified: 20201104080641257
+tags: Bayesian
+title: Statistical Rethinking
+type: text/vnd.tiddlywiki
+
+* numpyro implementation: https://fehiepsi.github.io/rethinking-numpyro
+* videos: https://www.bilibili.com/video/BV1ya411A7ih

wiki/tiddlers/Tools.tid

+4-2
@@ -1,6 +1,6 @@
 color: #501464
 created: 20141010102453511
-modified: 20200214060205626
+modified: 20201104082459801
 tags: Programming
 title: Tools
 type: text/vnd.tiddlywiki
@@ -16,4 +16,6 @@ type: text/vnd.tiddlywiki
 * [[LaTeX]]
 * [[Magit]]
 * [[Dot Files]]
-* [[Kitty]]
+* [[Kitty]]
+* [[Mirrors]]
+* [[Websites]]

wiki/tiddlers/Transformer.tid

+4-1
@@ -1,5 +1,5 @@
 created: 20181106063411015
-modified: 20200224020133162
+modified: 20201105073443693
 tags: [[Sequential Models]]
 title: Transformer
 type: text/vnd.tiddlywiki
@@ -8,5 +8,8 @@ type: text/vnd.tiddlywiki
 * [[Transformer XL]]
 * [[Compressed Transformer]]
 
+! Vision
+* ViT (pytorch repo: https://github.com/jeonsworld/ViT-pytorch)
+
 ! Reinforcement learning
 * [[Iterated Amplification]]

wiki/tiddlers/Websites.tid

+7
@@ -0,0 +1,7 @@
+created: 20201104082531935
+modified: 20201104082541228
+tags: Tools
+title: Websites
+type: text/vnd.tiddlywiki
+
+* eBooks: https://b-ok.global/
