Jensen's inequality
From Wikimization
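By definition <math>\,\phi\,</math> is convex if and only if
<math>\phi(ta + (1-t)b)\,\leq\,t \phi(a) + (1-t) \phi(b)</math>
whenever <math>\,0 \leq t \leq 1\,</math> and <math>\,a\,,\,b\,</math> are in the domain of <math>\,\phi\,</math>.
It follows by induction on <math>\,n\,</math> that if <math>\,t_j \geq 0\,</math> for <math>\,j = 1,\,2\,\ldots\,n\,</math>, <math>\,\sum t_j = 1\,</math>, and each <math>\,a_j\,</math> is in the domain of <math>\,\phi\,</math>, then
<br><math>\phi(\sum t_j a_j) \leq \sum t_j \phi(a_j) </math> (1)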
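As a quick numerical sanity check of (1), the sketch below evaluates both sides for a convex function and randomly chosen weights. The helper name `jensen_gap`, the choice <math>\,\phi = \exp\,</math>, and the sample points are illustrative assumptions, not part of the original argument.

```python
import math
import random

def jensen_gap(phi, weights, points):
    """Return sum(t_j * phi(a_j)) - phi(sum(t_j * a_j)).

    For convex phi, nonnegative weights summing to 1, and points in the
    domain of phi, inequality (1) says this gap is >= 0.
    """
    mean_point = sum(t * a for t, a in zip(weights, points))
    mean_value = sum(t * phi(a) for t, a in zip(weights, points))
    return mean_value - phi(mean_point)

random.seed(0)  # fixed seed so the run is repeatable
raw = [random.random() for _ in range(5)]
weights = [r / sum(raw) for r in raw]  # t_j >= 0 and sum(t_j) = 1
points = [random.uniform(-2.0, 2.0) for _ in range(5)]  # a_j (arbitrary choice)

gap = jensen_gap(math.exp, weights, points)
print(gap >= 0)
```

Normalizing the weights matters: without <math>\,\sum t_j = 1\,</math> the two sides of (1) are not comparable in general.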
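<br>Jensen's inequality says this: <br>If <math>\,\mu\,</math> is a probability
measure on <math>\,X\,,\,</math> <br><math>\,f\,</math> is a real-valued function on <math>\,X\,,\,</math>
<br><math>\,f\,</math> is integrable, and <br><math>\,\phi\,</math> is convex on the range
of <math>\,f\,</math> then
<br><math>\phi(\int f) \leq \int \phi \circ f</math> (2)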
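<br>Proof 1: By some limiting argument we can assume
that <math>\,f\,</math> is simple. (This limiting argument is a detail missing from this proof...)
<br>That is, <math>\,X\,</math> is the disjoint union of <math>\,X_1 \,\ldots\, X_n\,</math>
and <math>\,f\,</math> is constant on each <math>\,X_j\,.</math>
Say <math>\,t_j=\mu(X_j)\,</math> and <math>\,a_j\,</math> is the value of <math>\,f\,</math> on <math>\,X_j\,.</math>
Then <math>\,\sum t_j = \mu(X) = 1\,,</math> <math>\,\int f = \sum t_j a_j\,</math> and <math>\,\int \phi \circ f = \sum t_j \phi(a_j)\,,</math>
so (1) and (2) say exactly the same thing. QED.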
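The reduction in Proof 1 can be illustrated on a small finite probability space. Everything concrete below (the six-point space, the uniform measure, the partition `blocks`, the values `values`, and <math>\,\phi(t) = t^2\,</math>) is a hypothetical choice made only for illustration; exact rational arithmetic is used so the two sides can be compared without rounding.

```python
from fractions import Fraction

# Hypothetical probability space: X = {0,...,5} with the uniform measure mu.
mu = {x: Fraction(1, 6) for x in range(6)}

# X is the disjoint union of X_1, X_2, X_3; f is constant (= a_j) on X_j.
blocks = [{0}, {1, 2}, {3, 4, 5}]
values = [Fraction(-1), Fraction(0), Fraction(2)]

def f(x):
    """Simple function: returns a_j for the block X_j containing x."""
    for block, a_j in zip(blocks, values):
        if x in block:
            return a_j
    raise ValueError("x is not in X")

def phi(t):
    """A convex function (assumed for the example)."""
    return t * t

# t_j = mu(X_j)
t = [sum(mu[x] for x in block) for block in blocks]

# Both sides of (2), computed as integrals against mu.
integral_f = sum(mu[x] * f(x) for x in mu)
integral_phi_f = sum(mu[x] * phi(f(x)) for x in mu)

# Both sides of (1), computed from the weights t_j and the values a_j.
lhs_1 = phi(sum(tj * aj for tj, aj in zip(t, values)))
rhs_1 = sum(tj * phi(aj) for tj, aj in zip(t, values))

print(phi(integral_f) == lhs_1, integral_phi_f == rhs_1, lhs_1 <= rhs_1)
```

Here the integrals in (2) and the sums in (1) agree term by term, which is the content of "say exactly the same thing" for a simple function.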
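<br>Proof 2:
<br>Lemma. If <math>\,a < b,\, \,a' < b',\, \,a \leq a'\,</math> and <math>\,b \leq b'\,</math> then
<math>\,(\phi(a) - \phi(b)) / (a - b)\,\leq\,(\phi(a') - \phi(b')) / (a' - b')\quad\diamond</math>

The lemma shows:
*<math>\,\phi\,</math> has a right-hand derivative at every point, and
*the graph of <math>\,\phi\,</math> lies above the tangent line through any point on the graph with slope equal to the right derivative.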
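The lemma is a statement about chord slopes, and a small numerical sketch may make its two consequences more concrete. The helper `chord_slope`, the test function <math>\,\phi(t) = t^2\,</math>, and the specific points are assumptions chosen so that every value is exact in floating point.

```python
def chord_slope(phi, a, b):
    """Slope of the chord of phi between a and b (a != b)."""
    return (phi(a) - phi(b)) / (a - b)

def phi(t):
    """A convex function (assumed for the example)."""
    return t * t

# One instance of the lemma: a=0 < b=1, a'=1 < b'=3, a <= a', b <= b'.
left_chord = chord_slope(phi, 0.0, 1.0)
right_chord = chord_slope(phi, 1.0, 3.0)

# Chord slopes from a = 1 to 1 + h decrease as h shrinks (by the lemma)
# and stay bounded below, so they converge to the right-hand derivative.
steps = (1.0, 0.5, 0.25, 0.125)
right_slopes = [chord_slope(phi, 1.0, 1.0 + h) for h in steps]

# With m = 2 (the derivative of t^2 at 1), the line through (1, phi(1))
# with slope m stays below the graph at a few sample points.
m = 2.0
supports = all(phi(t) >= phi(1.0) + m * (t - 1.0) for t in (-3.0, 0.0, 1.0, 2.5))

print(left_chord <= right_chord, right_slopes, supports)
```

The monotone, bounded sequence of slopes is what gives existence of the right-hand derivative; the same monotonicity is what keeps the graph above the line.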
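Say <math>\,a = \int f d \mu\,.</math>
Let <math>\,m\,</math> be the right derivative of <math>\,\phi\,</math>
at <math>\,a\,</math>, and let
<math>\,L(t) = \phi(a) + m(t-a)\,.</math>
The bullets above say <math>\,\phi(t)\geq L(t)\,</math> for
all <math>\,t\,</math> in the domain of <math>\,\phi\,.\,</math> So
<math>\begin{array}{rl}\int \phi \circ f &\geq \int L \circ f\\
&= \int (\phi(a) + m(f - a))\\
&= \phi(a) + (m \int f) - ma\\
&= \phi(a)\\
&= \phi(\int f)\end{array}
</math>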
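The chain of (in)equalities in Proof 2 can be traced on a discrete example where <math>\,\phi\,</math> is not differentiable at <math>\,a\,</math>. The measure `probs`, the values `f_values`, and the choice <math>\,\phi(t) = |t|\,</math> are assumptions for illustration; the right derivative <math>\,m = 1\,</math> of <math>\,|t|\,</math> at <math>\,0\,</math> is entered by hand rather than computed.

```python
from fractions import Fraction

# Hypothetical discrete probability measure and values of f on its atoms.
probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
f_values = [Fraction(-1), Fraction(0), Fraction(2)]

phi = abs  # convex, with a corner at 0

# a = integral of f with respect to mu
a = sum(p * v for p, v in zip(probs, f_values))

# Right derivative of |t| at a = 0, supplied by hand (assumption).
m = 1

def L(t):
    """Supporting line through (a, phi(a)) with slope m."""
    return phi(a) + m * (t - a)

# phi >= L pointwise on the values of f, so the integrals compare the same way.
pointwise = all(phi(v) >= L(v) for v in f_values)
int_phi_f = sum(p * phi(v) for p, v in zip(probs, f_values))
int_L_f = sum(p * L(v) for p, v in zip(probs, f_values))

print(pointwise, int_phi_f >= int_L_f, int_L_f == phi(a))
```

Because <math>\,L\,</math> is affine, its integral collapses to <math>\,\phi(a)\,</math> regardless of the measure, which is the step that turns the pointwise bound into (2).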
David C. Ullrich