Jensen's inequality
From Wikimization
Kwokwah
Revision as of 22:07, 13 July 2008
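First, by definition <math>\,\phi\,</math> is convex if and only if
<math>\phi(ta + (1-t)b) \leq t\,\phi(a) + (1-t)\,\phi(b)</math>
whenever <math>\,0 \leq t \leq 1\,</math> and <math>\,a, b\,</math>
are in the domain of <math>\,\phi\,</math>.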
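It follows by induction on <math>\,n\,</math>
that if <math>\,t_j \geq 0\,</math>
for <math>\,j = 1, \ldots, n\,</math>, <math>\,\sum_{j=1}^n t_j = 1\,</math>, and each <math>\,a_j\,</math> is in the domain of <math>\,\phi\,</math>,
then
(iii) phi(sum t_j a_j) <= sum t_j phi(a_j).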
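The induction step is not written out above. One way it might go is sketched below, assuming the weights are nonnegative and sum to 1, that t_n < 1 (the case t_n = 1 is trivial), and that the domain of phi is an interval, so that the auxiliary point b lies in it. The symbols s and b are introduced here for illustration only.

```latex
\begin{align*}
\phi\Big(\sum_{j=1}^{n} t_j a_j\Big)
  &= \phi\big(s\,b + t_n a_n\big),
  \qquad s = 1 - t_n > 0,\quad b = \sum_{j=1}^{n-1} \frac{t_j}{s}\, a_j \\
  &\le s\,\phi(b) + t_n\,\phi(a_n)
  && \text{(definition of convexity)} \\
  &\le s \sum_{j=1}^{n-1} \frac{t_j}{s}\,\phi(a_j) + t_n\,\phi(a_n)
  && \text{(induction hypothesis, since } \textstyle\sum_{j<n} t_j/s = 1\text{)} \\
  &= \sum_{j=1}^{n} t_j\,\phi(a_j).
\end{align*}
```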
Jensen's inequality says this: If mu is a probability
measure on X, f is a real-valued function on X,
f is integrable, and phi is convex on an interval containing
the range of f, then
(iv) phi(int f d mu) <= int phi o f d mu.
Proof 1: By some limiting argument we can assume
that f is simple (this limiting argument is the missing
detail). That is, X is the disjoint union of measurable sets
X_1, ..., X_n and f is constant on each X_j.
Say t_j = mu(X_j) and a_j is the value of f on X_j. Then
int f d mu = sum t_j a_j and int phi o f d mu = sum t_j phi(a_j),
so (iii) and (iv) say exactly the same thing. QED.
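To make the simple-function case concrete, here is a small numerical sketch in Python. The helper name jensen_sides, the three-piece partition, and the choice phi(x) = x**2 are all assumptions made for illustration; nothing here is part of the original argument.

```python
import math


def jensen_sides(phi, weights, values):
    """Return (phi(sum t_j a_j), sum t_j phi(a_j)) for a simple function.

    weights[j] plays the role of t_j = mu(X_j) and values[j] the role of
    a_j, the constant value of f on X_j.
    """
    if any(t < 0 for t in weights) or not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must be nonnegative and sum to 1")
    mean = sum(t * a for t, a in zip(weights, values))      # int f d mu
    lhs = phi(mean)                                         # phi(int f d mu)
    rhs = sum(t * phi(a) for t, a in zip(weights, values))  # int phi o f d mu
    return lhs, rhs


# Hypothetical data: X split into three pieces of measure 1/2, 1/4, 1/4,
# with f equal to 0, 2, 4 on them, and phi(x) = x**2.
weights = [0.5, 0.25, 0.25]
values = [0.0, 2.0, 4.0]
lhs, rhs = jensen_sides(lambda x: x * x, weights, values)
print(lhs, rhs)  # 2.25 5.0
```

Here int f d mu = 1.5, so the left side is 1.5**2 = 2.25 and the right side is 0.5*0 + 0.25*4 + 0.25*16 = 5, in agreement with (iii).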
That's worth noting because it seems to me it explains
"why" the thing's true. Here's the elegant complete proof:
Proof 2: The lemma shows that phi has a right-hand
derivative at every point and that the graph of phi
lies above the "tangent" line through any point on the
graph with slope = the right derivative.
Say a = int f d mu, let m = the right derivative of phi
at a, and let
L(t) = phi(a) + m(t-a).
The comment above says that phi(t) >= L(t) for
all t in the domain of phi, in particular for t = f(x). So
int phi o f d mu >= int L o f d mu
= int (phi(a) + m(f - a)) d mu
= phi(a) + (m int f d mu) - ma     (since mu(X) = 1)
= phi(a)                           (since int f d mu = a)
= phi(int f d mu).
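The chain of (in)equalities in Proof 2 can be traced numerically on a finite probability space. The sketch below assumes phi(t) = |t|, whose right-hand derivative is known in closed form, and picks hypothetical weights and values so that a = int f d mu lands exactly on the kink at 0. The helper names are illustrative only.

```python
def right_derivative_abs(a):
    """Right-hand derivative of phi(t) = |t| at a (equal to 1 at a = 0)."""
    return 1.0 if a >= 0 else -1.0


def supporting_line(phi, a, m):
    """Return L with L(t) = phi(a) + m * (t - a)."""
    return lambda t: phi(a) + m * (t - a)


# Hypothetical discrete probability measure: f takes values[j] with
# probability weights[j].
weights = [0.5, 0.25, 0.25]
values = [-2.0, 1.0, 3.0]
phi = abs

a = sum(t * v for t, v in zip(weights, values))  # a = int f d mu
m = right_derivative_abs(a)                      # slope of the "tangent" at a
L = supporting_line(phi, a, m)

# phi lies above the supporting line at every value that f takes.
above = all(phi(v) >= L(v) for v in values)

int_phi = sum(t * phi(v) for t, v in zip(weights, values))  # int phi o f d mu
int_L = sum(t * L(v) for t, v in zip(weights, values))      # int L o f d mu

print(above, int_phi, int_L, phi(a))
```

Because L is affine and the weights sum to 1, int_L collapses to phi(a), which is the step "= phi(a)" in the proof; int_phi can only be larger.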