Jensen's inequality

From Wikimization

Revision as of 22:07, 13 July 2008 by Kwokwah (Talk | contribs)


First, by definition $LaTeX: \,\phi\,$ is convex if and only if

$LaTeX: \phi(ta + (1-t)b) \leq t \phi(a) + (1-t) \phi(b)$

whenever $LaTeX: \,0 \leq t \leq 1\,$ and $LaTeX: \,a, b\,$ are in the domain of $LaTeX: \,\phi\,$. It follows by induction on $LaTeX: \,n\,$ that if $LaTeX: \,t_j \geq 0\,$ for $LaTeX: \,j = 1, 2, \ldots, n\,$, $LaTeX: \,\sum_j t_j = 1\,$, and the points $LaTeX: \,a_j\,$ lie in the domain of $LaTeX: \,\phi\,$, then

(iii) phi(sum t_j a_j) <= sum t_j phi(a_j).
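As a quick numerical sanity check of (iii), here is a minimal Python sketch. The helper names `finite_jensen_gap` and `random_weights`, the seed, and the two test functions (`exp` and the square) are illustrative assumptions, not part of the original page. The weights are normalized so that they are nonnegative and sum to 1.

```python
import math
import random

def finite_jensen_gap(phi, weights, points):
    """Return sum_j t_j*phi(a_j) - phi(sum_j t_j*a_j).

    By (iii) this is nonnegative when phi is convex and the
    weights are nonnegative and sum to 1.
    """
    mean = sum(t * a for t, a in zip(weights, points))
    return sum(t * phi(a) for t, a in zip(weights, points)) - phi(mean)

def random_weights(n, rng):
    """Nonnegative weights t_1..t_n, normalized to sum to 1."""
    raw = [rng.random() for _ in range(n)]
    total = sum(raw)
    return [r / total for r in raw]

rng = random.Random(0)  # fixed seed, chosen arbitrarily
weights = random_weights(5, rng)
points = [rng.uniform(-3.0, 3.0) for _ in range(5)]

gap_exp = finite_jensen_gap(math.exp, weights, points)
gap_square = finite_jensen_gap(lambda x: x * x, weights, points)
```

Both gaps should come out nonnegative up to rounding error. A check like this cannot prove (iii), but it catches sign or normalization mistakes quickly.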

Jensen's inequality says this: If mu is a probability measure on X, f is a real-valued integrable function on X, and phi is convex on an interval containing the range of f, then

(iv) phi(int f d mu) <= int phi o f d mu.

Proof 1: By some limiting argument we can assume that f is simple (this limiting argument is the missing detail). That is, X is the disjoint union of measurable sets X_1, ..., X_n and f is constant on each X_j.

Say t_j = mu(X_j) and a_j is the value of f on X_j. Then t_j >= 0, sum t_j = mu(X) = 1, int f d mu = sum t_j a_j, and int phi o f d mu = sum t_j phi(a_j), so (iii) and (iv) say exactly the same thing. QED.
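To make the simple-function case concrete, the sketch below takes X = [0, 1) with Lebesgue measure and replaces f by a function that is constant on n equal cells. The midpoint sampling, the function name `simple_function_sides`, and the choices f(x) = x^2 and phi = exp are assumptions made for illustration only. This is not the limiting argument that Proof 1 leaves out.

```python
import math

def simple_function_sides(f, phi, n):
    """Both sides of (iv) for a simple approximation of f on [0, 1).

    Cell X_j = [j/n, (j+1)/n) has measure t_j = 1/n, and the simple
    function takes the value a_j = f(midpoint of X_j) on that cell.
    Returns (phi(int f_n d mu), int phi o f_n d mu).
    """
    t = 1.0 / n
    values = [f((j + 0.5) / n) for j in range(n)]
    integral_f = sum(t * a for a in values)           # sum t_j a_j
    integral_phi_f = sum(t * phi(a) for a in values)  # sum t_j phi(a_j)
    return phi(integral_f), integral_phi_f

lhs, rhs = simple_function_sides(lambda x: x * x, math.exp, 1000)
```

Here `lhs` is close to exp(1/3) and `rhs` approximates the integral of exp(x^2) over [0, 1], so `lhs <= rhs`, as (iv) predicts.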

That's worth noting because it seems to me it explains "why" the thing's true. Here's the elegant complete proof:

Proof 2: A standard lemma on convex functions shows that phi has a right-hand derivative at every interior point of its domain and that the graph of phi lies above the "tangent" line through any such point on the graph with slope = the right derivative.

Say a = int f d mu, let m = the right derivative of phi at a, and let

 L(t) = phi(a) + m(t-a).

The comment above says that phi(t) >= L(t) for all t in the domain of phi. So, using mu(X) = 1 and int f d mu = a,

 int phi o f d mu >= int L o f d mu
                  = int (phi(a) + m(f - a)) d mu
                  = phi(a) + m (int f d mu) - ma
                  = phi(a)
                  = phi(int f d mu).
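The chain of equalities in Proof 2 can be traced on a small discrete example. In this sketch the three-point measure, the values of f, the choice phi(t) = |t| with its right derivative written out by hand, and the helper name `supporting_line` are all assumptions made for illustration. The right derivative is supplied exactly rather than estimated by a difference quotient, because a forward difference overestimates the slope of a convex function and the resulting line can poke above the graph.

```python
def supporting_line(phi, right_derivative, a):
    """Return L(t) = phi(a) + m*(t - a), with m the right derivative of phi at a."""
    m = right_derivative(a)
    return lambda t: phi(a) + m * (t - a)

phi = abs

def right_derivative_abs(x):
    # Right-hand derivative of |x|: +1 for x >= 0, -1 for x < 0.
    return 1.0 if x >= 0 else -1.0

weights = [0.2, 0.5, 0.3]   # mu(X_j); nonnegative, sum to 1
values = [-1.0, 0.0, 2.0]   # value of f on X_j

a = sum(t * v for t, v in zip(weights, values))          # int f d mu
L = supporting_line(phi, right_derivative_abs, a)

int_L_f = sum(t * L(v) for t, v in zip(weights, values))      # int L o f d mu
int_phi_f = sum(t * phi(v) for t, v in zip(weights, values))  # int phi o f d mu

# phi lies above its supporting line on a grid of test points in [-3, 3].
grid = [k / 10.0 for k in range(-30, 31)]
line_below_graph = all(phi(t) >= L(t) - 1e-12 for t in grid)
```

With these numbers a is 0.4 up to rounding, `int_L_f` equals phi(a), and `int_phi_f` is 0.8 up to rounding, which matches the chain int phi o f >= int L o f = phi(a).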

Retrieved from "http://www.convexoptimization.com/wikimization/index.php/Jensen%27s_inequality"
