Jensen's inequality

From Wikimization

Revision as of 22:07, 13 July 2008 by Kwokwah (Talk | contribs)

First, by definition, $\phi$ is convex if and only if

$$\phi(ta + (1-t)b) \leq t \phi(a) + (1-t) \phi(b)$$


whenever $0 \leq t \leq 1$ and $a, b$ are in the domain of $\phi$. It follows by induction on $n$ that if $t_j \geq 0$ for $j = 1, 2, \ldots, n$, $\sum_j t_j = 1$, and each $a_j$ is in the domain of $\phi$, then


(iii) phi(sum t_j a_j) <= sum t_j phi(a_j).
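The induction step behind (iii) can be written out as follows. This is only a sketch, assuming the weights satisfy $t_j \geq 0$ and $\sum_j t_j = 1$, and that $t_n < 1$ (if $t_n = 1$ there is nothing to prove). The symbol $s$ is an auxiliary name introduced here and does not appear in the original text.

```latex
% Induction step for (iii): from n-1 points to n points.
% Assumes t_j >= 0, \sum_{j=1}^{n} t_j = 1 and t_n < 1.
% s is an auxiliary symbol introduced for this sketch.
\begin{align*}
s &= 1 - t_n = \sum_{j=1}^{n-1} t_j > 0, \\
\phi\Big(\sum_{j=1}^{n} t_j a_j\Big)
  &= \phi\Big(s \sum_{j=1}^{n-1} \frac{t_j}{s}\, a_j + t_n a_n\Big) \\
  &\leq s\, \phi\Big(\sum_{j=1}^{n-1} \frac{t_j}{s}\, a_j\Big) + t_n \phi(a_n)
     && \text{(definition of convexity, with } t = s\text{)} \\
  &\leq s \sum_{j=1}^{n-1} \frac{t_j}{s}\, \phi(a_j) + t_n \phi(a_n)
     && \text{(induction hypothesis; the weights } t_j/s \text{ sum to } 1\text{)} \\
  &= \sum_{j=1}^{n} t_j \phi(a_j).
\end{align*}
```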


Jensen's inequality says this: If mu is a probability measure on X, f is an integrable real-valued function on X, and phi is convex on an interval containing the range of f, then


(iv) phi(int f d mu) <= int phi o f d mu.


Proof 1: By some limiting argument we can assume that f is simple (this limiting argument is the missing detail). That is, X is the disjoint union of measurable sets X_1, ..., X_n and f is constant on each X_j.


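Say t_j = mu(X_j) and a_j is the value of f on X_j. Then t_j >= 0 and sum t_j = mu(X) = 1, while int f d mu = sum t_j a_j and int phi o f d mu = sum t_j phi(a_j). So (iii) and (iv) say exactly the same thing. QED.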
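For a simple function, both sides of (iv) are finite sums, so the inequality can be checked numerically. The following is a minimal sketch, not part of the original argument: the helper name `jensen_sides` and the sample weights and values are made up for illustration, with phi(x) = x^2 as the convex function.

```python
import math


def jensen_sides(phi, weights, values):
    """Return (phi(sum t_j a_j), sum t_j phi(a_j)) for a simple function.

    weights[j] plays the role of t_j = mu(X_j) and values[j] the role of a_j.
    This helper is hypothetical, written only to illustrate (iii)/(iv).
    """
    if any(t < 0 for t in weights):
        raise ValueError("weights must be nonnegative")
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must sum to 1 (mu is a probability measure)")
    mean = sum(t * a for t, a in zip(weights, values))      # int f d mu
    lhs = phi(mean)                                         # phi(int f d mu)
    rhs = sum(t * phi(a) for t, a in zip(weights, values))  # int phi o f d mu
    return lhs, rhs


# Illustrative data: three pieces X_1, X_2, X_3 with measures 1/4, 1/4, 1/2.
weights = [0.25, 0.25, 0.5]
values = [-2.0, 0.0, 4.0]

lhs, rhs = jensen_sides(lambda x: x * x, weights, values)
print(lhs, rhs)  # 2.25 9.0
```

Here int f d mu = 1.5, so the left side is 1.5^2 = 2.25, while the right side is 0.25*4 + 0.25*0 + 0.5*16 = 9.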


That's worth noting because it seems to me it explains "why" the thing's true. Here's the elegant complete proof:


Proof 2: The lemma shows that phi has a right-hand derivative at every point and that the graph of phi lies above the "tangent" line through any point on the graph with slope = the right derivative.


Say a = int f d mu, let m = the right derivative of phi at a, and let


 L(t) = phi(a) + m(t-a). 


The comment above says that phi(t) >= L(t) for all t in the domain of phi, in particular for every value t = f(x). So


 int phi o f d mu >= int L o f d mu
                  = int (phi(a) + m(f - a)) d mu
                  = phi(a) + m (int f d mu) - ma      (since mu(X) = 1)
                  = phi(a)                            (since int f d mu = a)
                  = phi(int f d mu).
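The tangent-line argument can be traced on a small example. The sketch below assumes phi(t) = |t|, whose right derivative is 1 for t >= 0 and -1 for t < 0, and a simple function with made-up weights and values. The names `integrate` and `supporting_line_abs` are hypothetical helpers, not taken from the original text.

```python
def integrate(weights, values):
    """Integral of a simple function: sum of mu(X_j) * (value on X_j)."""
    return sum(t * v for t, v in zip(weights, values))


def supporting_line_abs(a):
    """Return L(t) = phi(a) + m (t - a) for phi = abs, m = right derivative at a."""
    m = 1.0 if a >= 0 else -1.0  # right derivative of |t| at a
    return lambda t: abs(a) + m * (t - a)


# Illustrative data: mu(X_1) = 1/2, mu(X_2) = mu(X_3) = 1/4.
weights = [0.5, 0.25, 0.25]
values = [-1.0, 1.0, 3.0]

a = integrate(weights, values)  # a = int f d mu
L = supporting_line_abs(a)

# phi >= L at every value taken by f.
pointwise_ok = all(abs(v) >= L(v) for v in values)

int_L = integrate(weights, [L(v) for v in values])      # int L o f d mu
int_phi = integrate(weights, [abs(v) for v in values])  # int phi o f d mu

print(a, int_L, int_phi)  # 0.5 0.5 1.5
```

With these numbers a = 0.5, the line is L(t) = t, and int L o f d mu = 0.5 = phi(a), which is bounded above by int phi o f d mu = 1.5, matching the chain of (in)equalities in Proof 2.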