Jensen's inequality

From Wikimization

(Difference between revisions)
Jump to: navigation, search
m (Protected "Jensen's inequality" [edit=autoconfirmed:move=autoconfirmed])
Current revision (16:19, 23 June 2026) (edit) (undo)
 
(One intermediate revision not shown.)
Line 1: Line 1:
By definition <math>\,\phi\,</math> is convex if and only if
By definition <math>\,\phi\,</math> is convex if and only if
-
<math>\phi(ta + (1-t)b) \leq t \phi(a) + (1-t) \phi(b)</math>
+
<math>\phi(ta + (1-t)b)\,\leq\,t \phi(a) + (1-t) \phi(b)</math>
-
whenever <math>\,0 \leq t \leq 1\,</math> and <math>\,a\,, b\,</math> are in the domain of <math>\,\phi\,</math>.
+
whenever <math>\,0 \leq t \leq 1\,</math> and <math>\,a\,,\,b\,</math> are in the domain of <math>\,\phi\,</math>.
It follows by induction on
It follows by induction on
-
<math>\,n\,</math> that if <math>\,t_j \geq 0\,</math> for <math>\,j = 1, 2\ldots n\,</math> then
+
<math>\,n\,</math> that if <math>\,t_j \geq 0\,</math> for <math>\,j = 1,\,2\,\ldots\,n\,</math> then
<br><math>\phi(\sum t_j a_j) \leq \sum t_j \phi(a_j) </math> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (1)
<br><math>\phi(\sum t_j a_j) \leq \sum t_j \phi(a_j) </math> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (1)
<br>Jensen's inequality says this: <br>If <math>\,\mu\,</math> is a probability
<br>Jensen's inequality says this: <br>If <math>\,\mu\,</math> is a probability
-
measure on <math>\,X\,</math>&nbsp;,&nbsp; <br><math>\,f\,</math> is a real-valued function on <math>\,X\,</math>&nbsp;,&nbsp;
+
measure on <math>\,X\,,\,</math> <br><math>\,f\,</math> is a real-valued function on <math>\,X\,,\,</math>
<br><math>\,f\,</math> is integrable, and <br><math>\,\phi\,</math> is convex on the range
<br><math>\,f\,</math> is integrable, and <br><math>\,\phi\,</math> is convex on the range
of <math>\,f\,</math> then
of <math>\,f\,</math> then
Line 20: Line 20:
that <math>\,f\,</math> is simple. (This limiting argument is a missing detail to this proof...)
that <math>\,f\,</math> is simple. (This limiting argument is a missing detail to this proof...)
<br>That is, <math>\,X\,</math> is the disjoint union of <math>\,X_1 \,\ldots\, X_n\,</math>
<br>That is, <math>\,X\,</math> is the disjoint union of <math>\,X_1 \,\ldots\, X_n\,</math>
-
and <math>\,f\,</math> is constant on each <math>\,X_j\,</math>&nbsp;.
+
and <math>\,f\,</math> is constant on each <math>\,X_j\,.</math>
-
Say <math>\,t_j=\mu(X_j)\,</math> and <math>\,a_j\,</math> is the value of <math>\,f\,</math> on <math>\,X_j\,</math>&nbsp;.&nbsp;
+
Say <math>\,t_j=\mu(X_j)\,</math> and <math>\,a_j\,</math> is the value of <math>\,f\,</math> on <math>\,X_j\,.</math>
Then (1) and (2) say exactly the same thing. QED.
Then (1) and (2) say exactly the same thing. QED.
Line 30: Line 30:
Lemma. If <math>\,a < b,\, \,a' < b',\, \,a \leq a'\,</math> and <math>\,b \leq b'\,</math> then
Lemma. If <math>\,a < b,\, \,a' < b',\, \,a \leq a'\,</math> and <math>\,b \leq b'\,</math> then
-
<math>\,(f(a) - f(b)) / (a - b) \leq (f(a') - f(b')) / (a' - b')\quad\diamond</math>
+
<math>\,(f(a) - f(b)) / (a - b)\,\leq\,(f(a') - f(b')) / (a' - b')\quad\diamond</math>
The lemma shows:
The lemma shows:
*<math>\,\phi\,</math> has a right-hand derivative at every point, and
*<math>\,\phi\,</math> has a right-hand derivative at every point, and
-
*the graph of <math>\,\phi\,</math> lies above the "tangent" line through any point on the graph with slope equal to the right derivative.
+
*the graph of <math>\,\phi\,</math> lies above the tangent line through any point on the graph with slope equal to the right derivative.
Say <math>\,a = \int f d \mu\,</math>
Say <math>\,a = \int f d \mu\,</math>
Let <math>\,m\,</math> be the right derivative of <math>\,\phi\,</math>
Let <math>\,m\,</math> be the right derivative of <math>\,\phi\,</math>
-
at <math>\,a\,</math>&nbsp;,&nbsp; and let
+
at <math>\,a\,</math>,&nbsp; and let
<math>\,L(t) = \phi(a) + m(t-a)\,</math>
<math>\,L(t) = \phi(a) + m(t-a)\,</math>
The bullets above say <math>\,\phi(t)\geq L(t)\,</math> for
The bullets above say <math>\,\phi(t)\geq L(t)\,</math> for
-
all <math>\,t\,</math> in the domain of &nbsp;<math>\,\phi\,</math>&nbsp;. &nbsp;So
+
all <math>\,t\,</math> in the domain of <math>\,\phi\,.\,</math> So
<math>\begin{array}{rl}\int \phi \circ f &\geq \int L \circ f\\
<math>\begin{array}{rl}\int \phi \circ f &\geq \int L \circ f\\
Line 50: Line 50:
&= \phi(a) + (m \int f) - ma\\
&= \phi(a) + (m \int f) - ma\\
&= \phi(a)\\
&= \phi(a)\\
-
&= \phi(\int f)\end{array}</math>
+
&= \phi(\int f)\end{array}
 +
</math>
<math>\,-\,</math>David C. Ullrich
<math>\,-\,</math>David C. Ullrich

Current revision

By definition LaTeX: \,\phi\, is convex if and only if

LaTeX: \phi(ta + (1-t)b)\,\leq\,t \phi(a) + (1-t) \phi(b)

whenever LaTeX: \,0 \leq t \leq 1\, and LaTeX: \,a\,,\,b\, are in the domain of LaTeX: \,\phi\,.

It follows by induction on LaTeX: \,n\, that if LaTeX: \,t_j \geq 0\, for LaTeX: \,j = 1,\,2\,\ldots\,n\, then


LaTeX: \phi(\sum t_j a_j) \leq \sum t_j \phi(a_j)           (1)


Jensen's inequality says this:
If LaTeX: \,\mu\, is a probability measure on LaTeX: \,X\,,\,
LaTeX: \,f\, is a real-valued function on LaTeX: \,X\,,\,
LaTeX: \,f\, is integrable, and
LaTeX: \,\phi\, is convex on the range of LaTeX: \,f\, then


LaTeX: \phi(\int f d \mu) \leq \int \phi \circ f d \mu\qquad          (2)


Proof 1: By some limiting argument we can assume that LaTeX: \,f\, is simple. (This limiting argument is a missing detail to this proof...)
That is, LaTeX: \,X\, is the disjoint union of LaTeX: \,X_1 \,\ldots\, X_n\, and LaTeX: \,f\, is constant on each LaTeX: \,X_j\,.

Say LaTeX: \,t_j=\mu(X_j)\, and LaTeX: \,a_j\, is the value of LaTeX: \,f\, on LaTeX: \,X_j\,.

Then (1) and (2) say exactly the same thing. QED.


Proof 2:

Lemma. If LaTeX: \,a < b,\, \,a' < b',\, \,a \leq a'\, and LaTeX: \,b \leq b'\, then

LaTeX: \,(f(a) - f(b)) / (a - b)\,\leq\,(f(a') - f(b')) / (a' - b')\quad\diamond

The lemma shows:

  • LaTeX: \,\phi\, has a right-hand derivative at every point, and
  • the graph of LaTeX: \,\phi\, lies above the tangent line through any point on the graph with slope equal to the right derivative.

Say LaTeX: \,a = \int f d \mu\,

Let LaTeX: \,m\, be the right derivative of LaTeX: \,\phi\, at LaTeX: \,a\,,  and let

LaTeX: \,L(t) = \phi(a) + m(t-a)\,

The bullets above say LaTeX: \,\phi(t)\geq L(t)\, for all LaTeX: \,t\, in the domain of LaTeX: \,\phi\,.\, So

LaTeX: \begin{array}{rl}\int \phi \circ f &\geq \int L \circ f\\ 
</p>
<pre>        &= \int (\phi(a) + m(f - a))\\ 
        &= \phi(a) + (m \int f) - ma\\
        &= \phi(a)\\
        &= \phi(\int f)\end{array}
</pre>
<p>

LaTeX: \,-\,David C. Ullrich

Personal tools