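5. Differential Mean Value Theorems and Their Applications
5.1 Differential Mean Value Theorems
5.1.4 The relation between the first derivative and monotonicity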
【Theorem 5.1.5】【Relation between the first derivative and monotonicity】
- Let $f(x)$ be defined and differentiable on an interval $\textbf{I}$ (open, closed, or half-open). Then $f(x)$ is monotonically increasing on $\textbf{I}$ if and only if $f'(x)\ge 0,\ \forall x\in\textbf{I}$.
- (Sufficient condition) If $f'(x)>0,\ \forall x\in\textbf{I}$, then $f(x)$ is strictly increasing on $\textbf{I}$.
【Counterexample】 The condition $f'(x)>0$ is not necessary: $f(x)=x^3$ is strictly increasing, but $f'(0)=0$.
- If $f(x)$ is continuous on $\textbf{I}$ and $f'(x)>0$ except at finitely many points $x_1,x_2,x_3,\dots,x_n$, then $f(x)$ is strictly increasing on $\textbf{I}$.
- Let $f(x)$ be defined and differentiable on an interval $\textbf{I}$ (open, closed, or half-open). Then $f(x)$ is monotonically decreasing on $\textbf{I}$ if and only if $f'(x)\le 0,\ \forall x\in\textbf{I}$.
- (Sufficient condition) If $f'(x)<0,\ \forall x\in\textbf{I}$, then $f(x)$ is strictly decreasing on $\textbf{I}$.
- If $f(x)$ is continuous on $\textbf{I}$ and $f'(x)<0$ except at finitely many points $x_1,x_2,x_3,\dots,x_n$, then $f(x)$ is strictly decreasing on $\textbf{I}$.
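The counterexample $f(x)=x^3$ can be spot-checked numerically. The sketch below is only an illustration on a sample grid, not a proof; the interval $[-2,2]$ and the step $0.1$ are arbitrary choices made for this example.

```python
def f(x):
    return x ** 3

def f_prime(x):
    # Derivative of x^3, computed by hand.
    return 3 * x ** 2

# Sample grid on [-2, 2]; interval and step are illustrative assumptions.
xs = [i / 10 for i in range(-20, 21)]

# f is strictly increasing along the grid ...
strictly_increasing = all(f(a) < f(b) for a, b in zip(xs, xs[1:]))

# ... although f' is only non-negative, vanishing at x = 0.
derivative_nonnegative = all(f_prime(x) >= 0 for x in xs)
zeros_of_derivative = [x for x in xs if f_prime(x) == 0]

print(strictly_increasing, derivative_nonnegative, zeros_of_derivative)
```

The grid check agrees with the theorem: $f'(x)\ge 0$ everywhere and $f'(x)>0$ except at the single point $x=0$, so $f$ is strictly increasing.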
【Proof】 We prove the increasing case; the decreasing case is analogous. Sufficiency first.
Take any $x_1,x_2\in\textbf{I}$ and assume without loss of generality that $x_1<x_2$. By Lagrange's mean value theorem,
$$f(x_2)-f(x_1)=f'(\xi)(x_2-x_1),\quad \xi\in(x_1,x_2)$$
Since $f'(x)\ge 0,\ \forall x\in\textbf{I}$, and $x_2-x_1>0$,
we get $f(x_2)-f(x_1)\ge 0$, that is, $f(x_2)\ge f(x_1)$.
Hence $f(x)$ is monotonically increasing on $\textbf{I}$. (If $f'(x)>0$ on all of $\textbf{I}$, the same argument gives $f(x_2)>f(x_1)$, which proves the sufficient condition for strict monotonicity.)
Next, necessity.
Fix any $x\in\textbf{I}$ and take $x'\in\textbf{I}$ with $x'\ne x$.
Since $f(x)$ is monotonically increasing, if $x'>x$ then $f(x')\ge f(x)$, so
$$\frac{f(x')-f(x)}{x'-x}\ge 0$$
If $x'<x$ then $f(x')\le f(x)$, and again
$$\frac{f(x')-f(x)}{x'-x}\ge 0$$
Letting $x'\to x$, and using the fact that limits preserve non-strict inequalities,
$$f'(x)=\lim\limits_{x'\to x}\frac{f(x')-f(x)}{x'-x}\ge 0$$
5.1.5 Convexity of functions
This textbook describes the shape of a function as "upper convex" or "lower convex", rather than "concave" or "convex".
Take lower convexity as the example:
Lower convex means that the chord lies above the curve. For $\lambda\in(0,1)$, the abscissa of a point between $x_1$ and $x_2$ can be written as $(x_2-x_1)\lambda+x_1$, or equally as $x_2-(x_2-x_1)\lambda=(1-\lambda)x_2+\lambda x_1$.
The ordinate of the corresponding point on the chord is obtained the same way (from the proportions of similar figures in the trapezoid): it is $\lambda f(x_1)+(1-\lambda)f(x_2)$.
Comparing the curve with the chord at this abscissa gives the inequality $f(\lambda x_1+(1-\lambda)x_2)\le \lambda f(x_1)+(1-\lambda)f(x_2)$.
【Definition 5.1.2】
- Let $f(x)$ be defined on an interval $\textbf{I}$. If $\forall x_1,x_2\in\textbf{I},\ \forall \lambda\in(0,1)$ we have $f(\lambda x_1+(1-\lambda)x_2)\le \lambda f(x_1)+(1-\lambda)f(x_2)$ (the chord lies above the curve, and the tangent line lies below the curve), then $f(x)$ is called lower convex on $\textbf{I}$.
- Let $f(x)$ be defined on an interval $\textbf{I}$. If $\forall x_1,x_2\in\textbf{I}$ with $x_1\ne x_2$ and $\forall \lambda\in(0,1)$ we have $f(\lambda x_1+(1-\lambda)x_2)< \lambda f(x_1)+(1-\lambda)f(x_2)$, then $f(x)$ is called strictly lower convex on $\textbf{I}$.
- Let $f(x)$ be defined on an interval $\textbf{I}$. If $\forall x_1,x_2\in\textbf{I},\ \forall \lambda\in(0,1)$ we have $f(\lambda x_1+(1-\lambda)x_2)\ge \lambda f(x_1)+(1-\lambda)f(x_2)$ (the chord lies below the curve, and the tangent line lies above the curve), then $f(x)$ is called upper convex on $\textbf{I}$.
- Let $f(x)$ be defined on an interval $\textbf{I}$. If $\forall x_1,x_2\in\textbf{I}$ with $x_1\ne x_2$ and $\forall \lambda\in(0,1)$ we have $f(\lambda x_1+(1-\lambda)x_2)> \lambda f(x_1)+(1-\lambda)f(x_2)$, then $f(x)$ is called strictly upper convex on $\textbf{I}$.
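The defining inequality for lower convexity can be tested on sample points. The helper below, `is_lower_convex_on_samples`, is a hypothetical name introduced for this illustration. A `True` result only means no violation was found on the chosen samples, while a `False` result is a genuine counterexample to lower convexity. The tolerance `tol` is an assumption added to absorb floating-point rounding.

```python
import math

def is_lower_convex_on_samples(f, xs, lambdas, tol=1e-12):
    """Check f(lam*x1 + (1-lam)*x2) <= lam*f(x1) + (1-lam)*f(x2) on samples."""
    for x1 in xs:
        for x2 in xs:
            for lam in lambdas:
                curve = f(lam * x1 + (1 - lam) * x2)    # point on the curve
                chord = lam * f(x1) + (1 - lam) * f(x2)  # point on the chord
                if curve > chord + tol:
                    return False  # the curve rises above the chord
    return True

xs = [i / 2 for i in range(-4, 5)]         # sample abscissas in [-2, 2]
lambdas = [0.1, 0.25, 0.5, 0.75, 0.9]      # sample weights in (0, 1)

square_ok = is_lower_convex_on_samples(lambda x: x * x, xs, lambdas)
exp_ok = is_lower_convex_on_samples(math.exp, xs, lambdas)
neg_square_ok = is_lower_convex_on_samples(lambda x: -x * x, xs, lambdas)

print(square_ok, exp_ok, neg_square_ok)
```

Here $x^2$ and $e^x$ pass the sampled check, while $-x^2$ fails it (it is upper convex instead).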
【Theorem 5.1.6】【Relation between the second derivative and convexity】
- Let $f(x)$ be twice differentiable on an interval $\textbf{I}$. Then $f(x)$ is lower convex on $\textbf{I}$ if and only if $f''(x)\ge 0,\ \forall x\in\textbf{I}$.
- (Sufficient condition) If $f''(x)>0$ on $\textbf{I}$, then $f(x)$ is strictly lower convex on $\textbf{I}$.
- (Sufficient condition) If $f''(x)>0$ on $\textbf{I}$ except at finitely many points $x_1,x_2,\dots,x_n$, then $f(x)$ is strictly lower convex on $\textbf{I}$.
【Example】 $f(x)=x^4$ is strictly lower convex, although $f''(x)=12x^2$ equals $0$ at the point $x=0$.
- Let $f(x)$ be twice differentiable on an interval $\textbf{I}$. Then $f(x)$ is upper convex on $\textbf{I}$ if and only if $f''(x)\le 0,\ \forall x\in\textbf{I}$.
- (Sufficient condition) If $f''(x)<0$ on $\textbf{I}$, then $f(x)$ is strictly upper convex on $\textbf{I}$.
- (Sufficient condition) If $f''(x)<0$ on $\textbf{I}$ except at finitely many points $x_1,x_2,\dots,x_n$, then $f(x)$ is strictly upper convex on $\textbf{I}$.
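The example $f(x)=x^4$ can be checked against the strict inequality on sample points. This is a sketch on an arbitrarily chosen grid, so it illustrates the statement rather than proving it.

```python
def f(x):
    return x ** 4

def f_second(x):
    # Second derivative of x^4, computed by hand.
    return 12 * x ** 2

xs = [i / 4 for i in range(-8, 9)]       # sample abscissas in [-2, 2]
lambdas = [0.1, 0.25, 0.5, 0.75, 0.9]    # sample weights in (0, 1)

# Strict inequality for every pair of distinct sample points.
strictly_lower_convex = all(
    f(lam * x1 + (1 - lam) * x2) < lam * f(x1) + (1 - lam) * f(x2)
    for x1 in xs
    for x2 in xs
    if x1 != x2
    for lam in lambdas
)

# The second derivative is positive except at one sample point.
zeros_of_second_derivative = [x for x in xs if f_second(x) == 0]

print(strictly_lower_convex, zeros_of_second_derivative)
```

The second derivative vanishes only at $x=0$, yet the strict inequality holds for all distinct sample pairs, consistent with the "except at finitely many points" condition.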
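【Geometric intuition】 Take lower convexity as the example.
The tangent line lies below the curve, and as $x$ increases the slope of the tangent line keeps increasing. In other words, the first derivative is monotonically increasing, so the second derivative is greater than or equal to $0$.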
【Proof】 Necessity first: we show that if $f(x)$ is lower convex, then $f''(x)\ge 0$.
Take $\lambda = \frac{1}{2}$; then $f\left(\frac{x_1+x_2}{2}\right)\le \frac{1}{2}\left(f(x_1)+f(x_2)\right)$,
that is, $2f\left(\frac{x_1+x_2}{2}\right)\le f(x_1)+f(x_2)$,
that is, $f(x_2)-f\left(\frac{x_1+x_2}{2}\right)\ge f\left(\frac{x_1+x_2}{2}\right)-f(x_1)$.
Assume $x_1<x_2$ and let $x_2-x_1=n\Delta x_n$, $\Delta x_n=\frac{x_2-x_1}{n}$, so that the interval is divided into $n$ equal parts.
Applying the midpoint inequality above to every three consecutive partition points gives
$$f(x_2)-f(x_2-\Delta x_n)\ge f(x_2-\Delta x_n)-f(x_2-2\Delta x_n)\ge f(x_2-2\Delta x_n)-f(x_2-3\Delta x_n)\ge \dots \ge f(x_1+\Delta x_n)-f(x_1)$$
Keeping only the first and last terms,
$$\frac{f(x_2+(-\Delta x_n))-f(x_2)}{-1}\ge\frac{f(x_1+\Delta x_n)-f(x_1)}{1}$$
Dividing both sides of this inequality by $\Delta x_n>0$ gives
$$\frac{f(x_2+(-\Delta x_n))-f(x_2)}{-\Delta x_n}\ge\frac{f(x_1+\Delta x_n)-f(x_1)}{\Delta x_n}\quad\dots(1)$$
Let $n\to \infty$, so that $\Delta x_n\to 0$.
Since $f(x)$ is differentiable, both difference quotients converge, and inequality (1) becomes $f'(x_2)\ge f'(x_1)$.
Hence $f'(x)$ is monotonically increasing.
Since $f(x)$ is twice differentiable, Theorem 5.1.5 applied to $f'(x)$ gives $f''(x)\ge 0$.
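The chain of successive differences used in the necessity proof can be observed numerically. In the sketch below, `successive_differences` is a hypothetical helper, and the choices $f(x)=e^x$, $[x_1,x_2]=[0,1]$, and $n=8$ are assumptions made only for illustration.

```python
import math

def successive_differences(f, x1, x2, n):
    """Return f(x1+(k+1)h) - f(x1+k*h) for k = 0..n-1, with h = (x2-x1)/n."""
    h = (x2 - x1) / n
    return [f(x1 + (k + 1) * h) - f(x1 + k * h) for k in range(n)]

# e^x is lower convex, so the differences should not decrease left to right.
diffs = successive_differences(math.exp, 0.0, 1.0, 8)
non_decreasing = all(a <= b for a, b in zip(diffs, diffs[1:]))

# The last difference (at x2) dominates the first (at x1),
# which is the inequality that the proof divides by the step size.
last_dominates_first = diffs[-1] >= diffs[0]

print(non_decreasing, last_dominates_first)
```

Read from right to left, the list `diffs` is exactly the chain in the proof: the difference adjacent to $x_2$ is the largest and the one adjacent to $x_1$ is the smallest.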
Next, sufficiency. Suppose $f''(x)\ge 0$; then $f'(x)$ is monotonically increasing (Theorem 5.1.5 applied to $f'(x)$).
Take any $x_1,x_2\in\textbf{I}$ and $\lambda\in(0,1)$, and assume without loss of generality that $x_1<x_2$.
Set $x_0=\lambda x_1+(1-\lambda)x_2$. Then
$$x_0-x_1=(1-\lambda)(x_2-x_1)$$
$$x_2-x_0=\lambda(x_2-x_1)$$
By Lagrange's mean value theorem,
$$f(x_0)-f(x_1)=f'(\eta_1)(x_0-x_1),\quad \eta_1 \in(x_1,x_0)$$
that is, $f(x_1)=f(x_0)+f'(\eta_1)(x_1-x_0)$, and
$$f(x_2)-f(x_0)=f'(\eta_2)(x_2-x_0),\quad \eta_2 \in(x_0,x_2)$$
that is, $f(x_2)=f(x_0)+f'(\eta_2)(x_2-x_0)$.
Since $f'(x)$ is monotonically increasing and $x_1< \eta_1< x_0<\eta_2< x_2$, we have $f'(\eta_1)\le f'(x_0)\le f'(\eta_2)$. Moreover $x_1-x_0< 0$ and $x_2-x_0> 0$, with $x_0-x_1=(1-\lambda)(x_2-x_1)$ and $x_2-x_0=\lambda(x_2-x_1)$.
Therefore
$$f(x_1)=f(x_0)+f'(\eta_1)(x_1-x_0)\ge f(x_0)+f'(x_0)(x_1-x_0)=f(x_0)-(1-\lambda)f'(x_0)(x_2-x_1)\quad\dots(2)$$
Similarly,
$$f(x_2)=f(x_0)+f'(\eta_2)(x_2-x_0)\ge f(x_0)+f'(x_0)(x_2-x_0)=f(x_0)+\lambda f'(x_0)(x_2-x_1)\quad\dots(3)$$
Computing $(2)\times\lambda+(3)\times(1-\lambda)$ gives
$$\lambda f(x_1)+(1-\lambda)f(x_2)\ge\lambda f(x_0)+(1-\lambda)f(x_0)-\lambda(1-\lambda)f'(x_0)(x_2-x_1)+\lambda(1-\lambda)f'(x_0)(x_2-x_1)=f(x_0)=f(\lambda x_1+(1-\lambda)x_2)$$
This is the definition of lower convexity, so $f(x)$ is lower convex on $\textbf{I}$.
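The two tangent-line bounds at $x_0$ and their weighted combination can be traced on a concrete case. The sketch assumes $f(x)=e^x$ (so $f''(x)=e^x\ge 0$) with $x_1=-1$, $x_2=2$, $\lambda=0.3$; all of these values are illustrative choices, not taken from the text.

```python
import math

def f(x):
    return math.exp(x)

def f_prime(x):
    # Derivative of e^x.
    return math.exp(x)

x1, x2, lam = -1.0, 2.0, 0.3
x0 = lam * x1 + (1 - lam) * x2   # the point between x1 and x2

# Tangent-line bounds at x0: the curve lies above its tangent at x0.
bound_at_x1 = f(x1) >= f(x0) + f_prime(x0) * (x1 - x0)
bound_at_x2 = f(x2) >= f(x0) + f_prime(x0) * (x2 - x0)

# Weighted sum of the two bounds: the f'(x0) terms cancel,
# leaving the lower-convexity inequality at x0.
chord_value = lam * f(x1) + (1 - lam) * f(x2)
convexity_holds = chord_value >= f(x0)

print(bound_at_x1, bound_at_x2, convexity_holds)
```

The cancellation works because $\lambda(x_1-x_0)+(1-\lambda)(x_2-x_0)=0$, which is the identity behind the final step of the proof.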