Study Notes on Fisher Information

This post records two equivalent ways to compute the Fisher information.

Notation

Throughout this post, $\int$ denotes $\int_{-\infty}^{+\infty}$ (mainly out of laziness; please bear with me).

Fisher Information

Definition

Let the population probability density function $p(x;\theta)$, $\theta \in \Theta$, satisfy the following conditions:

  1. The parameter space $\Theta$ is an open interval of the real line;
  2. The support $S=\{x \mid p(x;\theta)>0\}$ does not depend on $\theta$;
  3. The derivative $\dfrac{\partial}{\partial \theta}p(x;\theta)$ exists for every $\theta \in \Theta$;
  4. For $p(x;\theta)$, integration and differentiation may be interchanged, i.e.:

$$\dfrac{\partial}{\partial \theta}\int p(x;\theta)\,\mathrm{d}x=\int \dfrac{\partial}{\partial \theta}p(x;\theta)\,\mathrm{d}x$$

  5. The expectation $E\left[\left(\dfrac{\partial}{\partial \theta}\ln p(x;\theta)\right)^2\right]$ exists.

Then

$$I(\theta)=E\left[\left(\dfrac{\partial}{\partial \theta}\ln p(x;\theta)\right)^2\right]$$

is called the Fisher information of the population distribution.
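As a sanity check of the definition, the sketch below approximates $I(\theta)$ for a normal population $N(\theta,1)$ (a hypothetical example; for this family the score is $x-\theta$ and the analytic answer is $I(\theta)=1$) by quadrature over a finite-difference score:

```python
import numpy as np

# Sketch: approximate the Fisher information of N(theta, 1) for its mean
# via the definition I(theta) = E[(d/dtheta log p(x; theta))^2].
# Hypothetical example; the analytic value here is I(theta) = 1.

def normal_pdf(x, theta):
    return np.exp(-0.5 * (x - theta) ** 2) / np.sqrt(2.0 * np.pi)

def score(x, theta, h=1e-5):
    # central difference of log p(x; theta) in theta
    return (np.log(normal_pdf(x, theta + h))
            - np.log(normal_pdf(x, theta - h))) / (2.0 * h)

theta = 0.7
x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]
# Riemann-sum approximation of the integral E[score^2]
fisher = np.sum(score(x, theta) ** 2 * normal_pdf(x, theta)) * dx
print(fisher)  # close to 1
```

The integration range and step size are chosen ad hoc; for a density with heavier tails the grid would need to be wider.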

An alternative expression

The Fisher information $I(\theta)$ can also be written as

$$I(\theta)=-E\left[\dfrac{\partial ^2}{\partial \theta ^2}\ln p(x;\theta)\right]$$

This form is sometimes more convenient to compute.
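Under the same hypothetical $N(\theta,1)$ example, this second form can be checked numerically with a central second difference of the log-density in $\theta$ (here $\partial^2 \ln p/\partial\theta^2 = -1$, so the result should again be 1):

```python
import numpy as np

# Sketch: the same hypothetical N(theta, 1) example, now using
# I(theta) = -E[d^2/dtheta^2 log p(x; theta)].
# For this family d^2 log p / dtheta^2 = -1 exactly, so I(theta) = 1.

def normal_logpdf(x, theta):
    return -0.5 * (x - theta) ** 2 - 0.5 * np.log(2.0 * np.pi)

def d2_logpdf(x, theta, h=1e-4):
    # central second difference of log p in theta
    return (normal_logpdf(x, theta + h)
            - 2.0 * normal_logpdf(x, theta)
            + normal_logpdf(x, theta - h)) / h ** 2

theta = 0.7
x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]
pdf = np.exp(normal_logpdf(x, theta))
# Riemann-sum approximation of -E[d^2 log p / dtheta^2]
fisher = -np.sum(d2_logpdf(x, theta) * pdf) * dx
```

Working with the log-density directly avoids taking logs of very small pdf values in the tails.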

Proof

First, we have:

$$E\left[\dfrac{\partial}{\partial \theta}\ln p(x;\theta)\right]=0 \tag{1}$$

This is because:

$$\begin{aligned} &E\left[\dfrac{\partial}{\partial \theta}\ln p(x;\theta)\right]\\ =&\int \dfrac{\partial \ln p(x;\theta)}{\partial \theta}\, p(x;\theta)\, \mathrm{d}x \\ =&\int \dfrac{1}{p(x;\theta)}\dfrac{\partial p(x;\theta)}{\partial \theta}\, p(x;\theta)\, \mathrm{d}x\\ =&\int \dfrac{\partial p(x;\theta)}{\partial \theta} \, \mathrm{d}x\\ =&\dfrac{\partial }{\partial \theta}\int p(x;\theta)\, \mathrm{d}x\\ =&0 \end{aligned}$$

where the last two steps use condition 4 and the fact that $\int p(x;\theta)\,\mathrm{d}x=1$.
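Equation (1) can also be verified numerically. For the hypothetical $N(\theta,1)$ example the score is $x-\theta$, and its expectation under $p$ should vanish:

```python
import numpy as np

# Numerical check of E[d/dtheta log p] = 0 for the hypothetical
# N(theta, 1) example, whose score is x - theta.
theta = 0.7
x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]
pdf = np.exp(-0.5 * (x - theta) ** 2) / np.sqrt(2.0 * np.pi)
# Riemann-sum approximation of the mean of the score
mean_score = np.sum((x - theta) * pdf) * dx
```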

Since (1) holds for every $\theta \in \Theta$, differentiating both sides with respect to $\theta$ gives

$$\dfrac{\partial }{\partial \theta}E\left[\dfrac{\partial}{\partial \theta}\ln p(x;\theta)\right]=0 \tag{2}$$

Moreover, since

$$\dfrac{\partial \ln p(x;\theta)}{\partial \theta}=\frac{1}{p(x;\theta)}\frac{\partial p(x;\theta)}{\partial \theta}$$

we have

$$\frac{\partial p(x;\theta)}{\partial \theta}=p(x;\theta)\, \dfrac{\partial \ln p(x;\theta)}{\partial \theta} \tag{3}$$

Hence:

$$\begin{aligned} &\dfrac{\partial }{\partial \theta}E\left[\dfrac{\partial}{\partial \theta}\ln p(x;\theta)\right]\\ =& \dfrac{\partial }{\partial \theta}\int \dfrac{\partial \ln p(x;\theta)}{\partial \theta}\, p(x;\theta) \, \mathrm{d}x\\ =& \int \dfrac{\partial }{\partial \theta}\left(\dfrac{\partial \ln p(x;\theta)}{\partial \theta}\, p(x;\theta)\right) \, \mathrm{d}x\\ =& \int \left(\dfrac{\partial p(x;\theta) }{\partial \theta}\dfrac{\partial \ln p(x;\theta)}{\partial \theta}+ p(x;\theta)\dfrac{\partial^2 \ln p(x;\theta)}{\partial \theta^2}\right) \, \mathrm{d}x\\ =& \int p(x;\theta) \left(\dfrac{\partial \ln p(x;\theta)}{\partial \theta}\right)^2\, \mathrm{d}x + \int p(x;\theta)\, \dfrac{\partial^2 \ln p(x;\theta)}{\partial \theta^2}\,\mathrm{d}x\\ =&E\left[\left(\dfrac{\partial \ln p(x;\theta)}{\partial \theta}\right)^2\right]+E\left[\dfrac{\partial^2 \ln p(x;\theta)}{\partial \theta^2}\right]\\ =&0 \end{aligned}$$

where the fourth line substitutes (3) into the first term.

Therefore

$$E\left[\left(\dfrac{\partial \ln p(x;\theta)}{\partial \theta}\right)^2\right]=-E\left[\dfrac{\partial^2 \ln p(x;\theta)}{\partial \theta^2}\right]$$

that is:

$$I(\theta)=-E\left[\dfrac{\partial^2 \ln p(x;\theta)}{\partial \theta^2}\right]$$
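To see the two formulas agree beyond the normal case, here is a quick numerical check on an exponential population $p(x;\theta)=\theta e^{-\theta x}$, $x\ge 0$ (an assumed example; analytically $I(\theta)=1/\theta^2$ by either formula):

```python
import numpy as np

# Sketch: compare the two Fisher-information formulas on an exponential
# population p(x; theta) = theta * exp(-theta * x), x >= 0 (assumed example).
# Analytically both give I(theta) = 1 / theta^2.
theta = 2.0
x = np.linspace(0.0, 40.0, 400001)
dx = x[1] - x[0]
pdf = theta * np.exp(-theta * x)

# First form: E[(d/dtheta log p)^2], with score = 1/theta - x
score = 1.0 / theta - x
form1 = np.sum(score ** 2 * pdf) * dx

# Second form: -E[d^2/dtheta^2 log p], with d^2 log p / dtheta^2 = -1/theta^2
form2 = (1.0 / theta ** 2) * np.sum(pdf) * dx
```

Here the second form really is easier: the second derivative of $\ln p$ is the constant $-1/\theta^2$, so no quadrature over the score is needed at all.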


Study Notes on Fisher Information
https://blog.askk.cc/2018/04/20/Fisher-and-MLE/
Author: sukanka
Published: April 20, 2018