“Statistics is not perfect, but it is beautiful and exquisite!”

Theoretical Derivation

This post focuses on the mathematical proofs and theoretical derivations that are not discussed much in the “Statistics” post.

It will serve as an appendix to that post.

Gaussian Distribution

Normally, we have the Gaussian distribution: $X \sim N(\mu, \sigma^2)$.

 

We then have $(X - \mu) \sim N(0, \sigma^2)$, whose density is symmetric about zero, so $x \mapsto x f(x)$ is an odd function.

 

From the normalization of the PDF and the odd integrand above, we have:

$$\int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx = 1, \qquad \int_{-\infty}^{+\infty} \frac{x-\mu}{\sqrt{2\pi}\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx = 0$$
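As a quick numerical check of both integrals, a minimal sketch with scipy (the values $\mu = 1$, $\sigma = 2$ are arbitrary choices):

```python
import numpy as np
from scipy.integrate import quad

mu, sigma = 1.0, 2.0  # arbitrary example parameters

def pdf(x):
    # Gaussian density: exp(-(x - mu)^2 / (2 sigma^2)) / (sqrt(2 pi) sigma)
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)

total, _ = quad(pdf, -np.inf, np.inf)                        # normalization -> 1
odd, _ = quad(lambda x: (x - mu) * pdf(x), -np.inf, np.inf)  # odd integrand -> 0
print(total, odd)  # ~1.0, ~0.0
```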

Gaussian Probability Tables and Quantiles

 

Means of Gaussians

For $E[X] = \mu$:

$$
\begin{aligned}
E(X) &= \int_{-\infty}^{+\infty} x f(x)\,dx = \int_{-\infty}^{+\infty} \frac{x}{\sqrt{2\pi}\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx \\
&= \int_{-\infty}^{+\infty} \frac{(x-\mu)+\mu}{\sqrt{2\pi}\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx \\
&= \underbrace{\int_{-\infty}^{+\infty} \frac{x-\mu}{\sqrt{2\pi}\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx}_{0} + \mu \underbrace{\int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx}_{1} = \mu
\end{aligned}
$$

 

For $E[X^2] = \sigma^2 + \mu^2$:

$$
\begin{aligned}
E(X^2) &= \int_{-\infty}^{+\infty} x^2 \frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx \\
&= \int_{-\infty}^{+\infty} \frac{(x+\mu)(x-\mu)+\mu^2}{\sqrt{2\pi}\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx \\
&= \int_{-\infty}^{+\infty} \frac{(x+\mu)(x-\mu)}{\sqrt{2\pi}\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx + \mu^2 \\
&= -\frac{\sigma}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} (x+\mu)\, d\!\left(e^{-\frac{(x-\mu)^2}{2\sigma^2}}\right) + \mu^2 \\
&= \underbrace{\left(-\frac{\sigma(x+\mu)}{\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\right)\Bigg|_{-\infty}^{+\infty}}_{0} + \sigma^2 \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx + \mu^2 \\
&= \sigma^2 + \mu^2
\end{aligned}
$$

 

For $E[X^3] = 3\mu\sigma^2 + \mu^3$:

Patch up $x^3$ in terms of $(x-\mu)^3$:

$$x^3 = (x-\mu)^3 + \mu^3 + 3\mu x^2 - 3x\mu^2$$

Then, substituting the $E[X^2]$ and $E[X]$ formulas:

$$
\begin{aligned}
E(X^3) &= \int_{-\infty}^{+\infty} x^3 \frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx \\
&= \int_{-\infty}^{+\infty} \frac{(x-\mu)^3 + \mu^3 + 3\mu x^2 - 3x\mu^2}{\sqrt{2\pi}\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx \\
&= \underbrace{\int_{-\infty}^{+\infty} \frac{(x-\mu)^3}{\sqrt{2\pi}\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx}_{0} + \int_{-\infty}^{+\infty} \frac{3(\mu x^2 - x\mu^2)}{\sqrt{2\pi}\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx + \mu^3 \\
&= 0 + 3\big(\mu E(X^2) - \mu^2 E(X)\big) + \mu^3 = 3\mu\sigma^2 + \mu^3
\end{aligned}
$$

For $E[X^4] = 3\sigma^4 + 6\mu^2\sigma^2 + \mu^4$:

Patch up $x^4$ similarly:

$$x^4 = (x^2+\mu^2)(x+\mu)(x-\mu) + \mu^4$$

So we have,

$$
\begin{aligned}
E(X^4) &= \int_{-\infty}^{+\infty} x^4 \frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx \\
&= \int_{-\infty}^{+\infty} \frac{(x^2+\mu^2)(x+\mu)(x-\mu)}{\sqrt{2\pi}\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx + \mu^4 \\
&= -\frac{\sigma}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} (x^2+\mu^2)(x+\mu)\, d\!\left(e^{-\frac{(x-\mu)^2}{2\sigma^2}}\right) + \mu^4 \\
&= \underbrace{\left(-\frac{\sigma(x^2+\mu^2)(x+\mu)}{\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\right)\Bigg|_{-\infty}^{+\infty}}_{0} + \sigma^2 \int_{-\infty}^{+\infty} \frac{3x^2 + 2\mu x + \mu^2}{\sqrt{2\pi}\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx + \mu^4 \\
&= \sigma^2\big(3E(X^2) + 2\mu E(X) + \mu^2\big) + \mu^4 = 3\sigma^4 + 6\mu^2\sigma^2 + \mu^4
\end{aligned}
$$
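The four moment formulas above can be sanity-checked with a small Monte Carlo sketch ($\mu = 1$, $\sigma = 2$ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 1.0, 2.0
x = rng.normal(mu, sigma, size=1_000_000)

# empirical moments vs. the closed forms derived above
print(x.mean(), mu)                                                # E[X]
print((x**2).mean(), sigma**2 + mu**2)                             # E[X^2]
print((x**3).mean(), 3 * mu * sigma**2 + mu**3)                    # E[X^3]
print((x**4).mean(), 3 * sigma**4 + 6 * mu**2 * sigma**2 + mu**4)  # E[X^4]
```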

For $E[\bar{X}_n] = E[X_1]$:

$$E[\bar{X}_n] = \frac{1}{n}\sum_{i=1}^{n} E[X_i] = E[X_1]$$


For $E[XY]$:

If $X$ and $Y$ are independent, the expectation of the product of $X$ and $Y$ is the product of the individual expectations: $E(XY) = E(X)E(Y)$.

 

Variance of Gaussians

For $V[X] = \sigma^2$:

Working through the integral step by step (substituting $x - \mu \to x$ in the second step), we get:

$$
\begin{aligned}
V &= \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}} (x-\mu)^2\,dx = \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{x^2}{2\sigma^2}} x^2\,dx \\
&= -\frac{\sigma}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} x\, d\!\left(e^{-\frac{x^2}{2\sigma^2}}\right) = -\frac{\sigma}{\sqrt{2\pi}} \left( x e^{-\frac{x^2}{2\sigma^2}}\Big|_{-\infty}^{+\infty} - \int_{-\infty}^{+\infty} e^{-\frac{x^2}{2\sigma^2}}\,dx \right) \\
&= \frac{\sigma}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-\frac{x^2}{2\sigma^2}}\,dx = \sigma^2
\end{aligned}
$$

 

This can be obtained more simply with the expectation results we concluded above:

 

$$V(X) = E\big((X - E(X))^2\big) = E\big(X^2 - 2XE(X) + E(X)^2\big) = E(X^2) - E(X)^2 = E(X^2) - \mu^2 = \sigma^2$$

 

For $V[X^2] = 2\sigma^4 + 4\sigma^2\mu^2$:

$$V(X^2) = E(X^4) - E(X^2)^2 = 3\sigma^4 + 6\mu^2\sigma^2 + \mu^4 - (\sigma^2 + \mu^2)^2 = 2\sigma^4 + 4\sigma^2\mu^2$$

 

For $V(\bar{X}_n) = \frac{\sigma^2}{n}$: since the $X_i$ are independent with common variance $\sigma^2$,

$$V(\bar{X}_n) = \frac{1}{n^2}\sum_{i=1}^{n} V(X_i) = \frac{\sigma^2}{n}$$
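A one-line simulation sketch of this fact (sample size and parameters are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 0.0, 2.0, 50

# variance of the sample mean over many replications vs. sigma^2 / n
xbars = rng.normal(mu, sigma, size=(100_000, n)).mean(axis=1)
print(xbars.var(), sigma**2 / n)  # both ~0.08
```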

Var(XY)

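For independent $X$ and $Y$ (independence is needed for $E(X^2Y^2)$ and $E(XY)$ to factor), a standard expansion gives:

$$
\begin{aligned}
\operatorname{Var}(XY) &= E(X^2Y^2) - E(XY)^2 = E(X^2)\,E(Y^2) - E(X)^2 E(Y)^2 \\
&= \operatorname{Var}(X)\operatorname{Var}(Y) + \operatorname{Var}(X)\,E(Y)^2 + \operatorname{Var}(Y)\,E(X)^2
\end{aligned}
$$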

Covariance

Normally, we have the standard covariance formula between variables $X$ and $Y$:

$$\operatorname{Cov}(X,Y) = E\big[(X - E[X])\,(Y - E[Y])\big]$$


For $X, Y \overset{iid}{\sim} \operatorname{Uni}(0,1)$, consider $E(|X-Y|^a)$. Splitting the unit square along $x = y$ and using symmetry, you should have something like:

$$E(|X-Y|^a) = \iint_{x>y} (x-y)^a\,dx\,dy + \iint_{y>x} (y-x)^a\,dx\,dy = 2\iint_{x>y} (x-y)^a\,dx\,dy$$

Now:

$$\iint_{x>y} (x-y)^a\,dx\,dy = \int_{y=0}^{1}\int_{x=y}^{1} (x-y)^a\,dx\,dy = \frac{1}{(a+1)(a+2)}$$

So:

$$E(|X-Y|^a) = 2\iint_{x>y} (x-y)^a\,dx\,dy = \frac{2}{(a+1)(a+2)}$$
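A quick Monte Carlo sketch of this formula (taking $a = 1$, where it predicts $E|X-Y| = \frac{1}{3}$):

```python
import numpy as np

rng = np.random.default_rng(0)
a = 1.0
x = rng.uniform(size=1_000_000)
y = rng.uniform(size=1_000_000)

# E|X - Y|^a for iid Uni(0,1) vs. the closed form 2 / ((a+1)(a+2))
print((np.abs(x - y) ** a).mean(), 2 / ((a + 1) * (a + 2)))  # both ~0.3333
```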
Covariance of Gaussians

Recall that the covariance of two random variables $X$ and $Y$, denoted $\operatorname{Cov}(X,Y)$, is defined as:

$$\operatorname{Cov}(X,Y) = E\big[(X - E[X])\,(Y - E[Y])\big] = E[XY] - E[X]\,E[Y]$$

$$\operatorname{Var}[X+Y] = \operatorname{Var}[X] + \operatorname{Var}[Y] + 2\operatorname{Cov}[X,Y]$$

For independent $X$ and $Y$, by the law of total variance:

$$
\begin{aligned}
\operatorname{Var}(XY) &= \operatorname{Var}[E(XY \mid X)] + E[\operatorname{Var}(XY \mid X)] \\
&= \operatorname{Var}[X\,E(Y \mid X)] + E[X^2\,\operatorname{Var}(Y \mid X)] \\
&= \operatorname{Var}[X\,E(Y)] + E[X^2\,\operatorname{Var}(Y)] \\
&= E(Y)^2\,\operatorname{Var}(X) + \operatorname{Var}(Y)\,E(X^2)
\end{aligned}
$$

If the covariance between two random variables is 0, are they independent?

False: the criterion for independence is $F(x,y) = F_X(x)\,F_Y(y)$. Zero covariance only rules out linear dependence.
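A classic counterexample, sketched numerically: take $X \sim N(0,1)$ and $Y = X^2$. Then $\operatorname{Cov}(X,Y) = E[X^3] - E[X]\,E[X^2] = 0$, yet $Y$ is a deterministic function of $X$:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1_000_000)
y = x ** 2  # fully determined by x, hence not independent of x

print(np.cov(x, y)[0, 1])  # ~0: zero covariance despite total dependence
```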

Variance of the ln Function


Given $\operatorname{Var}[X]$, how do we calculate $\operatorname{Var}\!\left[\frac{1}{X}\right]$?

Using the Delta Method:

You can use a Taylor series to get an approximation of the low-order moments of a transformed random variable. If the distribution is fairly “tight” around the mean (in a particular sense), the approximation can be pretty good.

$$g(X) = g(\mu) + (X-\mu)\,g'(\mu) + \frac{(X-\mu)^2}{2}\,g''(\mu) + \cdots$$

So

$$
\begin{aligned}
\operatorname{Var}[g(X)] &= \operatorname{Var}\left[g(\mu) + (X-\mu)\,g'(\mu) + \frac{(X-\mu)^2}{2}\,g''(\mu) + \cdots\right] \\
&= \operatorname{Var}\left[(X-\mu)\,g'(\mu) + \frac{(X-\mu)^2}{2}\,g''(\mu) + \cdots\right] \\
&= g'(\mu)^2\,\operatorname{Var}[X-\mu] + 2\,g'(\mu)\,\operatorname{Cov}\!\left[X-\mu,\ \frac{(X-\mu)^2}{2}\,g''(\mu) + \cdots\right] + \operatorname{Var}\!\left[\frac{(X-\mu)^2}{2}\,g''(\mu) + \cdots\right]
\end{aligned}
$$

Often only the first term is taken:

$$\operatorname{Var}[g(X)] \approx g'(\mu)^2\,\operatorname{Var}(X)$$

In this case (assuming I didn't make a mistake), with $g(X) = \frac{1}{X}$ and $g'(\mu) = -\frac{1}{\mu^2}$: $\operatorname{Var}\!\left[\frac{1}{X}\right] \approx \frac{1}{\mu^4}\operatorname{Var}(X)$.
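A simulation sketch of the approximation, in the “tight” regime ($\mu = 5$, $\sigma = 0.2$ chosen arbitrarily so that $X$ stays far from 0):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 5.0, 0.2  # tight around the mean, well away from zero
x = rng.normal(mu, sigma, size=1_000_000)

# empirical Var[1/X] vs. the delta-method approximation Var(X) / mu^4
print((1 / x).var(), sigma**2 / mu**4)  # both ~6.4e-5
```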

 

The Property of Maximum

Given i.i.d. r.v.s $X_i \sim \operatorname{Uni}(0,1)$, let $M_n = \max(X_1, \dots, X_n)$.

Then, because the $X_i$ are i.i.d.:

$$P(M_n \le t) = P\left(\bigcap_{i=1}^{n} \{X_i \le t\}\right) = \prod_{i=1}^{n} P(X_i \le t) = F_X(t)^n$$

where $F_X(\cdot)$ is the CDF of the distribution.

$$P\big(n(1-M_n) \le t\big) = P\left(1 - M_n \le \frac{t}{n}\right) = P\left(M_n \ge 1 - \frac{t}{n}\right) = 1 - P\left(M_n < 1 - \frac{t}{n}\right) = 1 - \left(1 - \frac{t}{n}\right)^n \xrightarrow{n\to\infty} 1 - e^{-t}.$$
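A simulation sketch of this limit, checking the CDF at the single point $t = 1$ (an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 500, 20_000
m = rng.uniform(size=(reps, n)).max(axis=1)  # M_n across replications

t = 1.0
print(np.mean(n * (1 - m) <= t), 1 - np.exp(-t))  # both ~0.632
```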

Limits

For any constant $c$:

$$\lim_{n\to\infty}\left(1 - \frac{c}{n}\right)^n = e^{-c}$$

 

$$\sum_{i=1}^{\infty} \alpha^i = \frac{\alpha}{1-\alpha}, \quad \text{iff } |\alpha| < 1$$

Food for thought:

 

What is the variance of the maximum of a sample?

 

Maximum of uniform random variables

With $z = 1.6448$ (the 0.95 standard normal quantile), the expressions below are the coefficient $b$ and the larger root of the quadratic $\theta^2 - \left(2\bar{X}_n + \frac{z^2}{n}\right)\theta + \bar{X}_n^2 = 0$ (see the quadratic-equation note at the end):

b = -(2*barX_n + 1.6448^2/n)

(2*barX_n + 1.6448^2/n + sqrt((2*barX_n + 1.6448^2/n)^2 - 4*barX_n^2))/2
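On the food-for-thought question above: for $\operatorname{Uni}(0,1)$, $M_n \sim \operatorname{Beta}(n,1)$, so $\operatorname{Var}(M_n) = \frac{n}{(n+1)^2(n+2)}$ (a known closed form, stated here without derivation); a simulation sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 20, 200_000
m = rng.uniform(size=(reps, n)).max(axis=1)

# empirical variance of the maximum vs. n / ((n+1)^2 (n+2))
print(m.var(), n / ((n + 1) ** 2 * (n + 2)))  # both ~0.00206
```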

Mode of Convergence Examples

Example of a.s.

Let $U \sim \operatorname{Uni}(0,1)$ and $X_n = U + U^n$.

Claim: $X_n \xrightarrow{a.s.} U$.

Proof:

By the law of total probability, for any event $A$ and a partition $\{S_1, S_2\}$,

$$P(A) = P(A \mid S_1)P(S_1) + P(A \mid S_2)P(S_2)$$

Here, partition on the value of $U$:

$$\begin{cases} X_n \xrightarrow{n\to\infty} U & \text{if } U \in (0,1) \\ X_n \xrightarrow{n\to\infty} 2 & \text{if } U = 1 \end{cases}$$

So we have

$$P(X_n \to U) = P(X_n \to U \mid U < 1)\,P(U < 1) + 0 = 1$$

Example 1 of P

Let $X_n \sim \operatorname{Ber}\!\left(\frac{1}{n}\right)$ independently, and fix $\epsilon \in (0,1)$. We have:

$$P(X_n > \epsilon) = P(X_n = 1) = \frac{1}{n} \xrightarrow{n\to\infty} 0$$

Example 2 of P

Let $X_i \overset{iid}{\sim} \operatorname{Uni}(0,1)$ and $X_{(1)} = \min_i X_i$.

Claim: $X_{(1)} \xrightarrow{P} 0$.

Proof:

Fix ε>0,

$$P(|X_{(1)} - 0| > \varepsilon) = P(X_{(1)} > \varepsilon) = P(X_i > \varepsilon,\ \forall i) = \big(P(X_1 > \varepsilon)\big)^n = \left(\int_{\varepsilon}^{1} dx\right)^n = (1-\varepsilon)^n \xrightarrow{n\to\infty} 0$$

Example of converging in P but not a.s.

Let $U \sim \operatorname{Uni}(0,1)$, and define $X_n$ by adding an indicator bump over successively finer subdivisions of $[0,1]$:

$$
\begin{aligned}
X_1 &= U + \mathbb{1}\big(U \in [0,1]\big) = U + 1 \\
X_2 &= U + \mathbb{1}\big(U \in [0,\tfrac12]\big) \\
X_3 &= U + \mathbb{1}\big(U \in [\tfrac12,1]\big) \\
X_4 &= U + \mathbb{1}\big(U \in [0,\tfrac13]\big) \\
X_5 &= U + \mathbb{1}\big(U \in [\tfrac13,\tfrac23]\big) \\
X_6 &= U + \mathbb{1}\big(U \in [\tfrac23,1]\big) \\
&\ \,\vdots
\end{aligned}
$$

Claim 1: $X_n \xrightarrow{P} U$. Fix $0 < \varepsilon < 1$ and write $[a_n, b_n]$ for the bump interval of $X_n$:

$$P(|X_n - U| > \varepsilon) = P(U \in [a_n, b_n]) = b_n - a_n \xrightarrow{n\to\infty} 0$$

Claim 2: $X_n \not\xrightarrow{a.s.} U$. For any fixed value of $U$, the bump interval covers $U$ once in every block of the subdivision, so $X_n$ keeps jumping between $U$ and $U + 1$ and the sequence has no limit:

$$P(X_n \to U) = 0 \neq 1$$
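A sketch of this behavior in code; the helper `bump_interval` (my own indexing construction, not from the original) maps $n$ to the interval carrying the bump:

```python
import numpy as np

def bump_interval(n):
    # Block k contributes the k intervals [j/k, (j+1)/k], j = 0..k-1;
    # map the 1-based index n to its interval.
    k = 1
    while k * (k + 1) // 2 < n:
        k += 1
    j = n - k * (k - 1) // 2 - 1
    return j / k, (j + 1) / k

rng = np.random.default_rng(0)
U = rng.uniform()

# Indices n where X_n = U + 1: one per block, forever, so the sample path
# keeps jumping between U and U + 1 and never converges (no a.s. limit),
# even though P(|X_n - U| > eps) = 1/k -> 0 (convergence in probability).
hits = [n for n in range(1, 2001) if bump_interval(n)[0] <= U < bump_interval(n)[1]]
print(hits[:8])
```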

Quadratic equation

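For reference, the roots of $a\theta^2 + b\theta + c = 0$ are:

$$\theta = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$

With $a = 1$, $b = -\left(2\bar{X}_n + \frac{1.6448^2}{n}\right)$, and $c = \bar{X}_n^2$, the larger root is exactly the expression given under “Maximum of uniform random variables” above.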