
The K distribution

R. Wayne Oldford

Definition

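Suppose $X \sim \chi^2_m$ is a Chi-squared random variate on $m$ degrees of freedom. Then the distribution of $Y = \sqrt{X/m}$ is the Kay distribution on $m$ degrees of freedom, written as $Y \sim K_m$. Its density is

$$
f(y) = \begin{cases}
\dfrac{m^{\frac{m}{2}}\, y^{m-1}\, e^{-\frac{1}{2} m y^2}}{2^{\frac{m}{2}-1}\,\Gamma\left(\frac{m}{2}\right)} & \text{for } 0 \le y < \infty \\[1ex]
0 & \text{otherwise.}
\end{cases}
$$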
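The density follows from the Chi-squared density by a change of variables. A short derivation, using $g$ as an assumed name for the $\chi^2_m$ density (the symbol $g$ does not appear elsewhere in this text):

$$
\begin{aligned}
g(x) &= \frac{x^{\frac{m}{2}-1}\, e^{-\frac{1}{2}x}}{2^{\frac{m}{2}}\,\Gamma\left(\frac{m}{2}\right)}, \quad x > 0, \qquad x = m y^2, \qquad \frac{dx}{dy} = 2 m y, \\[1ex]
f(y) &= g(m y^2)\, 2 m y
      = \frac{(m y^2)^{\frac{m}{2}-1}\, e^{-\frac{1}{2} m y^2}\, 2 m y}{2^{\frac{m}{2}}\,\Gamma\left(\frac{m}{2}\right)}
      = \frac{m^{\frac{m}{2}}\, y^{m-1}\, e^{-\frac{1}{2} m y^2}}{2^{\frac{m}{2}-1}\,\Gamma\left(\frac{m}{2}\right)}, \quad y \ge 0 .
\end{aligned}
$$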

The $K_m$ density has some very attractive features over the $\chi^2_m$ density:

[Figure: As $m$ increases, $K_m$ has better properties]

These values were calculated using the dkay(...) density function. For example, dkay(1.0, df=10) = 1.7546737.
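The density can be sketched with base R alone, since the $K_m$ density is the Chi-squared density evaluated at $m y^2$ times the Jacobian $2 m y$. The name `dkay_sketch` is hypothetical and is not the package's `dkay(...)`; it is only meant to show the relation.

```r
# Sketch of a K_m density built on base R's dchisq().
# dkay_sketch is a hypothetical name, not the package's dkay().
dkay_sketch <- function(x, df) {
  # If X = df * Y^2 is chi-squared on df degrees of freedom,
  # then f_Y(y) = g_X(df * y^2) * 2 * df * y for y >= 0, and 0 otherwise.
  ifelse(x < 0, 0, 2 * df * x * dchisq(df * x^2, df = df))
}

# The text reports dkay(1.0, df=10) = 1.7546737
dkay_sketch(1.0, df = 10)

# Sanity check: the density should integrate to (approximately) 1
integrate(dkay_sketch, lower = 0, upper = Inf, df = 10)$value
```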

Normal theory relations

Perhaps the most obvious relation between a normal random variate and a $K_m$ is that if $Z \sim N(0,1)$, then $|Z| \sim K_1$, the half-normal.
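The half-normal relation can be checked numerically with base R. This is a small sketch: the $K_1$ quantiles are computed as the square root of $\chi^2_1$ quantiles, and the half-normal quantiles from $P(|Z| \le q) = 2\Phi(q) - 1$. The variable names are illustrative only.

```r
# Cumulative probabilities at which to compare quantiles
p <- c(0.05, 0.5, 0.95)

# K_1 quantiles: square root of chi-squared(1) quantiles divided by df = 1
k1_quantiles <- sqrt(qchisq(p, df = 1) / 1)

# Half-normal quantiles: solve 2 * pnorm(q) - 1 = p for q
halfnormal_quantiles <- qnorm((1 + p) / 2)

# The two sets of quantiles should agree up to numerical error
all.equal(k1_quantiles, halfnormal_quantiles)
```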

More important in applications is that the distribution of the sample standard deviation estimator is proportional to a $K_m$. To be precise, if $Y_1, \ldots, Y_n$ are independent and identically distributed $N(\mu, \sigma^2)$ random variates, with realizations $y_1, \ldots, y_n$ and the usual estimates $\hat{\mu} = \sum y_i / n$ and $\hat{\sigma} = \sqrt{\sum (y_i - \hat{\mu})^2 / (n-1)}$, then the corresponding estimators $\tilde{\mu}$ and $\tilde{\sigma}$ are distributed as

$$
\tilde{\mu} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \quad \text{and} \quad \frac{\tilde{\sigma}}{\sigma} \sim K_{n-1}.
$$

The latter shows that $K_m$ can be used for inference (e.g. tests and confidence intervals) about $\sigma$.

This is handy because the $K_m$ quantiles vary much less than do those of $\chi^2_m$. For example, consider the following table of $K_m$ quantiles, where $p$ is the cumulative probability and df is the degrees of freedom $m$.

df p=0.05 p=0.5 p=0.95
1 0.0627068 0.6744898 1.959964
2 0.2264802 0.8325546 1.730818
3 0.3424648 0.8880642 1.613973
4 0.4215220 0.9160641 1.540108
5 0.4786390 0.9328944 1.487985
6 0.5220764 0.9441152 1.448654
7 0.5564364 0.9521263 1.417601
8 0.5844481 0.9581311 1.392269
9 0.6078297 0.9627987 1.371090
10 0.6277180 0.9665308 1.353035
15 0.6957463 0.9777136 1.290886
20 0.7365735 0.9832962 1.253205
25 0.7644974 0.9866425 1.227232
30 0.7851255 0.9888719 1.207932
35 0.8011601 0.9904636 1.192858
40 0.8140839 0.9916570 1.180662

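Unlike those of the $\chi^2_m$ distribution, the quantiles in this table stabilize as $m$ grows, so that $1 \pm 0.20$ is not a bad rule of thumb for the central 90% probability range of the ratio $\tilde{\sigma}/\sigma$ once $m$ is moderately large (in the table, by about $m = 35$).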

These values were calculated using the qkay(...) quantile function. For example, qkay(0.05, df=5) = 0.478639. These would be used to construct interval estimates for σ.
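As a sketch of how such quantiles lead to an interval for $\sigma$: since $\tilde{\sigma}/\sigma \sim K_{n-1}$, a 90% interval is $[\hat{\sigma}/q_{0.95},\ \hat{\sigma}/q_{0.05}]$. The function `qkay_sketch` below is a hypothetical base R stand-in for `qkay(...)`, and the data vector `y` is made up purely for illustration.

```r
# Hypothetical stand-in for qkay(): K_m quantiles from chi-squared quantiles
qkay_sketch <- function(p, df) sqrt(qchisq(p, df = df) / df)

# Compare with the df = 5 row of the table in the text
qkay_sketch(c(0.05, 0.5, 0.95), df = 5)

# Made-up data, for illustration only
y <- c(9.8, 10.4, 10.1, 9.5, 10.9, 10.2, 9.7, 10.6)
n <- length(y)
sigma_hat <- sd(y)  # usual estimate with divisor n - 1

# P(q_lo <= sigma_tilde / sigma <= q_hi) = 0.90 under the normal model
q_lo <- qkay_sketch(0.05, df = n - 1)
q_hi <- qkay_sketch(0.95, df = n - 1)

# Invert the inequality to get a 90% interval for sigma
interval <- c(lower = sigma_hat / q_hi, upper = sigma_hat / q_lo)
interval
```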

To get observed significance levels, the cumulative distribution function pkay(...) would be used. For example, SL = 1- pkay(1.4, df=10) = 1 - 0.9667287 = 0.0332713.
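The significance level above can be reproduced with base R under the assumption that the $K_m$ distribution function is the $\chi^2_m$ distribution function evaluated at $m q^2$. The name `pkay_sketch` is hypothetical and stands in for `pkay(...)`.

```r
# Hypothetical stand-in for pkay(): P(Y <= q) = P(X <= df * q^2), X ~ chi-squared(df)
pkay_sketch <- function(q, df) {
  ifelse(q < 0, 0, pchisq(df * q^2, df = df))
}

# Upper-tail probability of a ratio as large as 1.4 on 10 degrees of freedom;
# the text reports 1 - 0.9667287 = 0.0332713
SL <- 1 - pkay_sketch(1.4, df = 10)
SL
```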

The Student t distribution

For the standard normal theory, the Student $t_m$ distribution can be defined as follows. If $Z \sim N(0,1)$ and $Y \sim K_m$ is distributed independently of $Z$, then the ratio

$$
T = \frac{Z}{Y} = \frac{N(0,1)}{K_m} = t_m
$$

which is fairly easy to remember.
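A quick simulation can make this relation concrete. The sketch below draws $Z$ and $Y$ independently using only base R (with $Y$ generated as $\sqrt{X/m}$ from Chi-squared draws), and compares simulated quantiles of $Z/Y$ with those of `qt()`. The seed, sample size, and choice $m = 5$ are arbitrary.

```r
set.seed(1)       # arbitrary seed, for reproducibility only
m <- 5            # degrees of freedom (arbitrary choice)
n_sim <- 100000   # number of simulated ratios

z <- rnorm(n_sim)                     # Z ~ N(0, 1)
y <- sqrt(rchisq(n_sim, df = m) / m)  # Y ~ K_m, drawn independently of Z
t_sim <- z / y                        # the ratio should behave like t_m

# Simulated quantiles next to Student t quantiles; they should be close
probs <- c(0.05, 0.5, 0.95)
rbind(simulated = quantile(t_sim, probs),
      student_t = qt(probs, df = m))
```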

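For the estimators from the above model,

$$
\frac{\tilde{\mu} - \mu}{\tilde{\sigma}/\sqrt{n}} = \frac{\left(\dfrac{\tilde{\mu} - \mu}{\sigma/\sqrt{n}}\right)}{\left(\dfrac{\tilde{\sigma}}{\sigma}\right)} = \frac{N(0,1)}{K_{n-1}} = t_{n-1}
$$

is used to construct interval estimates and tests for the value of the parameter $\mu$.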

The functions

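As with every other distribution in R, four functions are provided for the $K_m$ distribution. These are the density dkay(x, df, ...), the cumulative distribution function pkay(x, df, ...), the quantile function qkay(p, df, ...), and the pseudo-random generator rkay(n, df, ...).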

The parameters in the ellipsis include a non-centrality parameter. All functions rely on the corresponding $\chi^2_m$ functions in base R.
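One way such wrappers could be built on the base R Chi-squared functions is sketched below. The `*_sketch` names are hypothetical, and forwarding `ncp` unchanged through `...` is an assumption: how the package maps its own non-centrality parameter onto the Chi-squared one is not stated here. The sketch assumes non-negative arguments and is intended only for forwarding `ncp`.

```r
# Hypothetical wrappers around base R's chi-squared functions.
# Extra arguments in ... (assumed here to be ncp only) are passed on unchanged.
dkay_sketch <- function(x, df, ...) 2 * df * x * dchisq(df * x^2, df = df, ...)
pkay_sketch <- function(q, df, ...) pchisq(df * q^2, df = df, ...)
qkay_sketch <- function(p, df, ...) sqrt(qchisq(p, df = df, ...) / df)
rkay_sketch <- function(n, df, ...) sqrt(rchisq(n, df = df, ...) / df)

# Round trip in the central case: should return (approximately) 0.9
pkay_sketch(qkay_sketch(0.9, df = 10), df = 10)

# Round trip with an (arbitrary) non-centrality parameter forwarded to chi-squared
pkay_sketch(qkay_sketch(0.9, df = 10, ncp = 2), df = 10, ncp = 2)
```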

We briefly illustrate each below.

The density dkay(x, df, ...)

The cumulative distribution function pkay(x, df, ...)

The quantile function qkay(p, df, ...)

Pseudo-random realizations rkay(n, df, ...)
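A minimal sketch of generating pseudo-random $K_m$ realizations with base R, assuming they can be obtained as $\sqrt{X/m}$ from Chi-squared draws $X$. The name `rkay_sketch` is hypothetical and is not the package's `rkay(...)`; the seed and sample size are arbitrary.

```r
set.seed(2)  # arbitrary seed, for reproducibility only

# Hypothetical stand-in for rkay(): transform chi-squared draws
rkay_sketch <- function(n, df) sqrt(rchisq(n, df = df) / df)

y <- rkay_sketch(10000, df = 10)

# Sample quantiles; these should be near the df = 10 row of the quantile table
quantile(y, c(0.05, 0.5, 0.95))
```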