Weighted averaging

Derivation of formulae for weighted averaging

(Document prepared in connection with this report)

Define weighted mean or average as:

y = x₁ w₁ + x₂ w₂ + ... + x_N w_N = Σ_{i=1}^{N} x_i w_i,    (1)

where w_i are weights and x_i are themselves arithmetic averages. Assume that all x_i come from averaging the same stationary normal (Gaussian) process with mean x and standard deviation σ, and that each average is taken over the same number of samples. This means that the expected value of the standard deviation of each x_i is the same and is √N times higher than σ:

σ_i = √N σ.

Our derivation is nevertheless carried out as if all σ_i were different since we would like to relate them to the measured variances, which are different from each other.
The expected value of the weighted mean is

< y > = < Σ x_i w_i > = Σ w_i < x_i > = x Σ w_i.

Thus, for y to be an unbiased estimate of the mean value, we need the condition:

Σw_i = 1.

The quantity y has the variance:

v = < (y – < y > )² > = < y² > – < y > ².

We have < y² > = < (Σ x_i w_i)² > = < Σ x_i x_j w_i w_j > = Σ w_i w_j < x_i x_j > . Noting that < x_i x_j > = < (x_i – x + x)(x_j – x + x) > and < x_i – x > = 0, while < (x_i – x)(x_j – x) > is nonzero only for i = j (for independent means), when it assumes the value of the expected variance of the i-th mean, v_i = σ_i², we straightforwardly arrive at

< y² > = x² Σ w_i w_j + Σ σ_i² w_i²,

so that the expected variance of y is

x² Σ w_i w_j + Σ σ_i²w_i² – (x Σ w_i)² or just

v = Σ σ_i² w_i² = Σ v_i w_i².    (2)
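Eq. (2) can be checked numerically. The following is a minimal Monte Carlo sketch in Python, not part of the original derivation; the variances, weights, trial count and function names are illustrative assumptions. It draws independent Gaussian means x_i with prescribed variances v_i and compares the scatter of y with Σ v_i w_i².

```python
import random

def predicted_variance(variances, weights):
    """Variance of the weighted mean from Eq. (2): sum of v_i * w_i**2."""
    return sum(v * w * w for v, w in zip(variances, weights))

def simulated_variance(variances, weights, mean=10.0, trials=100_000, seed=1):
    """Empirical variance of y = sum(x_i * w_i) for independent Gaussian x_i."""
    rng = random.Random(seed)              # fixed seed for repeatability
    sigmas = [v ** 0.5 for v in variances]
    ys = [
        sum(w * rng.gauss(mean, s) for w, s in zip(weights, sigmas))
        for _ in range(trials)
    ]
    y_bar = sum(ys) / trials
    return sum((y - y_bar) ** 2 for y in ys) / trials

variances = [0.5, 1.0, 4.0]   # assumed v_i, chosen only for illustration
weights = [0.2, 0.5, 0.3]     # arbitrary weights that sum to 1

v_pred = predicted_variance(variances, weights)
v_sim = simulated_variance(variances, weights)
print(f"Eq. (2): {v_pred:.4f}   simulated: {v_sim:.4f}")
```

The two numbers should agree to within the statistical noise of the simulation. The weights here are deliberately not the inverse-variance ones, since Eq. (2) holds for any fixed weights.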

This is a quite general formula for the variance of a weighted mean. Now we define the following weighting:

w_i = (1/v_i) / Σ (1/v_i),    (3)

where v_i are the sample variances of real measurements. These weights satisfy the condition of summing to 1. Inserting them into Eq. (2) we obtain v = Σ v_i[(1/v_i)/Σ (1/v_i)]² = (Σ v_i/v_i²) / (Σ 1/v_i)², which reduces to the very simple equation

v = 1 / Σ (1/v_i).    (4)

This formula is known in the literature in the form (see e.g. Wikipedia):

v = 1 / Σ (1/σ_i²).

A hasty search for a derivation of this expression was unsuccessful, so we present it here in this small document. An attentive reader might have noticed an inconsistency in the above derivation of Eq. (4): the variances in Eq. (3) are sample values, while those in Eq. (2) are expected values! One might instead substitute N σ² for v_i in Eq. (2) to easily arrive at a more proper final formula:

v = Nσ² Σ(1/v_i)²/(Σ 1/v_i)².

Unfortunately, such a formula is more complex and would require computing some approximation to σ (which could be just the rms of all the data). Sticking to the less strict solution of Eq. (4) should not, we believe, cause any problems in our VLBI practice, since ultimately the measured values converge to those expected and the error estimate need not exactly reflect the true dispersion. It can also be shown (see the section "Approximation by error propagation method" below) that the same formula is obtained by the Gaussian error propagation method (valid for small errors).

Earlier we assumed v_i = N σ², so if the data are not contaminated we can expect Eq. (4) to lead to a variance of 1/[Σ 1/(N σ²)] = σ², i.e. the same as when calculating the arithmetic (unweighted) mean of the same data.
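The relation between Eq. (4) and the stricter formula above can be illustrated with a small Python sketch. The function names, the value of σ and the sample variances are assumptions made only for this example; in practice σ would have to be estimated from the data, as noted above.

```python
def variance_eq4(sample_variances):
    """Eq. (4): v = 1 / sum(1 / v_i)."""
    return 1.0 / sum(1.0 / v for v in sample_variances)

def variance_strict(sample_variances, sigma):
    """Stricter form: v = N sigma^2 sum((1/v_i)^2) / (sum(1/v_i))^2."""
    n = len(sample_variances)
    inverse = [1.0 / v for v in sample_variances]
    return n * sigma ** 2 * sum(u * u for u in inverse) / sum(inverse) ** 2

sigma = 0.5                       # assumed value, for illustration only
n = 4
ideal = [n * sigma ** 2] * n      # uncontaminated case: every v_i = N sigma^2

v_ideal_eq4 = variance_eq4(ideal)
v_ideal_strict = variance_strict(ideal, sigma)
print(v_ideal_eq4, v_ideal_strict, sigma ** 2)

# Sample variances scattered around N sigma^2 (made-up numbers).
scattered = [0.8, 1.0, 1.2, 1.0]
print(variance_eq4(scattered), variance_strict(scattered, sigma))
```

When all v_i equal N σ², both expressions return σ². They differ only when the sample variances scatter around their expected value.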

To sum up, the variance of the weighted mean computed according to Eqs. (1) and (3) can be calculated according to Eq. (4), thus the error estimate of such a mean is

SIG = ( Σ 1/v_i )^(–1/2).    (5)
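As a hedged illustration of Eqs. (1), (3) and (5) taken together, here is a short Python sketch. The function name weighted_mean, its argument names and the input numbers are assumptions of this example, not part of the original note.

```python
import math

def weighted_mean(values, variances):
    """Return (y, sig): the weighted mean of Eq. (1) with the weights of
    Eq. (3), and its error estimate SIG from Eq. (5)."""
    if len(values) != len(variances) or not values:
        raise ValueError("need two equally long, non-empty sequences")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    inverse = [1.0 / v for v in variances]
    total = sum(inverse)                      # sum of 1/v_i
    weights = [u / total for u in inverse]    # Eq. (3); these sum to 1
    y = sum(x * w for x, w in zip(values, weights))
    sig = 1.0 / math.sqrt(total)              # Eq. (5)
    return y, sig

# Two means with sample variances 1 and 4 (made-up numbers).
y, sig = weighted_mean([10.0, 12.0], [1.0, 4.0])
print(y, sig)
```

The mean with the smaller variance dominates: here the weights are 0.8 and 0.2.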

In the case of N = 2 and our weighting method of Eq. (3), the equations for the mean and variance simplify to

x = (x₁ v₂ + x₂ v₁) / (v₁ + v₂),    (6)

v = SIG² = v₁ v₂ / (v₁ + v₂).    (7)
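The two-point formulas can be compared with the general ones in a few lines of Python. This is only a sketch: the helper names and the input numbers are hypothetical.

```python
def mean_of_two(x1, v1, x2, v2):
    """Mean and variance for N = 2, following Eqs. (6) and (7)."""
    x = (x1 * v2 + x2 * v1) / (v1 + v2)
    v = v1 * v2 / (v1 + v2)
    return x, v

def mean_general(values, variances):
    """Mean from Eqs. (1) and (3), variance from Eq. (4), for any N."""
    inverse = [1.0 / v for v in variances]
    total = sum(inverse)
    y = sum(x * u for x, u in zip(values, inverse)) / total
    return y, 1.0 / total

x_pair, v_pair = mean_of_two(10.0, 1.0, 12.0, 4.0)
x_gen, v_gen = mean_general([10.0, 12.0], [1.0, 4.0])
print(x_pair, v_pair)
print(x_gen, v_gen)
```

Both routes should give the same mean and variance up to rounding.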

Approximation by error propagation method

Suppose the dispersions σ_i of our means are small, i.e. the measurement errors Δx_i in x_i are small. The general formula for the error in the weighted mean y is in this case (provided the errors in the x_i samples are completely independent):

Δy = √[ Σ ( (∂y/∂x_i) Δx_i )² ].

Since (∂y/∂x_i) Δx_i = w_i Δx_i = [(1/σ_i²) / (Σ1/σ_j²)] Δx_i, by substituting Δx_i = σ_i we obtain

Δy = √[ Σ ( (1/σ_i²) σ_i / (Σ 1/σ_j²) )² ] = √( Σ 1/σ_j² ) / ( Σ 1/σ_j² ) = ( Σ 1/v_j )^(–1/2),

which is the same as Eq. (5).
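The agreement with Eq. (5) can also be checked numerically. The Python sketch below is an illustration under assumptions: the weights of Eq. (3) are held fixed while the partial derivatives ∂y/∂x_i are estimated by central differences, and the function names, step size and input numbers are made up for the example.

```python
import math

def weighted_sum(values, weights):
    """y = sum(x_i * w_i), Eq. (1), for fixed weights."""
    return sum(x * w for x, w in zip(values, weights))

def propagated_error(values, sigmas, step=1e-6):
    """Gaussian error propagation: sqrt(sum((dy/dx_i * sigma_i)**2))."""
    inverse = [1.0 / s ** 2 for s in sigmas]
    total = sum(inverse)
    weights = [u / total for u in inverse]    # Eq. (3), treated as constants
    squares = 0.0
    for i, s in enumerate(sigmas):
        up = list(values)
        down = list(values)
        up[i] += step
        down[i] -= step
        # Central-difference estimate of the partial derivative dy/dx_i.
        dy_dx = (weighted_sum(up, weights) - weighted_sum(down, weights)) / (2 * step)
        squares += (dy_dx * s) ** 2
    return math.sqrt(squares)

values = [10.0, 12.0, 11.0]   # made-up means x_i
sigmas = [1.0, 2.0, 0.5]      # made-up dispersions sigma_i

delta_y = propagated_error(values, sigmas)
sig = 1.0 / math.sqrt(sum(1.0 / s ** 2 for s in sigmas))   # Eq. (5)
print(delta_y, sig)
```

Because y is linear in the x_i, each numerical derivative is simply w_i, and the two printed values should coincide to within rounding.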

— K.M. Borkowski

Posted: May 29, 2007; last updated: June 11, 2007

File translated from TeX by TtH, version 3.77, on 25 May 2007.