Derivation of formulae for weighted averaging
Define the weighted mean (average) as
$$ y = x_1 w_1 + x_2 w_2 + \dots + x_N w_N = \sum_{i=1}^{N} x_i w_i , \tag{1} $$
where $w_i$ are weights and $x_i$ are themselves arithmetic averages.
Assume that all $x_i$ come from averaging the same stationary normal (Gaussian)
process with mean $x$ and standard deviation $\sigma$,
and that each average is taken over the same number of samples. This means that the expected
value of the standard deviation of each $x_i$ is the same and is
$\sqrt{N}$ times higher than $\sigma$:
$$ \sigma_i = \sqrt{N}\,\sigma . $$
Our derivation is nevertheless carried out as if all
$\sigma_i$ were different, since we would like to relate
them to the measured variances, which do differ from one another.
The expected value of the weighted mean is
$$ \langle y \rangle = \Big\langle \sum_i x_i w_i \Big\rangle = \sum_i w_i \langle x_i \rangle = x \sum_i w_i . $$
Thus, for $y$ to represent an unbiased estimate of the mean value,
we need the condition
$$ \sum_i w_i = 1 . $$
The quantity $y$ has the variance
$$ v = \langle (y - \langle y \rangle)^2 \rangle = \langle y^2 \rangle - \langle y \rangle^2 . $$
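We have
$$ \langle y^2 \rangle = \Big\langle \Big( \sum_i x_i w_i \Big)^2 \Big\rangle = \Big\langle \sum_{i,j} x_i x_j w_i w_j \Big\rangle = \sum_{i,j} w_i w_j \langle x_i x_j \rangle . $$
Noting that $\langle x_i x_j \rangle = \langle (x_i - x + x)(x_j - x + x) \rangle$,
that $\langle x_i - x \rangle = 0$, and that
$\langle (x_i - x)(x_j - x) \rangle$
is nonzero only for $i = j$ (the means being independent), when it assumes the value of the expected variance of
the $i$-th mean, $v_i = \sigma_i^2$,
we straightforwardly arrive at
$$ \langle y^2 \rangle = x^2 \sum_{i,j} w_i w_j + \sum_i \sigma_i^2 w_i^2 , $$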
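so that the expected variance of $y$ is
$x^2 \sum_{i,j} w_i w_j + \sum_i \sigma_i^2 w_i^2 - \big( x \sum_i w_i \big)^2$
or, since $\sum_{i,j} w_i w_j = \big( \sum_i w_i \big)^2$, just
$$ v = \sum_i \sigma_i^2 w_i^2 = \sum_i v_i w_i^2 . \tag{2} $$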
This is a quite general formula for the variance of a weighted mean.
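Eq. (2) can be checked numerically. The following is only a minimal sketch using the Python standard library: the function names, the three standard deviations, the weights and the process mean are illustrative assumptions, not values from this note. It draws many independent Gaussian sets of the means, forms the weighted mean of Eq. (1) for each, and compares the empirical variance with the prediction of Eq. (2).

```python
import random
import statistics


def weighted_mean(xs, ws):
    """Eq. (1): y = sum of x_i * w_i."""
    return sum(x * w for x, w in zip(xs, ws))


def predicted_variance(sigmas, ws):
    """Eq. (2): v = sum of sigma_i^2 * w_i^2."""
    return sum((s * w) ** 2 for s, w in zip(sigmas, ws))


def simulated_variance(mean, sigmas, ws, trials=100_000, seed=1):
    """Empirical variance of y over many independent Gaussian draws of the x_i."""
    rng = random.Random(seed)
    ys = [
        weighted_mean([rng.gauss(mean, s) for s in sigmas], ws)
        for _ in range(trials)
    ]
    return statistics.pvariance(ys)


# Illustrative values only (assumptions, not taken from the text).
sigmas = [1.0, 2.0, 0.5]   # standard deviations of the individual means
ws = [0.2, 0.3, 0.5]       # arbitrary weights that sum to 1
v_pred = predicted_variance(sigmas, ws)
v_sim = simulated_variance(10.0, sigmas, ws)
print(v_pred, v_sim)
```

The two printed numbers should agree to within the statistical scatter of the simulation; the weights here are deliberately arbitrary, since Eq. (2) does not depend on any particular weighting.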
Now we define the following weighting:
$$ w_i = \frac{1/v_i}{\sum_j 1/v_j} , \tag{3} $$
where $v_i$ are sample variances of real measurements. These weights satisfy
the condition of summing up to 1.
Inserting them into Eq. (2) we obtain
$$ v = \sum_i v_i \left[ \frac{1/v_i}{\sum_j 1/v_j} \right]^2 = \frac{\sum_i v_i / v_i^2}{\big( \sum_j 1/v_j \big)^2} , $$
which reduces to this very simple equation:
$$ v = \frac{1}{\sum_i 1/v_i} . \tag{4} $$
This formula is known in the literature in the form
(see e.g. Wikipedia):
$$ \sigma_y^2 = \frac{1}{\sum_i 1/\sigma_i^2} . $$
A hasty search for a derivation of this expression proved unsuccessful,
so we present it here in this small document. An attentive reader might
have noted an inconsistency in the above derivation of Eq. (4).
Namely, the variances in Eq. (3) are
sample values, while those in Eq. (2) are expected values! One might instead have
substituted $N\sigma^2$ for $v_i$ in Eq. (2)
to arrive easily at a more proper final formula:
$$ v = N\sigma^2 \, \frac{\sum_i (1/v_i)^2}{\big( \sum_i 1/v_i \big)^2} . $$
Unfortunately, such a formula is much more complex and would require computing
some approximation to $\sigma$ (which could be just the rms of
all the data). Sticking to the less
strict solution of Eq. (4) should not, we believe, cause any problems in our
VLBI practice, since the measured values ultimately converge to the expected ones
and the error estimate need not exactly reflect the true dispersion.
It can also be shown (see the section "Approximation by error propagation method" below) that the same formula
follows from the Gaussian error propagation method (valid for small errors).
Earlier we assumed $v_i = N\sigma^2$,
so that if the data are not contaminated
we can expect Eq. (4) to lead to the variance
$$ \frac{1}{\sum_{i=1}^{N} 1/(N\sigma^2)} = \sigma^2 , $$
i.e. the same as when calculating the arithmetic (unweighted) mean of the same data.
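As an illustration, the weighting of Eq. (3), the mean of Eq. (1) and the variance of Eq. (4) fit in a few lines of Python. This is a sketch under stated assumptions rather than a reference implementation: the function name `inverse_variance_mean`, the input checks and the sample numbers are all hypothetical. The usage part reproduces the uncontaminated case discussed above, in which every mean has the same variance $N\sigma^2$.

```python
import math


def inverse_variance_mean(xs, vs):
    """Return (y, v): weighted mean by Eqs. (1) and (3), variance by Eq. (4)."""
    if len(xs) != len(vs) or not xs:
        raise ValueError("xs and vs must be non-empty and of equal length")
    if any(v <= 0 for v in vs):
        raise ValueError("variances must be positive")
    inv_sum = sum(1.0 / v for v in vs)
    ws = [(1.0 / v) / inv_sum for v in vs]   # Eq. (3): weights sum to 1
    y = sum(x * w for x, w in zip(xs, ws))   # Eq. (1)
    v = 1.0 / inv_sum                        # Eq. (4)
    return y, v


# Uncontaminated case: all N means share the variance N * sigma^2.
N, sigma = 4, 0.5
xs = [9.8, 10.1, 10.3, 9.8]                  # illustrative values only
y, v = inverse_variance_mean(xs, [N * sigma**2] * N)
print(y, v, math.sqrt(v))
```

With equal variances the weights are all 1/N, so `y` is the plain arithmetic mean of `xs` and `v` comes out as `sigma**2`, up to floating-point rounding.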
To sum up, the variance of the weighted mean computed according to Eqs. (1) and (3)
can be calculated according to Eq. (4); thus the error estimate of such a mean is
$$ \mathrm{SIG} = \sqrt{v} = \Big( \sum_i \frac{1}{v_i} \Big)^{-1/2} . \tag{5} $$
In the case of $N = 2$ and our weighting method of Eq. (3), the equations for
the mean and variance simplify to
$$ y = \frac{x_1 v_2 + x_2 v_1}{v_1 + v_2} , \tag{6} $$
$$ v = \mathrm{SIG}^2 = \frac{v_1 v_2}{v_1 + v_2} . \tag{7} $$
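The two-point formulas can be compared against the general inverse-variance expressions. The sketch below assumes hypothetical function names (`combine_two`, `combine_general`) and made-up input values, chosen only so that the arithmetic is easy to follow by hand.

```python
def combine_two(x1, v1, x2, v2):
    """Eqs. (6) and (7): weighted mean and its variance for two means."""
    y = (x1 * v2 + x2 * v1) / (v1 + v2)
    v = v1 * v2 / (v1 + v2)
    return y, v


def combine_general(xs, vs):
    """Eqs. (1), (3) and (4) for any number of means."""
    inv_sum = sum(1.0 / v for v in vs)
    y = sum(x / v for x, v in zip(xs, vs)) / inv_sum
    return y, 1.0 / inv_sum


# Illustrative values only: the first mean is four times more precise.
x1, v1, x2, v2 = 10.0, 1.0, 12.0, 4.0
y2, var2 = combine_two(x1, v1, x2, v2)
yg, varg = combine_general([x1, x2], [v1, v2])
print(y2, var2, yg, varg)
```

Both routes give a mean of 52/5 = 10.4 and a variance of 4/5 = 0.8 (up to rounding): the result lies closer to the more precise measurement, and its variance is smaller than either input variance.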
Approximation by error propagation method
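Suppose the dispersions $\sigma_i$ of our means
are small, i.e. the measurement errors $\Delta x_i$
in $x_i$ are small. The general formula for the error
in the weighted mean $y$ is in this case (provided the errors in the $x_i$
samples are completely independent):
$$ \Delta y = \sqrt{ \sum_i \Big( \frac{\partial y}{\partial x_i} \Big)^2 \Delta x_i^2 } . $$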
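Since $(\partial y/\partial x_i)\,\Delta x_i = w_i \Delta x_i = \big[ (1/\sigma_i^2) / \sum_j (1/\sigma_j^2) \big] \Delta x_i$, by substituting $\Delta x_i = \sigma_i$
we obtain
$$ \Delta y = \sqrt{ \frac{\sum_i 1/\sigma_i^2}{\big( \sum_j 1/\sigma_j^2 \big)^2} } = \frac{1}{\sqrt{\sum_j 1/\sigma_j^2}} = \Big( \sum_j \frac{1}{v_j} \Big)^{-1/2} , $$
which is the same as Eq. (5).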
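The error propagation result can also be reproduced numerically without writing the partial derivatives by hand. In the sketch below (function names, step size `h` and input values are assumptions made for illustration), the derivatives of the weighted mean are estimated by central differences with the weights held fixed, and the propagated error is compared with the closed form $(\sum_j 1/v_j)^{-1/2}$.

```python
import math


def weighted_mean(xs, sigmas):
    """Weighted mean with weights proportional to 1/sigma_i^2."""
    inv = [1.0 / s**2 for s in sigmas]
    return sum(x * i for x, i in zip(xs, inv)) / sum(inv)


def propagated_error(xs, sigmas, h=1e-6):
    """sqrt(sum((dy/dx_i * sigma_i)^2)), derivatives by central differences."""
    total = 0.0
    for i, s in enumerate(sigmas):
        up = list(xs)
        up[i] += h
        dn = list(xs)
        dn[i] -= h
        dydx = (weighted_mean(up, sigmas) - weighted_mean(dn, sigmas)) / (2 * h)
        total += (dydx * s) ** 2
    return math.sqrt(total)


# Illustrative values only (assumptions, not taken from the text).
xs = [10.0, 12.0, 11.0]
sigmas = [1.0, 2.0, 0.5]
dy_numeric = propagated_error(xs, sigmas)
dy_closed = sum(1.0 / s**2 for s in sigmas) ** -0.5
print(dy_numeric, dy_closed)
```

Because $y$ is linear in the $x_i$ once the weights are fixed, the finite-difference derivatives are exact up to rounding, so the two printed values should coincide to many digits.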
— K.M. Borkowski
Posted: May 29, 2007; last updated: June 11, 2007
File translated from TeX by TTH, version 3.77, on 25 May 2007.