10 The statistics of multiple paleointensity estimates

Averaging and weighting

Statistic: N

The number of paleointensity estimates to be analyzed.

 

Statistic: $B_j$

The value of the $j$th paleointensity estimate, where $j = 1$ to $N$.

 

Statistic: m
Report to 1 d.p.

The arithmetic mean of the $N$ paleointensity estimates. $m = \frac{\sum_{j=1}^{N} B_j}{N}$

 

Statistic: s
Report to 1 d.p.

The standard deviation of the $N$ paleointensity estimates. $s = \left( \frac{\sum_{j=1}^{N} (B_j - m)^2}{N-1} \right)^{\frac{1}{2}}$
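As a minimal sketch of the two definitions above, the following Python computes the arithmetic mean and the sample standard deviation (with N − 1 in the denominator). The function name `mean_and_std` and the example values are hypothetical and used only for illustration.

```python
import math

def mean_and_std(estimates):
    """Return (m, s) for a sequence of paleointensity estimates B_j."""
    n = len(estimates)
    if n < 2:
        raise ValueError("need at least two estimates to compute s")
    m = sum(estimates) / n
    # Sample standard deviation: the sum of squared deviations is divided by N - 1.
    s = math.sqrt(sum((b - m) ** 2 for b in estimates) / (n - 1))
    return m, s

# Hypothetical estimates (e.g., in microtesla), for illustration only.
B = [38.2, 41.5, 40.1, 43.0]
m, s = mean_and_std(B)
print(f"m = {m:.1f}, s = {s:.1f}")  # reported to 1 d.p.
```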

 

Statistic: $m_w$
Report to 1 d.p.

The weighted mean of the $N$ paleointensity estimates. $m_w = \frac{\sum_{j=1}^{N} W_j B_j}{\sum_{j=1}^{N} W_j}$, where $W_j$ is the weight on the $j$th paleointensity estimate.

 

Statistic: $s_w$
Report to 1 d.p.

The weighted standard deviation of the $N$ paleointensity estimates (Heckert and Filliben, 2003). $s_w = \left( \frac{N \sum_{j=1}^{N} W_j (B_j - m_w)^2}{(N-1) \sum_{j=1}^{N} W_j} \right)^{\frac{1}{2}}$
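The weighted mean and weighted standard deviation can be sketched in the same way. This follows the expressions as written here, using N for the total number of estimates; the function name `weighted_mean_and_std` and the example data are hypothetical. With equal weights the results reduce to the unweighted mean and standard deviation, which makes a convenient sanity check.

```python
import math

def weighted_mean_and_std(estimates, weights):
    """Return (m_w, s_w) for estimates B_j with weights W_j."""
    n = len(estimates)
    if n != len(weights):
        raise ValueError("estimates and weights must have the same length")
    if n < 2:
        raise ValueError("need at least two estimates to compute s_w")
    w_sum = sum(weights)
    if w_sum <= 0:
        raise ValueError("weights must sum to a positive value")
    m_w = sum(w * b for w, b in zip(weights, estimates)) / w_sum
    # N * sum(W_j (B_j - m_w)^2) / ((N - 1) * sum(W_j)), then the square root.
    weighted_ss = sum(w * (b - m_w) ** 2 for w, b in zip(weights, estimates))
    s_w = math.sqrt(n * weighted_ss / ((n - 1) * w_sum))
    return m_w, s_w

# Hypothetical estimates and weights, for illustration only.
B = [38.2, 41.5, 40.1, 43.0]
W = [1.0, 2.0, 2.0, 1.0]
m_w, s_w = weighted_mean_and_std(B, W)
print(f"m_w = {m_w:.1f}, s_w = {s_w:.1f}")

# Equal weights should reproduce the unweighted statistics.
m_eq, s_eq = weighted_mean_and_std(B, [1.0] * len(B))
```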

Useful Note...
Several options are available to act as weights. Two that have been used in the literature are the quality and weighting factors, $q$ and $w$, respectively. Their use as weights, however, is not appropriate. As outlined in Section 3, $w \propto q$, and $q$ is itself a function of the Arai plot slope ($|b|$). Hence, both $q$ and $w$ are proportional to the paleointensity estimate. If $q$ or $w$ is used as a weight ($W_j$), then $W_j \propto B_j$ (i.e., higher paleointensity estimates will tend to have larger weights), which can bias the weighted mean toward higher values. Such dependencies should be carefully considered when choosing a statistic for weighting.
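The size of this bias can be illustrated with an idealized case. Assume, purely for illustration, that the weights are exactly proportional to the estimates, $W_j = cB_j$ with $c > 0$ and all $B_j > 0$ (real $q$ or $w$ values also depend on other factors). Using $\sum_{j=1}^{N} B_j^2 = (N-1)s^2 + Nm^2$, the weighted mean becomes

$$m_w = \frac{\sum_{j=1}^{N} B_j^2}{\sum_{j=1}^{N} B_j} = m + \frac{(N-1)\,s^2}{N\,m} \geq m,$$

so under this assumption the weighted mean exceeds the arithmetic mean whenever the estimates are not all identical, and the excess grows with the scatter.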

Measures of scatter

Statistic: δB(%)
Report to 1 d.p.

The standard deviation as a percentage of the mean value. Often referred to as the scatter. $\delta B(\%) = \frac{s}{m} \times 100$

 

Statistic: $\delta B_N(\%)$
Report to 1 d.p.

When dealing with small numbers of data (i.e., small $N$), both $m$ and $s$ are inherently uncertain, and these uncertainties propagate into measures of scatter. To account for this, Paterson et al. (2010a) proposed an adjustment to $\delta B(\%)$ to determine the upper 95% confidence limit ($\delta B_N(\%)$). Using this approach we can say, with 95% confidence, that the true scatter of the data is less than $\delta B_N(\%)$. This allows for a fairer comparison of data sets with different $N$. $\delta B_N(\%) = \left| \frac{\sqrt{N}}{t_{nc}\left(1-\alpha;\ (N-1);\ \frac{m\sqrt{N}}{s}\right)} \right| \times 100$, where $t_{nc}$ is the noncentral $t$ critical value for the $(1-\alpha)$ confidence level for $(N-1)$ degrees of freedom and with noncentrality parameter $\frac{m\sqrt{N}}{s}$.
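A minimal sketch of both scatter measures, assuming SciPy is available and taking the critical value to be the lower-tail quantile of the noncentral t distribution (probability 0.05 for 95% confidence). The function name `scatter_stats` and the summary values are hypothetical.

```python
import math
from scipy.stats import nct

def scatter_stats(m, s, n, confidence=0.95):
    """Return (delta_B, delta_B_N), both in percent."""
    delta_b = s / m * 100.0
    # Noncentrality parameter m * sqrt(N) / s, with N - 1 degrees of freedom.
    nc = m * math.sqrt(n) / s
    # scipy's ppf takes a cumulative (lower-tail) probability,
    # so 95% confidence corresponds to 1 - 0.95 = 0.05 here.
    t_crit = nct.ppf(1.0 - confidence, n - 1, nc)
    delta_b_n = abs(math.sqrt(n) / t_crit) * 100.0
    return delta_b, delta_b_n

# Hypothetical summary statistics, for illustration only.
delta_b, delta_b_n = scatter_stats(m=40.0, s=8.0, n=5)
print(f"dB = {delta_b:.1f}%, dB_N = {delta_b_n:.1f}%")
```

For a fixed ratio of s to m, the gap between the two values should narrow as N increases, which is the behaviour the adjustment is meant to capture.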

Numerical Tip...
Different software packages use different conventions for the input of $(1-\alpha)$ into the calculation of the noncentral $t$ critical value. For example, the MATLAB command nctinv() takes $\alpha = 0.95$, while other functions may use $\alpha = 0.05$. For $N-1 = 1$ and a noncentrality parameter of unity (1), the noncentral $t$ critical value at the 95% confidence level is $-1.193$.
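One way to check which convention a package uses is to reproduce the reference value quoted in this tip. The sketch below assumes SciPy, whose `nct.ppf` takes a cumulative (lower-tail) probability; passing 0.05 with one degree of freedom and a noncentrality parameter of 1 should return a value close to the quoted critical value.

```python
from scipy.stats import nct

# Lower-tail probability 0.05, df = N - 1 = 1, noncentrality parameter = 1.
t_crit = float(nct.ppf(0.05, 1, 1.0))
print(round(t_crit, 3))
```

If a function returns a large positive number instead, it is most likely being given the upper-tail probability.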

Statistical tests for scatter

Statistic: $p_{\delta B}$
Report to 3 d.p.

An alternative approach is to determine the probability that the scatter (i.e., $\delta B(\%)$) is less than some critical value, $\delta B_{max}$ (Paterson et al., 2010a). By adopting this approach, selection based on scatter can be performed as a statistical test, whereby we test the null hypothesis that our measured scatter is less than or equal to $\delta B_{max}$. The probability $p_{\delta B}$ that this is the case is given by $p_{\delta B} = F\left(\frac{\sqrt{N}}{\delta B_{max}};\ (N-1);\ \frac{m\sqrt{N}}{s}\right)$, where $F()$ is the noncentral $t$ cumulative distribution function and $\delta B_{max}$ is given as a fraction and not a percentage (e.g., 0.25 as opposed to 25%). If $p_{\delta B} \leq 0.05$ we cannot reject the null hypothesis that our measured scatter is less than or equal to $\delta B_{max}$ (at the 5% significance level). If, however, $p_{\delta B} > 0.05$ we can reject the null hypothesis and our measured scatter is most likely greater than $\delta B_{max}$. The two outlined approaches are identical, with $\delta B_N(\%)$ being the value of $\delta B_{max}$ that yields $p_{\delta B} = 0.05$.
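The test can be sketched with SciPy's noncentral t cumulative distribution function. The function name `p_delta_b` and the input values are hypothetical. The last lines illustrate the stated equivalence between the two approaches: feeding in the threshold derived from the 0.05 quantile should return a probability of approximately 0.05.

```python
import math
from scipy.stats import nct

def p_delta_b(m, s, n, delta_b_max):
    """Noncentral t CDF at sqrt(N) / delta_b_max (delta_b_max as a fraction)."""
    nc = m * math.sqrt(n) / s  # noncentrality parameter
    return float(nct.cdf(math.sqrt(n) / delta_b_max, n - 1, nc))

# Hypothetical summary statistics and a 25% scatter threshold.
m, s, n = 40.0, 8.0, 5
p = p_delta_b(m, s, n, delta_b_max=0.25)
print(f"p_dB = {p:.3f}")  # reported to 3 d.p.

# Consistency check: the threshold built from the 0.05 quantile
# should give back p close to 0.05.
t_crit = float(nct.ppf(0.05, n - 1, m * math.sqrt(n) / s))
p_at_limit = p_delta_b(m, s, n, delta_b_max=math.sqrt(n) / t_crit)
```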

Statistic: $p_s$
Report to 3 d.p.

Some studies prefer to select data using an absolute limit on the standard deviation ($s_{max}$) of an average paleointensity estimate, most notably when the estimate is low and the relative scatter may therefore be high. Given that, under the assumption of normality, the estimated variance follows a chi-squared distribution, the probability that $s$ is less than or equal to $s_{max}$ is given by $p_s = F_{\chi^2}\left(\frac{(N-1)\,s_{max}^2}{s^2};\ (N-1)\right)$, where $F_{\chi^2}()$ is the chi-squared cumulative distribution function with $N-1$ degrees of freedom. If $p_s \leq 0.05$ we cannot reject the null hypothesis that our measured scatter is less than or equal to $s_{max}$ (at the 5% significance level). If, however, $p_s > 0.05$ we can reject the null hypothesis and our measured scatter is most likely greater than $s_{max}$.
It should be noted that, in cases where $\delta B_{max} = \frac{s_{max}}{m}$, $p_s$ is always less than $p_{\delta B}$. This is due to the fact that $p_{\delta B}$ accounts for sample-size-related uncertainty in both $m$ and $s$, but $p_s$ accounts for sample size uncertainty in only $s$.
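A minimal sketch of this statistic, assuming SciPy and evaluating the chi-squared cumulative distribution function with the argument exactly as it is written in this section. The function name `p_s` and the example values are hypothetical.

```python
from scipy.stats import chi2

def p_s(s, n, s_max):
    """Chi-squared CDF at (N - 1) * s_max^2 / s^2 with N - 1 degrees of freedom."""
    if s <= 0 or s_max <= 0:
        raise ValueError("s and s_max must be positive")
    x = (n - 1) * s_max ** 2 / s ** 2
    return float(chi2.cdf(x, n - 1))

# Hypothetical values: N = 5 estimates, s = 8.0, absolute limit s_max = 8.0.
p = p_s(s=8.0, n=5, s_max=8.0)
print(f"p_s = {p:.3f}")  # reported to 3 d.p.
```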

 
