Methodology

 

We started by plotting the measurements over time at several sites.  It was visually apparent that some sites showed dramatic increases and others, dramatic decreases.  To capture this phenomenon, we used three statistical tests: 

1. T-test comparing mean measurement values before and after a fixed cutoff date;

2. Sen’s nonparametric estimate of slope;

3. Mann-Kendall nonparametric test for trend.

 

1) T-test

At each site, for each variable, we computed the mean value before and after a fixed cutoff date and applied a two-sided T-test for equality of means that does not assume equal variances in the two populations. The cutoff date was the same for all sites.

 

The choice of cutoff date was, of course, important. We tried to choose a date that would minimize the effect of known seasonal variations. Soil vapor measurements are affected by both temperature and soil moisture, and thus fluctuate seasonally. We elected to compare measurements before and after July 1, 1998, one year before the latest data available for this preliminary report. By including at least one year of data in each population, we hoped to minimize the effect of seasonal variations. Some trial calculations convinced us that the results described below are, in fact, relatively insensitive to the choice of cutoff date.

 

At each site with at least two measurements before and at least two measurements after the cutoff date, we computed the number, the mean, and the variance of the measurements separately for the periods before and after the cutoff date. Treating the two sets of measurements as samples from separate populations with unequal variances, we used a two-sided T-test to compute the p-value for the null hypothesis of equal means.

 

Before applying the T-test, we adjusted the data to compensate for a change in measuring instrument. For reasons explained below, it also seemed advisable to try applying the T-test to the logarithm of one variable; however, this transformation did not significantly affect the results.
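
The unequal-variance comparison described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names `split_at_cutoff` and `welch_t` are hypothetical, the treatment of a reading taken exactly on the cutoff date (counted as "after") is an assumption, and the instrument adjustment is not modeled. The sketch returns the test statistic and the Welch-Satterthwaite degrees of freedom; the two-sided p-value would then be read from Student's t distribution with that many degrees of freedom.

```python
from datetime import date
from math import sqrt
from statistics import mean, variance

CUTOFF = date(1998, 7, 1)  # cutoff date stated in the text


def split_at_cutoff(observations, cutoff=CUTOFF):
    """Split (date, reading) pairs into readings before and after the cutoff.

    Assumption: a reading taken on the cutoff date itself counts as "after".
    """
    before = [r for d, r in observations if d < cutoff]
    after = [r for d, r in observations if d >= cutoff]
    return before, after


def welch_t(before, after):
    """Return (t, df) for a two-sample T-test with unequal variances."""
    n1, n2 = len(before), len(after)
    if n1 < 2 or n2 < 2:
        raise ValueError("need at least two readings on each side of the cutoff")
    v1, v2 = variance(before), variance(after)  # sample variances (n - 1 divisor)
    se2 = v1 / n1 + v2 / n2  # squared standard error of the difference in means
    if se2 == 0:
        raise ValueError("both groups are constant; the statistic is undefined")
    t = (mean(before) - mean(after)) / sqrt(se2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df
```

If SciPy is available, `scipy.stats.ttest_ind(before, after, equal_var=False)` performs the same unequal-variance test and also returns the two-sided p-value.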

 

2) Sen’s nonparametric estimator

At each site, for each variable, we listed all the pairs of dates for which a measurement was available.  For every pair of dates, we computed the slope in units per day as

 

slope = (reading2 - reading1) / (date2 - date1).

 

Sen’s estimator is the median of all such slopes. We computed approximate two-sided 95% and 99% confidence intervals, using a variant of the procedure given in [Gilbert, 1987]. The procedure assumes approximate normality. We tried to ensure this by computing Sen’s estimator only for sites where ten or more observations were available. Our variant of Gilbert’s procedure was to assume that all tied groups had size exactly two. The net effect of this assumption is conservative; some of the confidence intervals may be wider than necessary. On the other hand, the assumption is probably reasonable, since the precision of the instrument is high enough that three or more exactly identical readings at different times are unlikely.
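
The point estimate described above (the median of all pairwise slopes) can be sketched as follows. This is an illustrative sketch only: the function name `sen_slope` is hypothetical, observations are assumed to be (date, reading) pairs, and pairs of readings taken on the same date are skipped because their slope is undefined. The confidence-interval procedure from [Gilbert, 1987] and the ten-observation minimum are not reproduced here.

```python
from datetime import date
from itertools import combinations
from statistics import median


def sen_slope(observations):
    """Return the median of all pairwise slopes, in reading units per day.

    `observations` is an iterable of (date, reading) pairs.
    """
    obs = sorted(observations)  # order by date so that date2 >= date1
    slopes = [
        (r2 - r1) / (d2 - d1).days
        for (d1, r1), (d2, r2) in combinations(obs, 2)
        if d2 != d1  # assumption: same-day pairs are skipped
    ]
    if not slopes:
        raise ValueError("need readings on at least two distinct dates")
    return median(slopes)
```

Because the estimate is a median, a few outlying readings change it far less than they would change a least-squares slope.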

 

Sen’s estimator gives a numerical estimate of the rate of increase or decrease, in units per day, at a site. We recorded this estimate but did not use it in the current study. We report only the number of sites where the 99% and 95% Sen confidence intervals do not contain zero (i.e., the sites where there is strong evidence of a trend).

 

3) Mann-Kendall test for trend

The Mann-Kendall statistic is similar to Sen’s estimator. At each site, for each pair of dates (with date2 later than date1), we computed the slope as above but considered only its sign:

 

Sign of slope =
    +1   if   reading2 > reading1;
     0   if   reading2 = reading1;
    -1   if   reading2 < reading1.

 

The Mann-Kendall statistic is the sum of the signs of all such slopes. Using a procedure in [Gilbert, 1987], we computed standardized scores and two-tailed confidence intervals for the Mann-Kendall statistic at each site. These standardized scores can be used to compute a chi-square test for homogeneity of trend across multiple sites; that test was of little use in this case, since little spatial homogeneity was apparent. Computation of standardized scores assumes approximate normality, which, as with Sen’s estimator, we attempted to ensure by considering only sites with ten or more measurements.
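
The Mann-Kendall statistic and its standardized score can be sketched as follows. This is a minimal sketch under stated assumptions, not the study's code: the function name `mann_kendall` is hypothetical, readings are assumed to be supplied in date order, the variance uses the standard correction for the actual sizes of tied groups, and the standardized score uses the usual continuity correction of one unit toward zero. The ten-measurement minimum and the chi-square homogeneity test are not included.

```python
from collections import Counter
from math import sqrt


def mann_kendall(readings):
    """Return (S, Var(S), Z) for readings listed in date order."""
    n = len(readings)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = readings[j] - readings[i]
            s += (diff > 0) - (diff < 0)  # +1, 0, or -1
    # Variance of S under the no-trend hypothesis, corrected for ties;
    # a group of size 1 contributes zero to the correction term.
    tie_sizes = Counter(readings).values()
    var_s = (n * (n - 1) * (2 * n + 5)
             - sum(t * (t - 1) * (2 * t + 5) for t in tie_sizes)) / 18
    # Standardized score with continuity correction
    if s > 0:
        z = (s - 1) / sqrt(var_s)
    elif s < 0:
        z = (s + 1) / sqrt(var_s)
    else:
        z = 0.0
    return s, var_s, z
```

Under the approximate-normality assumption, a value of |Z| above about 1.96 corresponds to a two-tailed test at the 5% level, and above about 2.58 to the 1% level.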