Is there a simple guideline or tool for combining multiple independent surveys into one result that represents a wider geographic area? In a given province, five surveys (each 30-by-30) were conducted. For the provincial-level analysis, we want to combine them, weighting for differences in district population size.
You have to be careful doing this as you may end up hiding variation behind a rather meaningless average. It is almost always more useful to present per-district results if you can (a map is best as there may be some clear spatial pattern that will not be so clear in a table) rather than a single wider-area average. I think you could do both (i.e. present per-district estimates and then a per-province summary estimate).
The first thing to do is to check that it makes sense to combine all the surveys in order to give a single result. This is only the case when the estimates from each survey are similar to each other. This can be as simple as a visual check using a "forest plot" of estimates and 95% CIs.
Here is an example of similarity:
Survey 1    |-----*-----------|
Survey 2    |-----*----------|
Survey 3    |------*------------|
...
Survey N    |-----*-----------|
         +--+--+--+--+--+--+--+--+--+--+--+--+--+
         8  9  10 11 12 13 14 15 16 17 18 19 20 21
                      Prevalence (%)
Note that the point estimates (marked by the "*") are close to each other and there is a lot of overlap of the 95% CIs. In this case an average will have meaning.
Here is an example of dissimilarity:
Survey 1    |-----*--------|
Survey 2                |-----*----------|
Survey 3          |------*---------|
...
Survey N                             |---*------|
         +--+--+--+--+--+--+--+--+--+--+--+--+--+
         8  9  10 11 12 13 14 15 16 17 18 19 20 21
                      Prevalence (%)
Note that the point estimates are widely spread and some of the CIs do not overlap much or at all. In this case an average will hide variation and would best be avoided.
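If you want to make this check quickly on your own results, a rough sketch in Python is given here. It draws a crude text forest plot and flags pairs of surveys whose 95% CIs do not overlap at all. The survey values, the function names and the axis limits are all made up for illustration; they are not taken from any real survey.

```python
def forest_row(label, lcl, p, ucl, lo=5.0, hi=20.0, width=46):
    """Build one text row: '|' at the CI limits, '*' at the point estimate."""
    def col(x):
        # Map a prevalence (%) to a character column, clamped to the axis.
        frac = (x - lo) / (hi - lo)
        return max(0, min(width - 1, round(frac * (width - 1))))

    cells = [" "] * width
    for c in range(col(lcl), col(ucl) + 1):
        cells[c] = "-"
    cells[col(lcl)] = "|"
    cells[col(ucl)] = "|"
    cells[col(p)] = "*"
    return f"{label:<10}" + "".join(cells).rstrip()


def intervals_overlap(a, b):
    """True when two (LCL, UCL) intervals share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical prevalences in percent: (label, LCL, estimate, UCL).
surveys = [
    ("Survey A", 9.0, 12.0, 15.5),
    ("Survey B", 8.5, 11.5, 15.0),
    ("Survey C", 16.0, 18.5, 20.0),
]

for label, lcl, p, ucl in surveys:
    print(forest_row(label, lcl, p, ucl))

# Flag pairs of surveys whose 95% CIs do not overlap at all.
for i in range(len(surveys)):
    for j in range(i + 1, len(surveys)):
        a, b = surveys[i], surveys[j]
        if not intervals_overlap((a[1], a[3]), (b[1], b[3])):
            print(f"{a[0]} and {b[0]}: CIs do not overlap")
```

Non-overlap is only a rough screen. It does not replace looking at the plot (or a map) and judging whether an average would be meaningful.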
The pooled proportion is a population-weighted average of the proportions found by each survey:
                    p1 * w1 + p2 * w2 + ... + pn * wn
Pooled proportion = ---------------------------------
                           w1 + w2 + ... + wn
where:
p1 = proportion from survey 1
p2 = proportion from survey 2
.
. and so-on
.
w1 = population in area for survey 1
w2 = population in area for survey 2
.
. and so-on
.
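The weighted average can be sketched in a few lines of Python. The function name and the two-area example values are hypothetical and only there to show the arithmetic.

```python
def pooled_proportion(proportions, populations):
    """Population-weighted average of per-survey proportions."""
    if len(proportions) != len(populations):
        raise ValueError("need one population weight per proportion")
    total = sum(populations)
    return sum(p * w for p, w in zip(proportions, populations)) / total


# Hypothetical example: 10% in an area of 1,000 people, 20% in an area of 3,000.
# (0.10 * 1000 + 0.20 * 3000) / 4000 = 0.175
print(pooled_proportion([0.10, 0.20], [1000, 3000]))
```

The larger area pulls the pooled figure towards its own estimate, which is the point of weighting by population.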
Complications arise when trying to pool variances. This is because the survey samples are complex and the variance is influenced by the proportion, the sample size, and the survey design effect.
One way to approach this problem is to calculate the standard error (SE) from the estimates and 95% CIs reported from each survey:
     Upper Confidence Limit - Lower Confidence Limit
SE = -----------------------------------------------
                       2 * 1.96
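As a small sketch, the same back-calculation in Python. The function name is my own, and the example interval is hypothetical. Note that this treats the reported interval as symmetric about the estimate, so it is only an approximation when the survey software reported an asymmetric interval.

```python
def se_from_ci(lcl, ucl, z=1.96):
    """Approximate SE recovered from the width of a reported 95% CI."""
    return (ucl - lcl) / (2 * z)


# Hypothetical interval of 10% to 18%: (0.18 - 0.10) / 3.92
print(round(se_from_ci(0.10, 0.18), 4))
```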
The pooled SE is:
                ( SE1^2 * w1 + SE2^2 * w2 + ... + SEn^2 * wn )
Pooled SE = sqrt( ------------------------------------------ )
                (             w1 + w2 + ... + wn             )
where:
SE1 = SE for survey 1
SE2 = SE for survey 2
.
. and so-on
.
w1 = population in area for survey 1
w2 = population in area for survey 2
.
. and so-on
.
The pooled estimate is:
Pooled estimate = Pooled proportion +/- 1.96 * Pooled SE
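Here is a minimal Python sketch of the pooled SE and the interval built from it. The first function follows the formula exactly as written above. The second, stratified_se, is NOT the formula given above: it is the usual textbook SE for a weighted mean of independent estimates, included only for comparison. All function names and the example values are hypothetical.

```python
import math


def pooled_se(ses, populations):
    """Pooled SE as written above: sqrt of the population-weighted mean of SE^2."""
    total = sum(populations)
    return math.sqrt(sum(se ** 2 * w for se, w in zip(ses, populations)) / total)


def pooled_interval(pooled_p, se, z=1.96):
    """Pooled proportion +/- z * SE, returned as (LCL, UCL)."""
    return pooled_p - z * se, pooled_p + z * se


def stratified_se(ses, populations):
    """Comparison only (not the formula above): SE of a weighted mean of
    independent estimates, sqrt(sum((w / W)^2 * SE^2))."""
    total = sum(populations)
    return math.sqrt(sum((w / total) ** 2 * se ** 2 for se, w in zip(ses, populations)))


# Hypothetical example: two surveys, each with SE = 0.02.
ses, pops = [0.02, 0.02], [1000, 3000]
print(pooled_se(ses, pops))
print(stratified_se(ses, pops))
print(pooled_interval(0.175, pooled_se(ses, pops)))
```

Because each share w / W is below one, the comparison SE can never be larger than the pooled SE from the formula above, so the formula above gives the wider (more cautious) interval of the two.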
Here is an example with three surveys only ... the survey results are:
Survey     Population     p      LCL     UCL
--------   ----------   -----   -----   -----
Survey 1       23,670   12.7%    9.7%   16.1%
Survey 2       16,546    9.9%    6.3%   13.2%
Survey 3       19,201   13.5%    9.8%   18.0%
--------   ----------   -----   -----   -----
The pooled proportion is:
Survey        w       p     p * w
--------   ------   -----   -----
Survey 1   23,670   0.127   3,006
Survey 2   16,546   0.099   1,638
Survey 3   19,201   0.135   2,592
--------   ------   -----   -----
Sum        59,417           7,236
Pooled proportion = 7236 / 59417 = 0.122
The pooled SE is:
Survey        w      LCL     UCL     SE       SE^2     SE^2 * w
--------   ------   -----   -----   -----   --------   ---------
Survey 1   23,670   0.097   0.161   0.016   0.000256    6.059520
Survey 2   16,546   0.063   0.132   0.018   0.000324    5.360904
Survey 3   19,201   0.098   0.180   0.021   0.000441    8.467641
--------   ------   -----   -----   -----   --------   ---------
Sum        59,417                                      19.888065
Pooled SE = sqrt(19.888065 / 59417) = 0.0183
The pooled estimate is:
Point estimate = 0.122
95% LCL = 0.122 - 1.96 * 0.0183 = 0.086
95% UCL = 0.122 + 1.96 * 0.0183 = 0.158
or 12.2% (95% CI = 8.6% - 15.8%).
Important ...
(1) Someone should check my thinking and my arithmetic.
(2) When you do these sorts of calculations you should work at full precision throughout and only round at the end. I did not do this above, so there will be some accumulated rounding error in the final result.
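For anyone who wants to redo the worked example without intermediate rounding, here is a short Python sketch. It uses the populations, proportions and confidence limits as they appear in the working tables of this post, and the pooled SE formula given in this post. The variable names are my own.

```python
import math

# Values as used in the working tables of this post: (population, p, LCL, UCL).
surveys = [
    (23670, 0.127, 0.097, 0.161),
    (16546, 0.099, 0.063, 0.132),
    (19201, 0.135, 0.098, 0.180),
]
Z = 1.96

total_pop = sum(w for w, _, _, _ in surveys)
pooled_p = sum(w * p for w, p, _, _ in surveys) / total_pop

# SE of each survey from its CI width, kept unrounded, then pooled with the
# population-weighted formula given in the post.
weighted_var = sum(w * ((ucl - lcl) / (2 * Z)) ** 2 for w, _, lcl, ucl in surveys)
pooled_se = math.sqrt(weighted_var / total_pop)

lcl = pooled_p - Z * pooled_se
ucl = pooled_p + Z * pooled_se
print(f"Pooled proportion = {pooled_p:.4f}")
print(f"Pooled SE         = {pooled_se:.4f}")
print(f"95% CI            = {lcl:.4f} to {ucl:.4f}")
```

Rounded to three decimal places this gives the same pooled proportion and confidence limits as the hand calculation, so the rounding error in this particular example is small. That will not always be the case.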
I hope this is of some use.
Mark Myatt
Technical Expert
Answered:
9 years ago
Dear Mark,
Thank you very much for the detailed response.
Anonymous
Answered:
9 years ago