The current approach is to run a single two-stage cluster survey per agro-ecological zone. If a given district has two agro-ecological zones, there should be two independent surveys, because the people living in those areas are believed to be heterogeneous. FAO defines an agro-ecological zone as "A land resource mapping unit, defined in terms of climate, landform and soils, and/or land cover, and having a specific range of potentials and constraints for land use". Recently, however, the term "agro-ecological zone" is being replaced by "livelihood zone", and one agro-ecological zone may contain more than one livelihood zone. Nowadays it is difficult to do a rapid two-stage cluster survey in one district or province, given the criticism around livelihood zones. Doing a separate survey in each livelihood zone (tiny villages) costs a lot of time and money. So, is there a method for doing one single survey across multiple livelihood zones by making adjustments such as increasing the sample size, increasing the number of clusters, increasing the design effect and the like? The SMART guidelines recommend adjusting the design effect, but it is not clear how.
The idea behind needing different surveys in different "agro-ecological zones" ("livelihood zones", "food-economy zones" &c.) is a concern about how you would interpret the results of a survey. Imagine a world divided into two equal populations of "slum-dwellers" and "palace-dwellers". All of the children of the slum-dwellers are underweight and none of the children of the palace-dwellers are underweight. Survey both populations in one survey and the prevalence of underweight found will be close to 50%. The 50% is correct, but there is nowhere you can go in this imagined world where the prevalence is anywhere near 50%. The only clue that you get from the survey results that this is the case is a large design effect (at least you have this in a cluster-sampled design ... with a simple random sample you have no clue). The problem is that you have two populations treated as one. There is no way that you can survey two populations with very different prevalences in a single survey and get a meaningful survey result, regardless of how you fiddle with design effects. What you need is two surveys, and each survey should have a sufficient sample size in terms of number of clusters and size of clusters (assuming a two-stage cluster sampled design). That means you need about 30 clusters (you can probably get away with fewer but not many fewer) and an overall sample size of four or five hundred in each survey (if you're looking at wasting, and more if you're looking at underweight or stunting, as prevalence will be closer to 50%). It might make sense to survey two populations in a single survey if you had good reasons to believe that the two populations were similar with regard to prevalence. In that case the survey result will be close to the true prevalences in both populations. This "homogeneity assumption" is implicit in all wide-area single-estimate surveys, and it is a big problem with all of them.
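The slum/palace example above can be sketched numerically. This is only an illustration: the 30 clusters of 15 children are invented numbers, `design_effect` is a hypothetical helper, and the design effect is approximated as the variance of the cluster-sample estimate (from between-cluster variation, equal-sized clusters) divided by the variance a simple random sample of the same size would give.

```python
def design_effect(cases_per_cluster, cluster_size):
    """Return (prevalence, approximate design effect) for equal-sized clusters.

    Assumes 0 < prevalence < 1, otherwise the simple-random-sample
    variance is zero and the ratio is undefined.
    """
    k = len(cases_per_cluster)                      # number of clusters
    n = k * cluster_size                            # total children surveyed
    props = [c / cluster_size for c in cases_per_cluster]
    p = sum(props) / k                              # overall prevalence
    between = sum((x - p) ** 2 for x in props) / (k - 1)
    var_cluster = between / k                       # variance of the cluster-sample estimate
    var_srs = p * (1 - p) / n                       # variance under simple random sampling
    return p, var_cluster / var_srs


# Hypothetical survey: 15 clusters fall in the slums (every child underweight)
# and 15 fall in the palaces (no child underweight).
slum_palace = [15] * 15 + [0] * 15
prevalence, deff = design_effect(slum_palace, cluster_size=15)
print(f"prevalence = {prevalence:.2f}, design effect = {deff:.1f}")
```

The pooled prevalence comes out at one half while the design effect is close to the cluster size, which is about as large as it can get. The large design effect is the warning sign; it does not make the 50% figure any more useful.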
The assumption of homogeneity of prevalence is problematic even in a single "agro-ecological zone". Wasting, for example, is strongly associated with infection and infection, by definition, is a clustered phenomenon. This means that we can expect wasting to be clustered (heterogeneous). I think that we should be looking to develop methods that can accommodate clustering, as is done (e.g.) for infectious diseases such as trachoma, onchocerciasis, and filariasis. Such methods would produce a prevalence map showing where prevalence is likely to be high and where prevalence is likely to be low. This is not a straightforward problem. I'd guess that such a method would use a spatial sample in the first stage (or use multiple sampling stages to search for prevalence boundaries or contours) and may use continuous data and a relevant probability distribution to estimate prevalence (as suggested by Mike Golden) or sequential sampling techniques to classify prevalence at each sampling point. Proximity sampling would need to be abandoned, but sampling methods such as EPI3 and EPI5 might do the job and are not difficult or time consuming to use. This may seem difficult, but such rapid epidemiological mapping methods for wasting were used in the Biafran Emergency at the end of the 1960s.
Mark Myatt
Technical Expert

Answered:

15 years ago
Mark is correct. The point about design effect is important though. IF the population is homogeneous, one can use a much lower design effect in estimating the sample size needed. Analysis of camp data, for example, gives design effects from 1 to about 1.5. For non-camp data the design effects are between 1.2 and 1.7 where we think the population is homogeneous, and this goes up to 2 or more when the population is heterogeneous. I have not yet finished this analysis of design effects, so take these figures as very preliminary. The point is that the TOTAL number of children surveyed in two surveys with small design effects is not much more than in one survey with a large design effect, so the work in 2 surveys is not twice that of one more wide-ranging survey. In SMART beta, we have built in a function to examine the heterogeneity. The number of cases in each of the clusters is examined and compared with a Poisson distribution. If the numbers fit the distribution, this indicates that the cases are randomly distributed between the clusters examined and there are no "pockets" of malnutrition. If they do not fit this distribution, then there are likely to be "pockets". When there are pockets, the data normally fit another type of distribution called a negative binomial distribution (also calculated and fitted in SMART beta), which can show the degree of "pocketing". We find that wasting in most surveys is randomly distributed, but kwashiorkor nearly always occurs in pockets. This has important implications epidemiologically for the two conditions, but it also means that the confidence intervals around a kwash estimate are quite different from those around a wasting estimate. They are combined in SAM prevalence reporting, which is probably wrong; they should be reported separately with their separate confidence intervals.
In Mark's slum and palace population this would be clearly shown up when the Poisson distribution was examined (and it would not fit either distribution!). This is not the place to enter into an argument, as I need to document and argue my case in a complete paper, but, contrary to the usual assumption, I do not think that wasting and ACUTE infections such as diarrhoea and pneumonia are closely related, and stunting probably not at all.
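The idea of comparing cases per cluster with a Poisson distribution can be sketched with the variance-to-mean ratio (index of dispersion). This is not the SMART beta implementation, which is not described here. It is a minimal stand-in that assumes equal-sized clusters, and all of the counts are invented for illustration.

```python
from statistics import mean, variance


def dispersion_index(cases_per_cluster):
    """Sample variance divided by mean of the cluster counts.

    Roughly 1 when cases are scattered at random (Poisson-like);
    well above 1 when cases sit in "pockets".
    """
    return variance(cases_per_cluster) / mean(cases_per_cluster)


def dispersion_statistic(cases_per_cluster):
    """(k - 1) * variance / mean, which is approximately chi-square
    with k - 1 degrees of freedom if the counts are Poisson."""
    k = len(cases_per_cluster)
    return (k - 1) * dispersion_index(cases_per_cluster)


# Hypothetical counts of cases in 30 clusters (invented data).
scattered = [2, 1, 3, 2, 0, 1, 2, 4, 1, 2, 3, 1, 2, 0, 2,
             1, 3, 2, 1, 2, 2, 3, 1, 0, 2, 1, 4, 2, 1, 2]
pocketed = [0, 0, 9, 0, 1, 0, 0, 12, 0, 0, 0, 7, 0, 0, 1,
            0, 0, 10, 0, 0, 0, 0, 8, 0, 0, 1, 0, 0, 6, 0]
slum_palace = [15] * 15 + [0] * 15   # Mark's imagined world, clusters of 15

for name, counts in [("scattered", scattered),
                     ("pocketed", pocketed),
                     ("slum/palace", slum_palace)]:
    print(name, round(dispersion_index(counts), 2),
          round(dispersion_statistic(counts), 1))
```

With 30 clusters the statistic has 29 degrees of freedom, for which the upper 5% point of the chi-square distribution is about 42.6. A statistic far above that suggests pockets. Fitting a negative binomial to measure the degree of pocketing would be a further step that is not attempted here.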
Michael Golden

Answered:

15 years ago
Mike is correct in drawing attention to the design effect. You need to make a good guess at the expected design effect if your survey is to have the desired precision (i.e. width of the confidence interval around the point estimate). My experience of two-stage cluster sample surveys of wasting in many settings tallies with Mike's in that the majority of design effects lie between about 1.2 and 1.7. In applications such as eye surveys (these are usually two-stage cluster sampled surveys of large sample size estimating the prevalence of some quite rare conditions) it is not uncommon to find design effects of 5 or higher with infectious conditions such as trachoma, which clusters at all sorts of scales, or vector-transmitted diseases such as onchocerciasis, which clusters in locations favourable to the vector. This is not usually a problem since (1) such surveys also collect data on weakly clustered or non-clustered conditions (e.g. cataract, glaucoma) and (2) the clustered conditions (should any cases be found in the eye survey) are usually subjects of subsequent spatial or small-area surveys such as TRA / ASTRA for trachoma and REMO for onchocerciasis. Design effect is a little more complicated than that though. Having many small clusters will tend to reduce design effects compared to the case of having many large clusters. This might be the way to go for the questioner's problem. Instead of taking (e.g.) 30 clusters of 30 children you could take 60 clusters of 15 children. This way you will probably have a sufficient number of clusters in both zones to arrive at reasonably accurate and precise estimates of prevalence in both of them. You would need to be careful if the zones differ greatly with regard to community size, as the PPS method might only select a small number of clusters in the zone with smaller communities. Let us not get quarrelsome about infection and wasting.
Many surveys collect period prevalence data and strong associations between wasting and diarrhoea and fever are quite common for both W/H and MUAC case-definitions. The problem with this data is that it is cross-sectional. The problem with investigating risk factors for stunting is that it is a process rather than an outcome and requires a cohort study approach. For stunted children, a case-control approach could be adopted but recall might be a problem.
Mark Myatt
Technical Expert

Answered:

15 years ago
Oops ... I meant to write "Having many small clusters will tend to reduce design effects compared to the case of having FEW large clusters". Sorry for the confusion.
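The effect of cluster size on design effect can be sketched with the common approximation DEFF = 1 + (m - 1) x ICC, where m is the number of children per cluster and ICC is the intra-cluster correlation. The ICC of 0.05, the 10% prevalence, and the +/- 3% precision below are assumptions chosen only for illustration, and the function names are hypothetical.

```python
import math


def deff_from_icc(cluster_size, icc):
    """Approximate design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


def total_sample_size(p, half_width, deff, z=1.96):
    """Children needed to estimate prevalence p to within +/- half_width
    at about 95% confidence, inflated by the design effect."""
    return math.ceil(deff * z ** 2 * p * (1 - p) / half_width ** 2)


ASSUMED_ICC = 0.05   # illustrative value, not taken from any survey

deff_30x30 = deff_from_icc(30, ASSUMED_ICC)   # 30 clusters of 30 children
deff_60x15 = deff_from_icc(15, ASSUMED_ICC)   # 60 clusters of 15 children

for label, deff in [("30 x 30", deff_30x30), ("60 x 15", deff_60x15)]:
    n = total_sample_size(p=0.10, half_width=0.03, deff=deff)
    print(f"{label}: design effect {deff:.2f}, children needed {n}")
```

Both designs measure 900 children, but under these assumptions the design with more, smaller clusters has the lower design effect and so gives a narrower confidence interval for the same total.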
Mark Myatt
Technical Expert

Answered:

15 years ago