I'm just back from Ethiopia and, speaking with some key people in the relevant government departments, noticed a great emphasis being placed on posterior (i.e. after-survey) checks on data quality. These checks are done (or evaluated) at central government level and are used as one deciding factor as to whether the survey results may be published and / or used as a basis for planning and action. While I welcome initiatives on the quality of data collected in assessments, I feel that the emphasis on posterior checks is a little misplaced since it does not seem to be accompanied by similar attention to the definition of populations, the validity of sampling frames, &c. One issue that seemed to be given particular emphasis is "digit preference" in the recording of height and length measurements. This seems to have come about because it is now included in the SMART software rather than because it is an overwhelmingly important quality-checking method. I may be wrong on this. Anyway ... here come the questions ... I would have thought that unless digit preference is systematic, in the sense that all measurements are rounded up or down to the whole centimetre or the half-centimetre, then no bias will be introduced in the prevalence estimate. I understand that systematic rounding may be encouraged by the design of height boards. Is there any evidence that this is the case? Is there any evidence of introduced bias? Is W/H really so sensitive that 2 mm here or there will radically alter results?
First, of course planning to ensure sound sampling frames, population definitions, etc., and execution of the plan are critical. A posteriori testing does not change this. However, prior knowledge that the survey results will be examined after collection and analysis has made a major difference to the way training and surveys are supervised and conducted. Digit preference is only one of several data quality checks that are automated in SMART.

Second, you have made a common erroneous assumption - that random error does not change the result because those measurements that move up balance those that move down. Indeed, the mean is not changed - BUT random error does make a difference to the observed prevalence. Totally random error in making measurements affects the spread of results: it increases the SD and inflates the number in the tails (and hence below a cut-off point - and it is worse the further the cut-off point is into the tail). This is because with a random error subjects can move up or down the distribution (assume equal numbers move up and down). The number that move from the tail to a more central part of the distribution are far fewer than those that move from a central position into the tail (proportional to the area under the curve at any point).

I tested this by imposing a random error on height with a mean of 0 cm and an SD of 1 cm (the imposed error was normally distributed). Similarly for weight, the mean error imposed was 0 g with an SD of 100 g (the precision of many scales). The prevalence of SAM changed from 2% to 3% - an increase of 50%! When other typical errors (recording etc.) were added, the prevalence went up to 6% (after cleaning with EPI/WHO flags).

Digit preference was the main reason for the famous Pickering-Platt argument about hypertension in the 1960s. The WHO MONICA study of blood pressure devised methods for assessing digit preference and its effects on the outcome, and we have used the same statistical methods in SMART to assess the severity of digit preference. The same problems will occur with MUAC, Hb, and any other measurement that is taken where the proportion of the population in the tail is assessed/reported.
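To illustrate this tail-inflation effect, here is a minimal R sketch. It simulates z-scores directly rather than weights and heights, and the population parameters (mean WHZ -1, SD 1) and the error SD (0.3 z-scores) are illustrative assumptions, not the values used in the simulation described above:

set.seed(1)
whz <- rnorm(n = 1e5, mean = -1, sd = 1)    ## "true" z-scores
err <- rnorm(n = 1e5, mean = 0, sd = 0.3)   ## unbiased random measurement error
whzObs <- whz + err                         ## observed (error-laden) z-scores
round(c(mean(whz), mean(whzObs)), 3)        ## the means are essentially unchanged
round(c(sd(whz), sd(whzObs)), 3)            ## the SD is inflated by the error
mean(whz < -3) * 100                        ## true prevalence below -3 (about 2.3%)
mean(whzObs < -3) * 100                     ## observed prevalence (clearly higher)

The mean is essentially untouched, but the proportion below the cut-off rises because the error widens the distribution.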
Michael Golden

Answered:

15 years ago
Thanks for the explanation. It seems to me, however, that you modelled something different from rounding error (e.g. .1, .2 -> .0; .3, .4, .6, .7 -> .5; and .8, .9 -> the next .0) but maybe I misunderstand. My first thought was ... is W/H really that sensitive to small errors? If so, then we have a problem.
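For what it is worth, here is a quick R sketch of pure rounding error, as distinct from the random error modelled above. The height distribution and the 80 cm cut-off are illustrative assumptions only, not a weight-for-height calculation:

set.seed(1)
ht <- rnorm(n = 1e5, mean = 85, sd = 5)   ## "true" heights (cm)
htR <- round(ht * 2) / 2                  ## rounded to the nearest 0.5 cm
round(c(sd(ht), sd(htR)), 3)              ## rounding adds very little noise
mean(ht < 80) * 100                       ## % below cut-off, true heights
mean(htR < 80) * 100                      ## % below cut-off, rounded heights

Unbiased rounding to the nearest 0.5 cm adds noise with an SD of only about 0.5 / sqrt(12) = 0.14 cm, so the tail inflation it causes is small compared with the 1 cm random error modelled above.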
Mark Myatt
Technical Expert

Answered:

15 years ago
Accurate weight, height, and age information is very important for anthropometric measurements. The effect of rounding can result in a biased estimate (if rounding tends to be in one direction, such as rounding down) and/or poor precision (if rounding up occurs as frequently as rounding down and to the same magnitude). The latter can clearly be seen in weight-for-age z-score distributions in populations where age is not known with accuracy - the z-score distribution will appear squashed down (flattened and spread out). As noted by Michael Golden, this has the effect of increasing the proportion classified as having a z-score below -2 SD. This can be assessed to some extent with the standard deviation of the z-scores in the study population - in the reference population the standard deviation is 1.0. If there are inaccuracies in the data collection, the standard deviation in the study population will be larger than 1.0.
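A small R sketch of this check, using simulated survey values for illustration (SMART guidance, as I understand it, treats a z-score SD well away from 1.0 - a 0.8 to 1.2 range is commonly cited - as a sign of data quality problems; check the current documentation for the exact limits):

set.seed(1)
wazGood <- rnorm(n = 900, mean = -0.8, sd = 1.0)   ## plausible survey z-scores
wazPoor <- wazGood + rnorm(n = 900, sd = 0.75)     ## the same with added measurement error
sd(wazGood)   ## close to the reference value of 1.0
sd(wazPoor)   ## well above 1.0, suggesting measurement error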
Kevin Sullivan

Answered:

15 years ago
An input from Fiona Watson:

The correspondence relating to "digit preference" in relation to nutrition surveys in Ethiopia is of interest to scientists involved in developing survey methodology. However, it should not overshadow the very real practical problems facing those involved in nutrition assessment in Ethiopia. Having been in Ethiopia working with the government Nutrition Coordinating Unit (the ENCU) over the last few months, and having had the opportunity to interview most of the agencies involved in nutrition assessment, the following issues were highlighted.

Too much data is collected (particularly on 'contextual' indicators such as food security, coping strategies, access to health, watsan, caring practices etc.), most of which is not used.

Interpretation of survey results is problematic. The thresholds used to define the severity of a problem are not specific enough (aggravating factors probably apply to most of Ethiopia) and anyway do not always relate to the recommendations made. Furthermore, the lack of baseline or trend data for nutritional status, as well as for 'contextual' indicators, makes interpretation difficult.

Some donors require that surveys are carried out, but the results do not really impact on programming, and yet surveys take up staff time and are costly. Just to note, over 500 surveys were carried out in Ethiopia in the last 5 years, each costing around $15,000 (a total cost in excess of $7.5 million) and taking around 1 month to complete.

The process of doing surveys is time-consuming and standard nutrition surveys are not a useful tool for rapid emergency assessment. However, rapid emergency assessments sampling the worst-off kebeles (districts) and using MUAC can result in non-generalisable findings that do not help programming. There is frustration among many agency and government staff that surveys provide "a fog of information" that isn't helping to determine response or address malnutrition.

Some of the priority issues for practitioners are:

1. Can nutrition surveys be 'streamlined' so that data essential for decision-making can be collected accurately but quickly?

2. Do other forms of nutrition assessment besides surveys need to be introduced (e.g. nutrition surveillance and/or effective and accurate rapid nutrition assessment methods) so that surveys are not repeated too often?

3. How can we develop baselines and/or trend data sets (not just anthropometric but for other 'contextual' indicators) which can be compared with survey findings to aid interpretation?

4. Are there workable threshold-for-response frameworks that can be used/developed to help regional staff with limited capacity to lobby for interventions?

In considering these broader operational issues, it is important to ensure a balance is struck between scientific rigour and practical application if we are to best serve the users of nutrition survey information.
Tamsin Walters
Forum Moderator

Answered:

15 years ago
These are interesting issues that you raise. I'll give some initial thoughts ...

(1) There is a possibility for streamlining surveys. The survey designs and case-definitions that we use at the moment (i.e. two-stage cluster sampled surveys and W/H based case-definitions) are not the only options. In the Biafran Emergency (e.g.) spatial sampling and MUAC/H were used and this proved rapid, cheap, and useful.

(2) Nutrition surveillance has a chequered history. It has been attempted in many settings but has rarely been sustained. Most systems suffered from design problems (e.g. use of ageing cohorts, the wrong indicator, lack of linkages to market data, &c.) and were expensive. A few that I have seen seemed to be treated as ends in themselves, with data collection and analysis being done but seldom driving action. It may be possible to create a viable surveillance system nested within a GMP program. I am at present sketching out a note on a nutritional surveillance system using repeated small-sample surveys and would be very interested in ideas, experiences, requirements, &c.

(3) These are usually available and, even if not, most are pretty easy to collect (e.g. trade indicators). Mostly they are not widely used by INGOs.

(4) Thresholds are the subject of a separate "thread" on this forum.
Mark Myatt
Technical Expert

Answered:

15 years ago
Helen Keller International has been pilot-testing an urban-based nutritional surveillance program in Conakry, Guinea using 200 sentinel sites, funded by the Office of Foreign Disaster Assistance (US Government). Although there are always questions regarding data quality and use, it does have the advantage of being cheap (200 USD per month) and fast (measuring changes and producing reports monthly), giving an indication of changes (up or down) in the overall food security status of very vulnerable populations. The idea is that results from this system could be used to indicate where and when SMART surveys are required. If you'd like more information or an example of the results or methodology, we'd be glad to share it. Cheers - Jen Peterson
Tamsin Walters
Forum Moderator

Answered:

15 years ago
I'd like to have more info on the surveillance system. You can contact me through the forum moderator.
Mark Myatt
Technical Expert

Answered:

15 years ago
Dear Jen Peterson, Regarding the HKI nutritional surveillance programme in Conakry, could you share information on the methodology and the data collected on a monthly basis? Did I understand correctly: 200 sites for USD 200/month? Thanks a lot in advance. Chris (Christoph Andert, Tanzania)
Tamsin Walters
Forum Moderator

Answered:

15 years ago
Dear all, For those interested in the HKI urban-based nutrition and food security surveillance system, Jen Peterson has posted the full documentation, which can be accessed via the link below: [url]http://www.ennonline.net/resources/view.aspx?resid=695[/url] HKI is working on a refined, finalized methodology, so this is work in progress. Most of the documents are in French. Of note: "we anticipated using four sources of data to inform the overall food security system, but we were only directly responsible for collecting the household level data (which is presented here, in the tools and the power point presentation). We only had 9 months or so to get this up and running, as our funding was limited to 12 months. However, I understand funding was extended, and hopefully the additional sources of data (monthly prices, SENAH qualitative information, and the information from the SNIG) will eventually become available, to develop a fuller food security picture." If you'd like more information, or more in English, please don't hesitate to contact Jen and/or Dr. Lanfia Touré, HKI/Guinea, Dr. Mary Hodges or Shawn Baker, who may be able to provide an update since Jen no longer works at HKI. Please contact me via post@en-net.org.uk if you would like their email addresses. Many thanks Tamsin
Tamsin Walters
Forum Moderator

Answered:

15 years ago

A query from a client about digit preference in some MUAC screening data got me thinking about the digit preference issue a bit more.

I got to thinking that the PROBIT estimator that is used in RAM, RAM-OP, S3M, and the LP surveillance system might be more resistant to digit preference than the classical estimator that is used in SMART, MICS, DHS, and the rest.

I decided to test this with a simple R language simulation which I repeat here.

Here we generate a population:
pop <- rnorm(n = 1000, mean = 140, sd = 11)
summary(pop)
The summary is:
Min. 1st Qu. Median   Mean 3rd Qu.   Max.
104.0  131.7  139.2  139.4  146.6  173.8
See the figure at the end of this post.

GAM is:
table(pop < 125)
which gives:
FALSE  TRUE
  902    98
which is 9.8%.

SAM is:
table(pop < 115)
which gives:
FALSE  TRUE
  985    15
which is 1.5%.

MAM is:
9.8% - 1.5% = 8.3%

These we consider to be the true prevalences.

If we "pollute" this data by forcing all data to end with “0” or “5”
popDP <- round(pop / 5) * 5

The first 15 values in popDP:
popDP[1:15]
shows:
135 145 160 125 150 135 150 145 120 155 155 145 135 135 135
This is extreme digit preference. The digit preference score (from SMART) for this data is 66.68. We'd probably dismiss this data as junk.
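As an aside, a digit preference score in the style used by SMART and the WHO MONICA project can be sketched in a few lines of R. This formulation - a chi-square statistic on the terminal digits, scaled to run from 0 (no preference) to 100 (a single digit used throughout) - is my reconstruction; check the SMART documentation for the exact definition:

digitPreferenceScore <- function(x) {
  finalDigit <- x %% 10                               ## terminal digit of each value
  observed <- table(factor(finalDigit, levels = 0:9)) ## counts of the digits 0 to 9
  chi2 <- chisq.test(observed)$statistic              ## test against a uniform expectation
  as.numeric(100 * sqrt(chi2 / (9 * sum(observed))))  ## scale to 0..100 (df = 9 for ten digits)
}
digitPreferenceScore(popDP)   ## about 66.7, matching the score reported above

Note that this assumes integer-valued data such as popDP; for data recorded to one decimal place the terminal digit would be round((x * 10) %% 10).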

The "pollution" does not change the summary statistics much:
summary(popDP)
gives:
Min. 1st Qu. Median   Mean 3rd Qu.   Max.
105.0  130.0  140.0  139.4  145.0  175.0
Also see the figure at the end of this post.

GAM is now:
table(popDP < 125)
which gives:
FALSE  TRUE
  935    65
which is 6.5% (down from the true 9.8%).

SAM is:
table(popDP < 115)
which gives:
FALSE  TRUE
  991     9
which is 0.9% (down from the true 1.5%).

MAM is:
6.5% - 0.9% = 5.6%
down from the true 8.3%.

Extreme digit preference in this simulation causes large underestimates. Not good.

If we use PROBIT estimators … for GAM:
pnorm(125, mean = mean(popDP), sd = sd(popDP)) * 100
we get:
10.01426
That is 10.0% (close to the true 9.8%).

For SAM:
pnorm(115, mean = mean(popDP), sd = sd(popDP)) * 100
we get:
1.499806
That is 1.5% (exactly the same as the true 1.5%).

MAM is:
10.0% - 1.5% = 8.5% (close to the true 8.3%)
This is a simple model which assumes normality and rounding so that all values end in 0 or 5. Real-life data might show some deviation from normality. In that case we can transform the data towards normality and then apply the PROBIT estimator.

PROBIT appears to be robust to problems of digit preference. It seems to me that, if true, this is an important finding.
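For convenience, the PROBIT estimator used above can be wrapped in a small helper function (the name probitPrevalence is mine, not a function from the RAM or S3M software):

probitPrevalence <- function(x, cutoff) {
  pnorm(q = cutoff, mean = mean(x), sd = sd(x)) * 100   ## % expected below the cut-off
}
probitPrevalence(popDP, 125)                                  ## GAM (%)
probitPrevalence(popDP, 115)                                  ## SAM (%)
probitPrevalence(popDP, 125) - probitPrevalence(popDP, 115)   ## MAM (%)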

Here are the true (pop) and "polluted" (popDP) distributions:

[Figure: histograms of the pop and popDP distributions - image not available]

Mark Myatt
Technical Expert

Answered:

7 years ago