Hi Everyone, I am working on cleaning MUAC measurements and seeing some improbably high and low cm values. Does anyone have a research paper or text that specifies at what cm values the data should be considered outliers?

Thanks in advance!!

Hi Emily, 

If you have age, your best bet would be to calculate MUAC-for-age Z-scores (MUACAZ) for each individual. WHO defines biologically implausible MUACAZs as any that are >5 or <-5. 

You can calculate these using the WHO iGrowup package available on Stata, SAS, or R. If you Google "who igrowup macro" you will find what you are looking for. 
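Once MUACAZ has been calculated (for example with the igrowup macros mentioned above), applying the WHO rule is a simple comparison. Below is a minimal Python sketch, assuming the z-scores are already available as a list. The function name `flag_who_muacaz` and the sample values are made up for illustration, and missing z-scores (`None`) are left unflagged:

```python
def flag_who_muacaz(z_scores, limit=5.0):
    """Return True for each MUAC-for-age z-score below -limit or above +limit."""
    # Missing z-scores (None) cannot be judged, so they are not flagged here.
    return [z is not None and abs(z) > limit for z in z_scores]


# Hypothetical z-scores, for illustration only.
muacaz = [-1.2, 0.4, -5.6, 6.1, 4.9, None]
who_flags = flag_who_muacaz(muacaz)

for z, flagged in zip(muacaz, who_flags):
    print(z, "implausible" if flagged else "kept")
```

Flagged records are probably best reviewed rather than silently dropped.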

I hope this helps.

—Trenton

Dr. Trenton Dailey-Chwalibóg

Answered:

4 years ago

Hi Emily,

While I don't know of any paper, the following springs to mind: check the mm range of the MUAC strip and use that as a first indication of what is plausible (keep in mind that strip length differs for kids and adults). I'm not sure where you conducted the survey or what challenges the population is facing, but try not to underestimate the double burden, where you could find both very low and extremely high values. I'm also not sure whether your survey covers kids or adults; I think there are different types of tapes for adults, and some might actually be too short (I experienced that in an urban poor setting). If you have confirmed measurements with extreme values ("I saw it, we cross-checked several times with different enumerators"...), I'd look at them carefully and not exclude them as outliers (unless you have a special analytical reason). We should not try to force the dataset into a normal or any other desired distribution, but rather have a dataset with correct values and adjust our analysis methods & interpretation accordingly.

Are you talking about the actual MUAC value or extremes of MUAC-related indicators like MUAC-for-age? For the latter you can work with Z-score ranges.

Good luck & all the best,
Gudrun

Gudrun Stallkamp

Answered:

4 years ago

I think you could use a method like that used in SMART/ENA, which uses the collected data to define outliers. SMART uses the mean and standard deviation of anthropometric indices calculated from the collected data to decide which observations are more likely to be outliers than true values. In this scheme, we exclude (flag) values that lie more than 3 z-scores from the mean z-score, so that values of (e.g.) WHZ that are below:

    mean(WHZ) - 3

or above:

    mean(WHZ) + 3

are flagged as implausible values.
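As a rough illustration of the rule as described above (the observed mean plus or minus 3 z-scores), here is a short Python sketch. It is not the ENA software itself; the function name `flag_smart` and the WHZ values are hypothetical:

```python
from statistics import mean


def flag_smart(z_scores, width=3.0):
    """Flag z-scores lying more than `width` z-scores from the sample mean."""
    centre = mean(z_scores)
    lower, upper = centre - width, centre + width
    return [z < lower or z > upper for z in z_scores]


# Hypothetical WHZ values, for illustration only.
whz = [-1.0, -0.5, 0.0, 0.5, -4.8, -1.2]
smart_flags = flag_smart(whz)

for z, flagged in zip(whz, smart_flags):
    print(z, "flagged" if flagged else "kept")
```

Note that the window is centred on the mean of the data in hand, so it shifts with the population being surveyed.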

I think this approach may be problematic as it assumes that SD(WHZ) = 1 and it assumes the distribution of WHZ is normal. Work done by members of the SMART team shows that these assumptions are reasonable with WHZ usually normally distributed (with only slight deviations from normality) and SD(WHZ) tending to be limited to between 0.8 and 1.2 z-scores and centered around 1 z-score.

This does not, however, help us much with your problem, as we tend to use raw measurement values rather than z-scores with MUAC. An analogous approach would be to flag MUAC values that lie outside of mean(MUAC) +/- 3 * SD(MUAC). The problem here is that both the mean and the SD are strongly influenced by outlier values. It might be better, therefore, to use more robust summary measures, such as the median (as an alternative to the mean) and the median absolute deviation multiplied by 1.4826 (as an alternative to the SD).
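A minimal Python sketch of this robust variant, assuming raw MUAC values in cm. The name `flag_robust`, the multiplier `k = 3` (carried over from the mean +/- 3 * SD rule) and the sample data are assumptions for illustration:

```python
from statistics import median


def flag_robust(values, k=3.0):
    """Flag values more than k robust SDs (1.4826 * MAD) from the median."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    robust_sd = 1.4826 * mad
    if robust_sd == 0:
        # At least half the values equal the median; the rule cannot be applied.
        return [False for _ in values]
    return [abs(v - med) > k * robust_sd for v in values]


# Hypothetical MUAC values in cm, for illustration only.
muac_cm = [13.2, 14.0, 13.8, 14.5, 15.0, 12.8, 13.6, 14.2, 7.5, 31.0]
robust_flags = flag_robust(muac_cm)
flagged_values = [v for v, f in zip(muac_cm, robust_flags) if f]
print(flagged_values)
```

Because the median and MAD barely move when a few extreme values are present, the two extreme readings here do not widen the window that is used to judge them.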

A common alternative is to use the method used with boxplots. The "box" on boxplots ranges between the lower and upper quartiles of the data. The "whiskers" on boxplots usually extend to 1.5 times the interquartile range from the ends of the box (i.e. the lower and upper quartiles). This is known as the "inner fence". Data points that are outside the inner fence are considered to be mild outliers. The "1.5" can be increased to 3.0. This defines the "outer fence", which can be used to identify extreme outliers. I tend to think of mild outliers as possible outliers and extreme outliers as probable outliers.
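The fences can be sketched in a few lines of Python. This is an illustration only: the function name `classify_fences`, the labels and the sample data are made up, and quartiles are computed with the default method of `statistics.quantiles` (other quartile definitions give slightly different fences):

```python
from statistics import quantiles


def classify_fences(values):
    """Label each value 'ok', 'possible' (outside the inner fence)
    or 'probable' (outside the outer fence)."""
    q1, _, q3 = quantiles(values, n=4)  # lower quartile, median, upper quartile
    iqr = q3 - q1
    inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outer_low, outer_high = q1 - 3.0 * iqr, q3 + 3.0 * iqr
    labels = []
    for v in values:
        if v < outer_low or v > outer_high:
            labels.append("probable")
        elif v < inner_low or v > inner_high:
            labels.append("possible")
        else:
            labels.append("ok")
    return labels


# Hypothetical MUAC values in cm, for illustration only.
muac_sample = [12.8, 13.2, 13.6, 13.8, 14.0, 14.2, 14.5, 15.0, 18.5, 7.5, 31.0]
fence_labels = classify_fences(muac_sample)

for v, label in zip(muac_sample, fence_labels):
    print(v, label)
```

Values labelled "possible" are candidates for checking against field records; "probable" ones deserve a closer look before any decision to exclude.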

See these sections:

Box Plot 

What are outliers in the data? 

Detection of Outliers

of the NIST/SEMATECH Handbook of Statistical Methods. Links on these pages lead to tests that can help identify outliers.

You could use the SMART method with MUAC/A z-scores.

I hope this is of some use.

Mark Myatt
Technical Expert

Answered:

4 years ago

Hi Emily,

No argument with the previous posts on data cleaning your MUAC measurements. From a practical point of view, you need to consider the context too. In extremely malnourished children (e.g. in some famine situations) with very low MUAC and very little body fat, it becomes very difficult to measure MUAC without folding the skin, thus affecting accuracy.

In a normal context, accurately measured MUACs below 9cm in children 6-59 months old should be very infrequent (and tend toward poor outcomes). Accurately measured MUACs less than 8cm are rare. Most examples of MUACs between 7 and 8cm that I have personally verified in the field were false readings (MUAC tape too tight). I have measured a child with a MUAC of about 7.5cm+ in an emergency situation, but it was difficult to get an accurate reading because of the skin folding.

If this was survey data and the enumerators were standardised, take a look at the data from the standardisation exercise and see if the accuracy was within the stated parameters. If this is raw field data, you might consider the context and the reliability of the source of measurement and make a judgement call.

Paul Binns
Technical Expert

Answered:

4 years ago

Hi Everyone, 


Thank you so much for your quick follow up! In my paper we are reporting both MUAC z-scores, which were easy to clean using the WHO cut-offs, and raw MUAC in cm.

I did find a similar thread from Severine (https://www.en-net.org/question/233.aspx), who was also looking for appropriate cutoffs/extreme values. I believe she ended up using 8.5 and 20 cm? I am not working with data from an emergency context, so I do not expect to see many severely or even moderately malnourished children in this population. What would be the upper end of expected cm values for children 6 months to 5 years? This population may have some overweight children, but I am not sure how high a value has to be before it is improbable and should be removed from analysis.

Additionally, does anyone have outlier/cutoff criteria they've used for MUAC data from 5-18 year olds? We are also considering an analysis for older children.

Thank you so much!!

Emily DeLacey

Answered:

4 years ago

Hi Emily

In our experience, cutoffs are more straightforward for surveys of a relatively normal population. For an expected abnormal population, we use the same method suggested by Mark Myatt of calculating the standard deviation (SD) within our sample population to look for outliers. For MUAC there is some change with age, so the ideal is to use MUAC z-scores and calculate the mean and SD for those values. For undernourished infants and children, many of the low values are true on rechecking, so we generally apply a more liberal cutoff on the low side, e.g. -5. We apply the same method to changes between two time points in longitudinal studies.

Please let me know if you need any further information.

Cheers

Jay

Jay Berkley
Technical Expert

Answered:

4 years ago

I think Jay makes a good point. The flagging rules for SMART and the WHO flagging rules used in DHS / MICS are suited to survey data. They are not suited to data from (e.g.) CMAM program admissions, as these will likely include a great many extreme low values ... even below WHO biologically plausible values ... as we are treating children with very high risk of near term death.

Mark Myatt
Technical Expert

Answered:

4 years ago