Hello, could someone guide me to make a Venn diagram with R software? I would like to represent children who accumulate several anthropometric deficits (MUAC<125, WAZ<-2, HAZ<-2, WFH<-2).

I found several packages for this but I can't seem to find explanations on how to parameterize the samples directly from the database.

Thanks in advance for your guidance.

Hi Cecile, 

There is a package called ggVennDiagram, and a walkthrough for it is available at this link: https://www.datanovia.com/en/blog/beautiful-ggplot-venn-diagram-with-r/

There is also an R users humanitarian Skype group that may be better suited for this question and troubleshooting in general if it continues to be a challenge: https://join.skype.com/iKzkeiOdnJip

Hope this helps,

Anonymous

Answered:

2 years ago

There are a number of packages supporting the production of Venn diagrams (e.g. ggvenn, ggVennDiagram, VennDiagram, gplots). I have been using gplots (see below). All require you to count cases for inclusion in each cell. I think this is what may be tripping you up. Here is an example of code that I used for plots similar to those presented in:

Myatt, M., Khara, T., Dolan, C., Garenne, M. & Briend, A. Improving screening for malnourished children at high risk of death: a study of children aged 6–59 months in rural Senegal. Public Health Nutr 22, 862–871 (2018).

The problem here is to plot the number of deaths associated with different anthropometric case definitions. Here is some example code ...

        ## Plotting functions (including Venn diagrams)
        library(gplots)

        ##
        ## Function to give Venn diagram counts (might be useful .. see below)
        ##
        vennCounts <- function(x, include = "both")
        {
            x <- as.matrix(x)
            include <- match.arg(include,c("both","up","down"))
            x <- sign(switch(include, both = abs(x), up = x > 0, down = x < 0))
            nprobes <- nrow(x)
            ncontrasts <- ncol(x)
            group.names <- colnames(x)
            if(is.null(group.names))
            {
                group.names <- paste("Group", 1:ncontrasts)
            }
            noutcomes <- 2^ncontrasts
            outcomes <- matrix(0, noutcomes, ncontrasts)
            colnames(outcomes) <- group.names
            for (j in 1:ncontrasts)
            {
                outcomes[,j] <- rep(0:1, times = 2^(j - 1), each=2^(ncontrasts - j))

            }    
            xlist <- list()
            for (i in 1:ncontrasts)
            {
                xlist[[i]] <- factor(x[ ,ncontrasts - i + 1],levels=c(0, 1))
            }
            counts <- as.vector(table(xlist))
            result <- structure(cbind(outcomes, Counts = counts), class = "VennCounts")
            return(result)
        }

        ## Select deaths (data are in data.frame 'x')
        x <- subset(x, outcome == 1)

        ## Truth table for Venn diagram
        vennData <- cbind(x$whz < -3, x$waz < -3 , x$muac < 115, x$whz < -2 & x$haz < -2)
        ## Label each column of the truth table
        colnames(vennData) <- c("WHZ", "WAZ", "MUAC", "WAST")

        ## Plot the Venn diagram (using try() suppresses messages)
        try(venn(vennData, intersections = TRUE, small = NULL), silent = TRUE)

        ## We can present the diagram as a table using vennCounts()
        vennCounts(vennData)

        ## We can work with these counts. For example:
        vennDeath <- vennCounts(vennData)

        deathsNotPredicted <- vennDeath[1,ncol(vennDeath)]
        allDeaths <- sum(vennDeath[,ncol(vennDeath)])
        (deathsNotPredicted)
        (allDeaths)
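
The code above assumes a data.frame 'x' already exists. As a minimal sketch of the "count cases for each cell" step on its own, the following uses simulated data (all values are hypothetical and for illustration only) and base R's table() in place of vennCounts(). The column names mirror those used above, and the object names 'deaths', 'cellCounts' and 'notDetected' are made up for this example.

```r
## Simulated data (hypothetical values, for illustration only)
set.seed(1)
n <- 200
x <- data.frame(
    whz     = rnorm(n, mean = -1.5, sd = 1.2),
    waz     = rnorm(n, mean = -1.8, sd = 1.1),
    haz     = rnorm(n, mean = -1.7, sd = 1.3),
    muac    = rnorm(n, mean = 128, sd = 12),
    outcome = rbinom(n, size = 1, prob = 0.2)
)

## Keep deaths only, then build the logical truth table
deaths <- subset(x, outcome == 1)
vennData <- cbind(
    WHZ  = deaths$whz < -3,
    WAZ  = deaths$waz < -3,
    MUAC = deaths$muac < 115,
    WAST = deaths$whz < -2 & deaths$haz < -2
)

## Force both levels so that all 2^4 = 16 cells appear, even if empty
cells <- lapply(as.data.frame(vennData), factor, levels = c(FALSE, TRUE))

## One row per combination of case definitions, with its count in 'Freq'
cellCounts <- as.data.frame(table(cells))
cellCounts

## Deaths not identified by any of the four case definitions
notDetected <- sum(rowSums(vennData) == 0)
notDetected
```

The first row of 'cellCounts' (all FALSE) corresponds to the cell outside every circle, which is the same quantity as 'deathsNotPredicted' above.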

If you try this you might find that (1) the plot is a bit small on the page and (2) the labels are a bit "scrappy". You can fix (1) by setting the margins using par(mar = ...) or par(mai = ...). You can fix (2) by setting the label character expansion and by specifying other labels, adding spaces and newlines as needed.
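
A rough sketch of both fixes follows. It assumes the gplots package is installed, uses simulated indicator data, and treats par(cex = ...) as the way to set the character expansion; the margin and cex values and the label text are arbitrary choices for illustration, not recommended settings.

```r
## Simulated indicator data (hypothetical, for illustration only)
set.seed(2)
n <- 100
vennData <- data.frame(
    a = runif(n) < 0.30,
    b = runif(n) < 0.35,
    c = runif(n) < 0.25,
    d = runif(n) < 0.20
)

## Replace the default labels; "\n" adds a line break inside a label
names(vennData) <- c("WHZ", "WAZ", "MUAC", "Wasted and\nstunted")

## Plot only if gplots is available (assumption: it is installed)
if (requireNamespace("gplots", quietly = TRUE)) {
    ## Shrink the margins and enlarge the text, saving the old settings
    oldPar <- par(mar = c(0.5, 0.5, 0.5, 0.5), cex = 1.2)
    gplots::venn(vennData)
    ## Restore the previous graphics settings
    par(oldPar)
}
```

Here venn() is given a data.frame of logical indicators, which is one of the input forms it documents.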

I have code (an R function using base graphics) that will plot Venn diagrams without the need for an external library. I used this for:

Myatt, M. et al. Children who are both wasted and stunted are also underweight and have a high risk of death: a descriptive epidemiology of multiple anthropometric deficits using data from 51 countries. Archives of Public Health 1–11 (2018) doi:10.1186/s13690-018-0277-1.

I can share this code if that would be useful.

I hope this is of help to you.

Mark Myatt
Technical Expert

Answered:

2 years ago

Thanks a lot for your answers. In the end, I opted for the ggVennDiagram package, which is quite simple to use, and selecting my population with the dplyr package is very quick.

I am sharing the program here in case it is useful to other people.

        library(dplyr)
        library(ggVennDiagram)

        ## One vector of patient IDs per deficit
        x <- list(
            ALL_INCLU2 %>% filter(MUAC < 125) %>% pull(pat_num),
            ALL_INCLU2 %>% filter(WAZ < -2) %>% pull(pat_num),
            ALL_INCLU2 %>% filter(HAZ < -2) %>% pull(pat_num),
            ALL_INCLU2 %>% filter(WHZ < -2) %>% pull(pat_num)
        )

        ## Plot, labelling the sets in the same order as the list
        ggVennDiagram(
            x, label_alpha = 0,
            category.names = c("MUAC<125", "WAZ<-2", "HAZ<-2", "WHZ<-2")
        ) +
            ggplot2::scale_fill_gradient(low = "white", high = "orange")

The graphics output is great.
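
Before plotting, it can help to check the sets themselves. The following is a minimal sketch in base R, using a simulated stand-in for ALL_INCLU2 (the values are hypothetical; only the column names come from the program above). The object names 'sets', 'allFour' and 'anyDeficit' are made up for this example.

```r
## Simulated stand-in for ALL_INCLU2 (hypothetical values)
set.seed(3)
n <- 300
ALL_INCLU2 <- data.frame(
    pat_num = seq_len(n),
    MUAC    = rnorm(n, mean = 130, sd = 12),
    WAZ     = rnorm(n, mean = -1.6, sd = 1.1),
    HAZ     = rnorm(n, mean = -1.5, sd = 1.3),
    WHZ     = rnorm(n, mean = -1.2, sd = 1.1)
)

## One vector of patient IDs per deficit; which() drops missing values
sets <- list(
    "MUAC<125" = ALL_INCLU2$pat_num[which(ALL_INCLU2$MUAC < 125)],
    "WAZ<-2"   = ALL_INCLU2$pat_num[which(ALL_INCLU2$WAZ < -2)],
    "HAZ<-2"   = ALL_INCLU2$pat_num[which(ALL_INCLU2$HAZ < -2)],
    "WHZ<-2"   = ALL_INCLU2$pat_num[which(ALL_INCLU2$WHZ < -2)]
)

## Number of children with each deficit
lengths(sets)

## Children with all four deficits, and with at least one
allFour    <- Reduce(intersect, sets)
anyDeficit <- Reduce(union, sets)
length(allFour)
length(anyDeficit)
```

The count of children with all four deficits should match the number shown in the central cell of the diagram.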

Thank you very much, Mark, for your precise guidance and for your references (which I was already planning to use as a basis for future analyses).

Anonymous

Answered:

2 years ago