Correlation matrix

Problem

You want to visualize the strength of correlations among many variables.

Solution

Suppose this is your data:

  set.seed(955)
  vvar <- 1:20 + rnorm(20, sd=3)
  wvar <- 1:20 + rnorm(20, sd=5)
  xvar <- 20:1 + rnorm(20, sd=3)
  yvar <- (1:20)/2 + rnorm(20, sd=10)
  zvar <- rnorm(20, sd=6)

  # A data frame with multiple variables
  data <- data.frame(vvar, wvar, xvar, yvar, zvar)
  head(data)
  #>        vvar       wvar     xvar       yvar      zvar
  #> 1 -4.252354  5.1219288 16.02193 -15.156368 -4.086904
  #> 2  1.702318 -1.3234340 15.83817 -24.063902  3.468423
  #> 3  4.323054 -2.1570874 19.85517   2.306770 -3.044931
  #> 4  1.780628  0.7880138 17.65079   2.564663  1.449081
  #> 5 11.537348 -1.3075994 10.93386   9.600835  2.761963
  #> 6  6.672130  2.0135190 15.24350  -3.465695  5.749642

To make the graph, compute the correlation table and pass it to plotcorr() from the ellipse package:

  library(ellipse)

  # Make the correlation table
  ctab <- cor(data)
  round(ctab, 2)
  #>       vvar  wvar  xvar  yvar  zvar
  #> vvar  1.00  0.61 -0.85  0.75 -0.21
  #> wvar  0.61  1.00 -0.81  0.54 -0.31
  #> xvar -0.85 -0.81  1.00 -0.63  0.24
  #> yvar  0.75  0.54 -0.63  1.00 -0.30
  #> zvar -0.21 -0.31  0.24 -0.30  1.00

  # Make the graph, with reduced margins
  plotcorr(ctab, mar = c(0.1, 0.1, 0.1, 0.1))

  # Do the same, but with colors corresponding to value
  colorfun <- colorRamp(c("#CC0000","white","#3366CC"), space="Lab")
  plotcorr(ctab, col=rgb(colorfun((ctab+1)/2), maxColorValue=255),
           mar = c(0.1, 0.1, 0.1, 0.1))

[Figures: the two correlation plots produced by the plotcorr() calls, first in the default style and then colored by correlation value]
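The color argument in the second plotcorr() call is compact, so it may help to see the mapping on its own. The following is a minimal sketch using only base R; the vector r and the names scaled, rgb_values, and hex are illustrative and do not come from the recipe. It shows how correlation values in [-1, 1] are rescaled to [0, 1], turned into red/green/blue values by the ramp function, and then converted to color strings.

```r
# Same palette as in the recipe: red for -1, white for 0, blue for +1
colorfun <- colorRamp(c("#CC0000", "white", "#3366CC"), space = "Lab")

# A few example correlation values (illustrative, not taken from the data)
r <- c(-1, -0.5, 0, 0.5, 1)

# Functions made by colorRamp() expect inputs in [0, 1],
# so rescale the correlations from [-1, 1]
scaled <- (r + 1) / 2

# The ramp returns a matrix with one row per input value and
# three columns holding red, green, and blue on a 0-255 scale
rgb_values <- colorfun(scaled)

# rgb() converts that three-column matrix into hex color strings
hex <- rgb(rgb_values, maxColorValue = 255)
names(hex) <- r
hex
```

In the recipe the same steps are applied to the whole matrix ctab at once, which should yield one color per cell of the correlation table.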

Notes

For more information on generating the correlation table (with numbers), see: ../../Statistical analysis/Regression and correlation