Axes (ggplot2)

Problem

You want to change the order or direction of the axes.

Solution

Note: In the examples below, where it says something like scale_y_continuous, scale_x_continuous, or ylim, the y can be replaced with x (and vice versa) if you want to operate on the other axis.

This is the basic boxplot that we will work with, using the built-in PlantGrowth data set.

    library(ggplot2)

    bp <- ggplot(PlantGrowth, aes(x=group, y=weight)) +
        geom_boxplot()
    bp


Swapping X and Y axes

Swap x and y axes (make x vertical, y horizontal):

    bp + coord_flip()


Discrete axis

Changing the order of items

    # Manually set the order of a discrete-valued axis
    bp + scale_x_discrete(limits=c("trt1","trt2","ctrl"))

    # Reverse the order of a discrete-valued axis
    # Get the levels of the factor
    flevels <- levels(PlantGrowth$group)
    flevels
    #> [1] "ctrl" "trt1" "trt2"

    # Reverse the order
    flevels <- rev(flevels)
    flevels
    #> [1] "trt2" "trt1" "ctrl"

    bp + scale_x_discrete(limits=flevels)

    # Or it can be done in one line:
    bp + scale_x_discrete(limits = rev(levels(PlantGrowth$group)))

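An alternative to setting limits on the scale is to change the order of the factor levels in the data itself, since a discrete axis follows the level order by default. The sketch below works on a copy of PlantGrowth; the names pg and bp_reordered are illustrative and do not appear elsewhere in this recipe.

```r
library(ggplot2)

# Work on a copy so the built-in data set is left untouched
pg <- PlantGrowth
pg$group <- factor(pg$group, levels = c("trt1", "trt2", "ctrl"))
levels(pg$group)
#> [1] "trt1" "trt2" "ctrl"

# The x axis now follows the new level order without a scale_x_discrete() call
bp_reordered <- ggplot(pg, aes(x = group, y = weight)) + geom_boxplot()
```

Print bp_reordered to draw the plot. This approach may be preferable when the same ordering should apply to several plots made from the same data.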

Setting tick mark labels

For discrete variables, the tick mark labels are taken directly from levels of the factor. However, sometimes the factor levels have short names that aren’t suitable for presentation.

    bp + scale_x_discrete(breaks=c("ctrl", "trt1", "trt2"),
                          labels=c("Control", "Treat 1", "Treat 2"))

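As a possible variation, the labels can be given as a named vector, so that each label is tied to a factor level by name rather than by position. This is a sketch that assumes a reasonably recent ggplot2 version; the names tick_labels and bp_named are illustrative.

```r
library(ggplot2)

bp <- ggplot(PlantGrowth, aes(x = group, y = weight)) + geom_boxplot()

# Names are the factor levels, values are the labels to display
tick_labels <- c(ctrl = "Control", trt1 = "Treat 1", trt2 = "Treat 2")
bp_named <- bp + scale_x_discrete(labels = tick_labels)
```

Print bp_named to draw the plot. Matching by name keeps the labels attached to the right groups even if the order of the axis is changed later.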

    # Hide x tick marks, labels, and grid lines
    bp + scale_x_discrete(breaks=NULL)

    # Hide all tick marks and labels (on X axis), but keep the gridlines
    bp + theme(axis.ticks = element_blank(), axis.text.x = element_blank())


Continuous axis

Setting range and reversing direction of an axis

If you simply want to make sure that an axis includes a particular value in the range, use expand_limits(). This can only expand the range of an axis; it can’t shrink the range.

    # Make sure to include 0 in the y axis
    bp + expand_limits(y=0)

    # Make sure to include 0 and 8 in the y axis
    bp + expand_limits(y=c(0,8))


You can also explicitly set the y limits. Note that if any scale_y_continuous command is used, it overrides any ylim command, and the ylim will be ignored.

    # Set the range of a continuous-valued axis
    # These are equivalent
    bp + ylim(0, 8)
    # bp + scale_y_continuous(limits=c(0, 8))


If the y range is reduced using the method above, the data outside the range is ignored. This might be OK for a scatterplot, but it can be problematic for the box plots used here. For bar graphs, if the range does not include 0, the bars will not show at all!

To avoid this problem, you can use coord_cartesian instead. Instead of setting the limits of the data, it sets the viewing area of the data.

    # These two do the same thing; all data points outside the graphing range are
    # dropped, resulting in a misleading box plot
    bp + ylim(5, 7.5)
    #> Warning: Removed 13 rows containing non-finite values (stat_boxplot).
    # bp + scale_y_continuous(limits=c(5, 7.5))

    # Using coord_cartesian "zooms" into the area
    bp + coord_cartesian(ylim=c(5, 7.5))

    # Specify tick marks directly
    bp + coord_cartesian(ylim=c(5, 7.5)) +
        scale_y_continuous(breaks=seq(0, 10, 0.25))  # Ticks from 0-10, every .25

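To see why the ylim(5, 7.5) version is misleading, it can help to reproduce the filtering by hand. The sketch below uses only base R and counts the observations that fall outside the limits, then compares the median of the trt1 group before and after those observations are dropped. The variable names are illustrative.

```r
# Count the observations that ylim(5, 7.5) would discard before the
# boxplot statistics are computed
outside <- PlantGrowth$weight < 5 | PlantGrowth$weight > 7.5
sum(outside)
#> [1] 13

# Compare the trt1 median on the full data with the median of the values
# that survive the limits; the second one is what the box plot would show
trt1_all  <- PlantGrowth$weight[PlantGrowth$group == "trt1"]
trt1_kept <- trt1_all[trt1_all >= 5 & trt1_all <= 7.5]
median(trt1_all)
median(trt1_kept)
```

The count matches the 13 rows reported in the warning above. With coord_cartesian, none of these rows are removed, so the box plot statistics are still computed from the complete data.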

Reversing the direction of an axis

    # Reverse order of a continuous-valued axis
    bp + scale_y_reverse()


Setting and hiding tick markers

    # Setting the tick marks on an axis
    # This will show tick marks on every 0.25 from 1 to 10
    # The scale will show only the ones that are within range (3.50-6.25 in this case)
    bp + scale_y_continuous(breaks=seq(1,10,1/4))

    # The breaks can be spaced unevenly
    bp + scale_y_continuous(breaks=c(4, 4.25, 4.5, 5, 6,8))

    # Suppress ticks and gridlines
    bp + scale_y_continuous(breaks=NULL)

    # Hide tick marks and labels (on Y axis), but keep the gridlines
    bp + theme(axis.ticks = element_blank(), axis.text.y = element_blank())


Axis transformations: log, sqrt, etc.

By default, the axes are linearly scaled. It is possible to transform the axes with log, power, roots, and so on.

There are two ways of transforming an axis. One is to use a scale transform, and the other is to use a coordinate transform. With a scale transform, the data is transformed before properties such as breaks (the tick locations) and range of the axis are decided. With a coordinate transform, the transformation happens after the breaks and scale range are decided. This results in different appearances, as shown below.

    # Create some noisy exponentially-distributed data
    set.seed(201)
    n <- 100
    dat <- data.frame(
        xval = (1:n+rnorm(n,sd=5))/20,
        yval = 2*2^((1:n+rnorm(n,sd=5))/20)
    )

    # A scatterplot with regular (linear) axis scaling
    sp <- ggplot(dat, aes(xval, yval)) + geom_point()
    sp

    # log2 scaling of the y axis (with visually-equal spacing)
    library(scales)     # Need the scales package
    sp + scale_y_continuous(trans=log2_trans())

    # log2 coordinate transformation (with visually-diminishing spacing)
    sp + coord_trans(y="log2")

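A transformation such as log2_trans() is an object that bundles a forward function with its inverse. The sketch below, which assumes the scales package is installed, calls the two functions directly to show what happens to the values; the name t2 is illustrative.

```r
library(scales)

t2 <- log2_trans()

# Forward transformation: with a scale transform, this is applied to the
# data before the breaks and the range of the axis are decided
t2$transform(c(2, 8, 32))
#> [1] 1 3 5

# Inverse transformation: maps transformed values back to the original units
t2$inverse(c(1, 3, 5))
#> [1]  2  8 32
```

Equal steps in the transformed values (1, 3, 5) correspond to multiplicative steps in the original values (2, 8, 32), which is why breaks chosen on the transformed scale appear evenly spaced.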

With a scale transformation, you can also set the axis tick marks to show exponents.

    sp + scale_y_continuous(trans = log2_trans(),
                            breaks = trans_breaks("log2", function(x) 2^x),
                            labels = trans_format("log2", math_format(2^.x)))


Many transformations are available. See ?trans_new for a full list. If the transformation you need isn’t on the list, it is possible to write your own transformation function.
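As a rough sketch of writing your own transformation, the example below builds a cube-root transformation with trans_new() and applies it to the y axis. The transformation name "cube_root", the objects cube_root_trans, cube_dat, and p_cube, and the sample data are all made up for illustration.

```r
library(ggplot2)
library(scales)

# Hypothetical cube-root transformation; sign() and abs() keep the forward
# function defined for negative values as well
cube_root_trans <- trans_new(
  name      = "cube_root",
  transform = function(x) sign(x) * abs(x)^(1/3),
  inverse   = function(x) x^3
)

# Illustrative data in which y grows as the cube of x
cube_dat <- data.frame(xval = 1:10, yval = (1:10)^3)

p_cube <- ggplot(cube_dat, aes(xval, yval)) +
  geom_point() +
  scale_y_continuous(trans = cube_root_trans)
```

Print p_cube to draw the plot. Because yval is the cube of xval, the points should fall on a straight line once the y axis is transformed.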

A couple of scale transformations have convenience functions: scale_y_log10 and scale_y_sqrt (with corresponding versions for x).

    set.seed(205)
    n <- 100
    dat10 <- data.frame(
        xval = (1:n+rnorm(n,sd=5))/20,
        yval = 10*10^((1:n+rnorm(n,sd=5))/20)
    )

    sp10 <- ggplot(dat10, aes(xval, yval)) + geom_point()

    # log10
    sp10 + scale_y_log10()

    # log10 with exponents on tick labels
    sp10 + scale_y_log10(breaks = trans_breaks("log10", function(x) 10^x),
                         labels = trans_format("log10", math_format(10^.x)))


Fixed ratio between x and y axes

It is possible to set the scaling of the axes to an equal ratio, with one visual unit representing the same numeric unit on both axes. It is also possible to set them to ratios other than 1:1.

    # Data where x ranges from 0-10, y ranges from 0-30
    set.seed(202)
    dat <- data.frame(
        xval = runif(40,0,10),
        yval = runif(40,0,30)
    )
    sp <- ggplot(dat, aes(xval, yval)) + geom_point()

    # Force equal scaling
    sp + coord_fixed()

    # Fixed scaling, with each 1 on the x axis the same length as 3 on the y axis
    sp + coord_fixed(ratio=1/3)

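In coord_fixed(), ratio is the length of one unit on the y axis relative to one unit on the x axis. The arithmetic behind the two plots above can be sketched with a small helper; panel_aspect is a hypothetical function written for this illustration, not part of ggplot2, and it ignores the small padding that ggplot2 adds around the data.

```r
# Hypothetical helper: height-to-width ratio of the plotting panel for
# given axis ranges and a given coord_fixed() ratio
panel_aspect <- function(x_range, y_range, ratio = 1) {
  ratio * diff(y_range) / diff(x_range)
}

# ratio = 1: the panel is three times as tall as it is wide
panel_aspect(c(0, 10), c(0, 30))
#> [1] 3

# ratio = 1/3: 3 units on y take the same length as 1 unit on x,
# so the panel comes out square
panel_aspect(c(0, 10), c(0, 30), ratio = 1/3)
#> [1] 1
```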

Axis labels and text formatting

To set and hide the axis labels:

    bp + theme(axis.title.x = element_blank()) +   # Remove x-axis label
        ylab("Weight (Kg)")                        # Set y-axis label

    # Also possible to set the axis label with the scale
    # Note that vertical space is still reserved for x's label
    bp + scale_x_discrete(name="") +
        scale_y_continuous(name="Weight (Kg)")


To change the fonts, and rotate tick mark labels:

    # Change font options:
    # X-axis label: bold, red, and 20 points
    # X-axis tick marks: rotate 90 degrees CCW, move to the left a bit (using vjust,
    # since the labels are rotated), and 16 points
    bp + theme(axis.title.x = element_text(face="bold", colour="#990000", size=20),
               axis.text.x  = element_text(angle=90, vjust=0.5, size=16))


Tick mark label text formatters

You may want to display your values as percents, or dollars, or in scientific notation. To do this you can use a formatter, which is a function that changes the text:

    # Label formatters
    library(scales)   # Need the scales package
    bp + scale_y_continuous(labels=percent) +
        scale_x_discrete(labels=abbreviate)  # In this particular case, it has no effect


Other useful formatters for continuous scales include comma, percent, dollar, and scientific. For discrete scales, abbreviate will remove vowels and spaces and shorten to four characters. For dates, use date_format.
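Since a formatter is an ordinary function, it can be called directly on a few numbers to preview the text it will produce. The sketch below assumes the scales package is installed and uses the default settings of each formatter; the input values are arbitrary.

```r
library(scales)

# Thousands separators
comma(1234567)
#> [1] "1,234,567"

# Proportions shown as percentages
percent(0.25)
#> [1] "25%"

# Currency
dollar(1500)
#> [1] "$1,500"

# Scientific notation
scientific(123456)
```

Note that percent multiplies by 100, so it is intended for proportions; applied to the weight values in the box plot above, a value of 4 is labelled as 400%.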

Sometimes you may need to create your own formatting function. This one will display numeric minutes in HH:MM:SS format.

    # Self-defined formatting function for times.
    timeHMS_formatter <- function(x) {
        h <- floor(x/60)
        m <- floor(x %% 60)
        s <- round(60*(x %% 1))                   # Round to nearest second
        lab <- sprintf('%02d:%02d:%02d', h, m, s) # Format the strings as HH:MM:SS
        lab <- gsub('^00:', '', lab)              # Remove leading 00: if present
        lab <- gsub('^0', '', lab)                # Remove leading 0 if present
        lab                                       # Return the labels explicitly
    }

    # The argument is spelled out as labels rather than relying on partial matching
    bp + scale_y_continuous(labels=timeHMS_formatter)

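One edge case in the function above is a value just under a whole minute, such as 59.999, where the seconds round up to 60. A possible variant, sketched below, converts everything to whole seconds first and then splits that total into hours, minutes, and seconds. The name minutes_to_hms is hypothetical, and this version always returns the full HH:MM:SS string without stripping leading zeros.

```r
# Hypothetical variant: round to whole seconds first, then split the total
minutes_to_hms <- function(x) {
  total <- as.integer(round(x * 60))   # total seconds, rounded once
  h <- total %/% 3600L                 # whole hours
  m <- (total %% 3600L) %/% 60L        # remaining whole minutes
  s <- total %% 60L                    # remaining seconds, always 0-59
  sprintf("%02d:%02d:%02d", h, m, s)
}

minutes_to_hms(c(0.5, 61.25, 59.999))
#> [1] "00:00:30" "01:01:15" "01:00:00"
```

It can be passed to a scale in the same way, for example scale_y_continuous(labels=minutes_to_hms).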

Hiding gridlines

To hide all gridlines, both vertical and horizontal:

    # Hide all the gridlines
    bp + theme(panel.grid.minor=element_blank(),
               panel.grid.major=element_blank())

    # Hide just the minor gridlines
    bp + theme(panel.grid.minor=element_blank())


It’s also possible to hide just the vertical or horizontal gridlines:

    # Hide all the vertical gridlines
    bp + theme(panel.grid.minor.x=element_blank(),
               panel.grid.major.x=element_blank())

    # Hide all the horizontal gridlines
    bp + theme(panel.grid.minor.y=element_blank(),
               panel.grid.major.y=element_blank())
