6.10 改变因子水平次序
6.10.1 问题
你想要改变因子水平出现的次序。
6.10.2 方案
R 中有两种不同类型的因子变量:有序和无序。比如{小,中,大}和{钢笔,橡皮擦,铅笔}。对于绝大多数分析而言,一个因子变量是有序还是无序并不重要。如果因子是有序的,那么这个因子水平的特定次序是重要的(小 < 中 < 大)。如果因子是无序的,那么因子水平同样会以一定的顺序出现,但这仅仅为了方便而已(钢笔,橡皮擦,铅笔) - 但有时它是重要的,例如它会指导结果如何输出,图形元素如何展示。
一种改变因子次序的方式是对因子使用 factor()
函数并且直接指定它们的次序。下面这个例子中,ordered()
函数可以替换 factor()
函数。
下面是这个例子的数据:
# 创建一个错误次序的因子
sizes <- factor(c("small", "large", "large", "small", "medium"))
sizes
#> [1] small large large small medium
#> Levels: large medium small
因子水平被显式地指定:
sizes <- factor(sizes, levels = c("small", "medium", "large"))
sizes
#> [1] small large large small medium
#> Levels: small medium large
我们同样可以对有序因子这样操作:
sizes <- ordered(c("small", "large", "large", "small", "medium"))
sizes <- ordered(sizes, levels = c("small", "medium", "large"))
sizes
#> [1] small large large small medium
#> Levels: small < medium < large
另一种方式是使用 relevel()
函数在列表中制作一个特定水平(这对有序因子不起作用)。
# 创建错误次序的因子
sizes <- factor(c("small", "large", "large", "small", "medium"))
sizes
#> [1] small large large small medium
#> Levels: large medium small
# 使得 medium 排最前面
sizes <- relevel(sizes, "medium")
sizes
#> [1] small large large small medium
#> Levels: medium large small
# 使得 small 排最前面
sizes <- relevel(sizes, "small")
sizes
#> [1] small large large small medium
#> Levels: small medium large
当因子创建时,我们可以指定合适的顺序。
sizes <- factor(c("small", "large", "large", "small", "medium"),
levels = c("small", "medium", "large"))
sizes
#> [1] small large large small medium
#> Levels: small medium large
反转因子水平次序。
# 创建错误次序的因子
sizes <- factor(c("small", "large", "large", "small", "medium"))
sizes
#> [1] small large large small medium
#> Levels: large medium small
sizes <- factor(sizes, levels = rev(levels(sizes)))
sizes
#> [1] small large large small medium
#> Levels: small medium large