Setting and Keeping Contrasts

If you just want a reminder of how to set them, skip ahead to "Setting your Contrasts."

What are contrasts?

The "contrasts" set in your R environment determine how categorical variables are handled in your models. The most common scheme in regression is called "treatment contrasts": with treatment contrasts, the first level of the categorical variable is assigned the value 0, and the coefficients for the other levels measure the change from the first level.

Why do we need this? Because with k categories, we really need only k-1 pieces of information to represent any of the values. If the values are "Red," "White," and "Blue," for example, we might have a column named "Red," containing 1's for Red items and 0's for non-Red items, and a column named "White," containing 1's for White items and 0's for non-White items. That's all we need: if we see an item with 0's for both Red and White, it must be Blue. So we need one constraint on our contrasts, or, to put it another way, we need k-1 columns to represent a categorical variable with k levels.
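To make the k-1 idea concrete, here is a small sketch using a made-up colour factor (the data frame `d` and its contents are assumptions for illustration, not part of the solder example). Because R sorts the levels alphabetically, Blue ends up as the baseline, and the design matrix gets indicator columns for Red and White only.

```r
# Hypothetical toy data: one categorical variable with three levels.
d <- data.frame(colour = factor(c("Red", "White", "Blue", "Red")))

# Ask for treatment contrasts explicitly so the result does not depend
# on whatever options(contrasts = ...) happens to be set to.
X <- model.matrix(~ colour, data = d,
                  contrasts.arg = list(colour = "contr.treatment"))

print(levels(d$colour))  # levels are sorted alphabetically
print(X)                 # intercept plus k - 1 = 2 indicator columns
```

The Blue row has 0's in both indicator columns, which is exactly the "it must be Blue" case.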

As an example, consider the solder data set in the rpart package. First, look at the levels associated with each of the categorical variables.

> library (rpart)         # make available
> sapply (solder, levels)    # see ?solder for more info

$Opening
[1] "L" "M" "S"

$Solder
[1] "Thick" "Thin" 

$Mask
[1] "A1.5" "A3"   "A6"   "B3"   "B6"  

$PadType
 [1] "D4" "D6" "D7" "L4" "L6" "L7" "L8" "L9" "W4" "W9"

$Panel
[1] "1" "2" "3"

$skips
NULL
 
There are three levels of Opening, two of Solder, five of Mask, 10 of PadType, and three of Panel. skips, being numeric, has no levels. Notice that within each categorical the levels are sorted alphabetically. When you have treatment contrasts set (which is the default), a simple linear model, just using Opening and Solder, produces output like this. (Since the response variable is an integer count, perhaps this model isn't 100% appropriate, but hey, this is just an example, right?)
> lm (skips ~ Opening + Solder, data = solder)
Call:
lm(formula = skips ~ Opening + Solder, data = solder)

Coefficients:
(Intercept)     OpeningM     OpeningS   SolderThin  
     -1.092        2.037        9.953        5.251  

The interpretation is clear here. Opening L is the baseline; you can think of it as having a coefficient of 0. Notice that Opening M has a predicted average number of "skips" that is 2.04 greater than Opening L's, and Opening S has a predicted number that is 9.95 greater than the one for Opening L. Both coefficients compare levels to the baseline, and the baseline is the first level on the list of levels.

Similarly Solder Thick is set to be the baseline, and the difference -- the contrast -- between Solder Thin and Thick is estimated as 5.25. The coefficient labels consist of the column name and the level name, pasted together; the baseline level isn't listed at all.

The intercept here is -1.092. This is the estimated average skips when Opening = L and Solder = Thick (that is, when each variable is at its baseline). Now we have enough information to get the estimated average for every combination of Opening and Solder.
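As a sketch of that calculation, the block below simply types in the coefficients printed above (rounded as shown) and adds them up for each of the six combinations. The variable names are my own, and the baseline levels are given an effect of 0.

```r
# Coefficients copied from the printed lm() output above.
intercept      <- -1.092
opening_effect <- c(L = 0, M = 2.037, S = 9.953)  # L is the baseline
solder_effect  <- c(Thick = 0, Thin = 5.251)      # Thick is the baseline

# Every combination of Opening and Solder.
grid <- expand.grid(Opening = names(opening_effect),
                    Solder  = names(solder_effect),
                    stringsAsFactors = FALSE)

# Estimated average = intercept + Opening effect + Solder effect.
grid$estimate <- intercept +
  unname(opening_effect[grid$Opening]) +
  unname(solder_effect[grid$Solder])

print(grid)
```

With a fitted model object in hand, predict() on the same grid would do the adding for you; the version above just makes the arithmetic visible.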

Contr.sum example

Another commonly used contrast set-up is sum contrasts, often used in ANOVA. In this set-up the coefficients for each categorical are constrained to add up to 0. Here's an example with the solder data.
> options(contrasts = rep("contr.sum", 2)) # Set contrasts -- see below
> contr.sum (3)   # This shows what the Opening contrasts measure, by column
  [,1] [,2]
1    1    0
2    0    1
3   -1   -1
> contr.sum (2)   # This is what the Solder contrast measures.
  [,1] 
1    1
2   -1

> lm (skips ~ Opening + Solder, data = solder)

Call:
lm(formula = skips ~ Opening + Solder, data = solder)

Coefficients:
(Intercept)     Opening1     Opening2      Solder1  
      5.530       -3.997       -1.960       -2.626  

This looks different, but it really isn't: it's just a question of scoring. All the residuals from this model are identical to those from the other. So are the predictions. Each column of the contrast matrix (above) shows how one coefficient enters the prediction for each level. The first column is (1, 0, -1), so the Opening1 coefficient, -3.997, is added for Opening L and subtracted for Opening S. The second column is (0, 1, -1), so Opening2, -1.960, is added for Opening M and subtracted for Opening S. An observation with Opening L therefore gets -3.997, one with M gets -1.960, and one with S gets 3.997 + 1.960 = 5.957; these three effects add up to 0, as promised. The treatment-contrast coefficients can be recovered as differences: M minus L is -1.960 - (-3.997) = 2.037, and S minus L is 5.957 - (-3.997) = 9.954, which matches 9.953 up to rounding. Unlike the case of treatment contrasts, the level without a coefficient of its own is the last one, not the first.

With Solder, observations with Thick get -2.626 and those with Thin get +2.626. (Again, the signs come from the column vectors of -1's, 0's, and 1's returned by the contr.sum() function.) Thus the estimated average skips when Opening = L and Solder = Thick is 5.530 + (-3.997) + (-2.626) = -1.093, which matches the earlier -1.092 up to rounding. Notice that the labels for the Opening and Solder effects have changed. The coefficients no longer measure differences from a baseline level; they measure how far each level sits from the intercept, which is now the average of the predictions over all combinations of levels.
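If you'd like to check the "same model, different scoring" claim for yourself, here is a sketch. It uses the warpbreaks data that ships with R instead of solder (my substitution, so that nothing beyond base R is needed), and it sets the contrasts through lm()'s contrasts argument so the global options are left alone.

```r
# warpbreaks (datasets package) stands in for solder here: tension has
# levels L, M, H and wool has levels A, B.
fit_trt <- lm(breaks ~ tension + wool, data = warpbreaks,
              contrasts = list(tension = "contr.treatment",
                               wool    = "contr.treatment"))
fit_sum <- lm(breaks ~ tension + wool, data = warpbreaks,
              contrasts = list(tension = "contr.sum",
                               wool    = "contr.sum"))

# The coefficients differ, but the fitted values do not.
print(coef(fit_trt))
print(coef(fit_sum))
same_fit <- isTRUE(all.equal(fitted(fit_trt), fitted(fit_sum)))
print(same_fit)

# Under sum contrasts the last level's effect is minus the sum of the others.
shown <- coef(fit_sum)[c("tension1", "tension2")]
tension_effects <- c(shown, tension3 = -sum(shown))
print(tension_effects)
```

The difference between the second and first entries of tension_effects reproduces the tensionM coefficient from the treatment-contrast fit, which is the same bookkeeping as in the solder example.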

The Default Setting

R provides five built-in contrast functions, and you can write your own. The default for unordered variables is contr.treatment(), which is by far the most frequently used. Other choices include contr.sum(), shown above; contr.SAS(), which is like treatment contrasts only with the baseline being the last level, not the first; contr.poly(), about which more in a moment; and contr.helmert(), which produces contrasts that are orthogonal but harder to interpret.
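A quick way to see how these differ is to print the matrices for a three-level variable. This is just a sketch for comparison; the variable names are mine.

```r
trt  <- contr.treatment(3)  # baseline is the first level (top row all 0)
sas  <- contr.SAS(3)        # baseline is the last level (bottom row all 0)
helm <- contr.helmert(3)    # columns are orthogonal to each other

print(trt)
print(sas)
print(helm)

# Orthogonality check: the off-diagonal entries of t(helm) %*% helm are 0.
print(crossprod(helm))
```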

Handling Ordered Variables

Polynomial contrasts are particularly useful for handling ordered variables, that is, variables whose levels are naturally ordered. However, I don't use them much myself. It's permissible to use treatment contrasts for ordered variables, though in some sense you're giving up some information (the information on the ordering). Still, here's what polynomial contrasts look like for a three-level variable, as if we were treating Opening as ordered:
> contr.poly (3)
                .L         .Q
[1,] -7.071068e-01  0.4082483
[2,] -7.850462e-17 -0.8164966
[3,]  7.071068e-01  0.4082483

As you can see, the first column computes (1/sqrt(2)) * (3rd level) - (1/sqrt(2)) * (1st level); the middle entry, -7.85e-17, is just floating-point noise for 0. This contrast measures the linear trend across the levels; the 1/sqrt(2) part just makes sure that the vector of contrasts has length 1. If the levels weren't ordered this wouldn't seem like a great move. The second column measures how quadratic the effects are, by comparing the middle one to the average of #1 and #3. Again, the specific values are chosen so that the middle is -2 * the others and the whole vector has length 1. Polynomial contrasts are orthogonal, which is good, but to me they're harder to interpret than treatment contrasts.
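The length-1 and orthogonality properties are easy to confirm numerically. The sketch below rebuilds the two columns by hand (the names linear and quadratic are mine) and compares them with what contr.poly() returns.

```r
P <- contr.poly(3)

# Hand-built versions of the two columns, scaled to length 1.
linear    <- c(-1,  0, 1) / sqrt(2)
quadratic <- c( 1, -2, 1) / sqrt(6)

# all.equal() compares up to a small numerical tolerance.
matches <- isTRUE(all.equal(unname(P[, 1]), linear)) &&
           isTRUE(all.equal(unname(P[, 2]), quadratic))
print(matches)

# Orthonormal columns: t(P) %*% P is the 2 x 2 identity matrix.
orthonormal <- isTRUE(all.equal(crossprod(P), diag(2),
                                check.attributes = FALSE))
print(orthonormal)
```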

Setting your Contrasts

Contrasts are set with the contrasts= argument to the options() command. Here's an example of a call to options() to set contrasts; this is the setting I recommend.
options(contrasts = rep ("contr.treatment", 2))
Notice that we need to enter two contrast settings. The first handles unordered categorical variables, the second, ordered. The default settings are "contr.treatment" for unordered variables, and "contr.poly" for ordered ones. Notice also that we pass the contrast settings to options() as the names of functions (that is, with quotes), not as the functions themselves.
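If you want to check what is currently set, or undo a change later, something like the following sketch works. It relies on the fact that options() returns the previous values of whatever it changes; the variable names are mine.

```r
# Set both entries (unordered first, ordered second) and keep the old values.
old <- options(contrasts = c("contr.treatment", "contr.treatment"))

current <- getOption("contrasts")
print(current)

# Put things back the way they were.
options(old)
restored <- getOption("contrasts")
print(restored)
```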

Setting your Contrasts Permanently

The default settings are restored every time you start a new R session. However there is a mechanism by which your preferences can be re-set each time you start up. If you have a function named .First (exactly that name, with the dot and the capital "F"), R will run that function whenever you start up. So create a function named .First with the options() command above in it, for example in your .Rprofile file, which R reads at startup.
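Here is a minimal sketch of such a function, assuming you keep it in a startup file such as ~/.Rprofile. The contrast choice is the one recommended above; swap in whatever you prefer.

```r
# Put this in your .Rprofile so that it is defined at startup.
.First <- function() {
  # Unordered factors first, ordered factors second.
  options(contrasts = c("contr.treatment", "contr.treatment"))
}
```

Putting the options() call directly in .Rprofile, without wrapping it in .First, should have the same effect.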

Can I Mix Contrasts, Or Require a Variable To Have A Particular Contrast Applied?

Yes, you can. If a factor (for example, a data frame column) has a "contrasts" attribute, then that attribute will take precedence over the setting in options. So in the solder example, we might do this:
> options(contrasts = rep ("contr.treatment", 2))      # Set default contrasts
> solder <- solder                                     # Make local copy of data
> # Attach contrasts to "Opening" by creating an attribute; the 3 tells R there are 3 levels
> attr (solder$Opening, "contrasts") <- contr.poly (3)
> lm (skips ~ Opening + Solder, data = solder)

Call:
lm(formula = skips ~ Opening + Solder, data = solder)

Coefficients:
(Intercept)    Opening.L    Opening.Q   SolderThin  
      2.904        7.038        2.400        5.251  

Here the Opening variable is using polynomial contrasts but the Solder variable is using treatment contrasts.
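The same mixing can be done with the contrasts() replacement function, which sets the attribute for you and labels the rows with the level names. The sketch below uses the warpbreaks data that ships with R in place of solder (my substitution), and pins wool to treatment contrasts through lm()'s contrasts argument so the result doesn't depend on the global setting.

```r
wb <- warpbreaks                         # local copy of a built-in data set

# Attach polynomial contrasts to tension only (it has 3 levels).
contrasts(wb$tension) <- contr.poly(3)

# wool gets treatment contrasts via the contrasts argument; tension keeps
# the contrasts attached above.
fit <- lm(breaks ~ tension + wool, data = wb,
          contrasts = list(wool = "contr.treatment"))

print(coef(fit))
coef_names <- names(coef(fit))
```

The coefficient labels tell you which scheme each variable used: .L and .Q for the polynomial terms, and the level name for the treatment term.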

How Do I Create My Own Contrasts?

Any function that creates a matrix of the proper size can be used to create contrasts. I must say, though, that this question hasn't come up much.
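For what it's worth, here is a sketch of what a home-made contrast function could look like. The function name contr_baseline and the toy data are hypothetical; the function returns a k by k-1 indicator matrix with a baseline level of your choosing (which contr.treatment() can already do through its base argument, so this is purely an illustration).

```r
# Hypothetical helper: indicator columns for every level except `base`.
contr_baseline <- function(n, base = 1) {
  diag(n)[, -base, drop = FALSE]
}

# Made-up data: three groups, two observations each.
d <- data.frame(g = factor(c("a", "b", "c", "a", "b", "c")),
                y = c(1, 2, 4, 1.5, 2.5, 4.5))

# Build the matrix with level 2 ("b") as baseline and label its columns.
m <- contr_baseline(nlevels(d$g), base = 2)
colnames(m) <- levels(d$g)[-2]
contrasts(d$g) <- m

fit <- lm(y ~ g, data = d)
print(coef(fit))  # intercept is the mean of group "b"
```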

How Can I Change The Baseline Level?

For treatment contrasts, the baseline level is whichever level appears first in the output from the levels() command. By default this will be whichever level's name appears first in alphabetical order. To change that, reorder the levels (see handling factor variables). For sum (ANOVA) contrasts, each coefficient measures how far a level sits from the average over all levels, and the last level is the one that gets no coefficient of its own. Again this order can be changed. For Helmert and polynomial contrasts there really isn't a baseline in that sense.
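One way to reorder the levels is relevel(), which moves a chosen level to the front. A small sketch with a made-up factor (the variable names are mine):

```r
opening <- factor(c("L", "M", "S", "S", "M", "L"))
print(levels(opening))      # "L" "M" "S"

# Make "S" the first level, and so the treatment-contrast baseline.
opening_s <- relevel(opening, ref = "S")
print(levels(opening_s))    # "S" "L" "M"

# The design matrix now has columns for L and M, measured against S.
X <- model.matrix(~ opening_s,
                  contrasts.arg = list(opening_s = "contr.treatment"))
print(colnames(X))
```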

And There I Have It?

And there you have it.
