Setting and Keeping Contrasts

Don't explain, just remind me how to set them: see "Setting your Contrasts" below.

What are contrasts?

The "contrasts" set in your R environment determine how categorical variables are handled in your models. The most common scheme in regression is called "treatment contrasts": with treatment contrasts, the first level of the categorical variable is assigned the value 0, and then other levels measure the change from the first level.

Why do we need this? Because with k categories, we really need only k-1 pieces of information to represent any of the values. If the values are "Red," "White," and "Blue," for example, we might have a column named "Red," containing 1's for Red items and 0's for non-Red items, and a column named "White," containing 1's for White items and 0's for non-White items. That's all we need -- if we see an item with 0's for both Red and White, it must be Blue. So we need one constraint on our contrasts, or, to put it another way, we need k-1 columns to represent a categorical variable with k levels.
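Here is a small sketch of the k-1 idea, using a made-up three-item factor. The variable name color and the level order are my assumptions, chosen so that Blue is the level left without a column.

```r
# Hypothetical data: one item of each color. Listing "Blue" first makes it
# the level that gets no column of its own under treatment contrasts.
color <- factor(c("Red", "White", "Blue"), levels = c("Blue", "Red", "White"))

# Build the design matrix R would use for this factor.
X <- model.matrix(~ color, contrasts.arg = list(color = "contr.treatment"))
print(X)

# Two indicator columns (plus the intercept) cover all three levels:
# the Blue row has 0 in both colorRed and colorWhite.
```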

As an example, consider the "Gun" data set in the MEMSS library. First, look at the levels associated with each of the categorical variables.

> library (MEMSS)        # provides the Gun data
> sapply (Gun, levels)   # see ?Gun for more info

$rounds
NULL

$Method
[1] "M1" "M2"

$Team
[1] "T1A" "T1H" "T1S" "T2A" "T2H" "T2S" "T3A" "T3H" "T3S"

$Physique
[1] "Slight"  "Average" "Heavy"  
 
There are two levels of Method, three of Physique and nine of Team. rounds, being continuous, has no levels. When you have treatment contrasts set, a simple linear model, ignoring Team, looks like this. (Since the response variable is an integer, perhaps this model isn't 100% appropriate, but hey -- this is just an example, right?) We are using treatment contrasts here, so I am going to set them explicitly. More on this in a minute.
> options (contrasts = rep("contr.treatment", 2)) # Set contrasts -- see below
> lm (rounds ~ Physique + Method, data = Gun)
Call:
lm(formula = rounds ~ Physique + Method, data = Gun)

Coefficients:
    (Intercept)  PhysiqueAverage    PhysiqueHeavy         MethodM2  
        24.3806          -0.7417          -1.6333          -8.5111  

The interpretation is clear here. Physique Slight is the baseline; you can think of it as having a coefficient of 0. Physique Average has a predicted average number of "rounds" that is .74 less than Physique Slight, and Physique Heavy has a predicted number that is 1.63 lower than Physique Slight. Both coefficients compare levels to the baseline, and the baseline is the first level on the list of levels.

Similarly Method M1 is set to be the baseline, and the difference -- the contrast -- between Method M2 and M1 is estimated as -8.51. The coefficient labels consist of the column name and the level name, pasted together; the baseline level isn't listed at all.

The intercept here is 24.38. This is the estimated average rounds when Physique = Slight and Method = M1 (that is, when each variable is at its baseline). Now we have enough information to get the estimated average for every combination of Physique and Method.
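As a sketch of that arithmetic, the snippet below rebuilds all six estimated averages from the printed coefficients. The numbers are copied from the output above; the object names are mine.

```r
# Coefficients copied from the treatment-contrast fit printed above.
intercept <- 24.3806
physique  <- c(Slight = 0, Average = -0.7417, Heavy = -1.6333)  # baseline is 0
method    <- c(M1 = 0, M2 = -8.5111)                            # baseline is 0

# Every Physique-by-Method combination: intercept + Physique term + Method term.
cell_means <- intercept + outer(physique, method, "+")
print(round(cell_means, 2))
```

The Slight/M1 cell is just the intercept, since both of its terms are 0.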

Contr.sum example

Another commonly used contrast set-up is sum contrasts, often used in Anova. In this set-up the coefficients for each categorical variable are constrained to add up to 0. Here's an example with the Gun data.
> options(contrasts = rep("contr.sum", 2)) # Set contrasts -- see below
> contr.sum (3)   # This shows what the Physique contrasts measure, by column
  [,1] [,2] 
1    1    0
2    0    1
3   -1   -1
> contr.sum (2)   # This is what the Method contrast measures.
  [,1] 
1    1
2   -1

> lm (rounds ~ Physique + Method, data = Gun)
Call:
lm(formula = rounds ~ Physique + Method, data = Gun)

Coefficients:
(Intercept)    Physique1    Physique2      Method1  
    19.3333       0.7917       0.0500       4.2556  
This looks different, but it really isn't -- it's just a question of scoring. All the residuals from this model are identical to those from the other. So are the predictions.

With sum contrasts each coefficient is an amount added to the intercept for one level, and the intercept is the average of the six Physique-by-Method predictions rather than the prediction for a baseline cell. The first column of the Physique contrast matrix (above) is (1, 0, -1) and its coefficient is estimated as .791667, so an observation with Physique Slight gets +.791667 and one with Heavy gets -.791667. The second column is (0, 1, -1) and its coefficient is .05, so an observation with Physique Average gets +.05 and one with Heavy gets another -.05. Heavy, the last level, therefore gets -.791667 - .05 = -.841667 in all, which is what makes the three Physique effects add up to 0. Unlike treatment contrasts, no level is pinned at 0; the level without a coefficient of its own is the last one, not the first.

With Method, observations with M1 get +4.2556 and those with M2 get -4.2556. (Again, the signs come from the column vectors of -1's, 0's, and 1's returned by the contr.sum() function.) Thus the estimated average rounds when Physique = Slight and Method = M1 is 19.33 + 0.7917 + 4.256 = 24.38 as before. Likewise the difference between Slight and Heavy is .791667 - (-.841667) = 1.6333, which matches the PhysiqueHeavy coefficient of -1.6333 from the treatment-contrast fit.

Notice that the labels for the Physique effects have changed. They are numbered by column of the contrast matrix (Physique1, Physique2) rather than named after a level, because contr.sum() returns a matrix without column names.
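To see what each level receives under sum contrasts, multiply each contrast matrix by its coefficients. This is only a sketch: the coefficients are copied from the printed fit, and the object names are mine.

```r
# Contrast matrices for a 3-level and a 2-level factor.
C_phys <- contr.sum(3)
C_meth <- contr.sum(2)

# Coefficients copied from the sum-contrast fit printed above.
intercept <- 19.3333
b_phys    <- c(0.7917, 0.0500)
b_meth    <- 4.2556

# Amount added to the intercept for each level, in levels() order.
phys_effect <- drop(C_phys %*% b_phys)
meth_effect <- drop(C_meth %*% b_meth)
names(phys_effect) <- c("Slight", "Average", "Heavy")
names(meth_effect) <- c("M1", "M2")

print(phys_effect)
print(sum(phys_effect))   # effects within a factor sum to 0 (up to rounding)
print(intercept + phys_effect[["Slight"]] + meth_effect[["M1"]])
```

The last line reproduces the Slight/M1 estimate from the treatment-contrast fit, up to the rounding in the printed coefficients.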

The Default Setting

R provides five built-in contrast functions, and you can write your own. The default for unordered variables is contr.treatment(), which is by far the most frequently used. Other choices include contr.sum(), shown above; contr.SAS(), which is like treatment contrasts only with the baseline being the last level, not the first; contr.poly(), about which more in a second; and contr.helmert(), which produces contrasts that are orthogonal but harder to interpret.

Handling Ordered Variables

Polynomial contrasts are particularly useful for handling ordered variables, that is, variables whose levels are naturally ordered (like Gun$Physique). However I don't use them much myself. It's permissible to use treatment contrasts for ordered variables, though in some sense you're giving up some information (the information on the ordering). Still, here's what they look like in our case:
> contr.poly (3)
                .L         .Q
[1,] -7.071068e-01  0.4082483
[2,] -7.850462e-17 -0.8164966
[3,]  7.071068e-01  0.4082483
As you can see, the first column computes (1/sqrt(2)) * (3rd level) - (1/sqrt(2)) * (1st level). This contrast measures the rate of change of the set of coefficients; the 1/sqrt(2) part just makes sure that the vector of contrasts has length 1. If the levels weren't ordered this wouldn't seem like a great move. The second column measures how quadratic the effects are, by comparing the middle one to the average of #1 and #3. Again, the specific values are chosen so that the middle is -2 * the others and the whole vector has length 1. Polynomial contrasts are orthogonal, which is good, but to me they're harder to interpret than treatment contrasts.
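A quick check of the "length 1" and "orthogonal" claims, as a sketch (the object names P and G are mine):

```r
P <- contr.poly(3)

# t(P) %*% P: the diagonal holds each column's squared length,
# the off-diagonal holds the dot product between the two columns.
G <- crossprod(P)
print(round(G, 10))

# Each column also sums to 0, up to floating-point error.
print(colSums(P))
```

If the columns are orthonormal, G comes out as the 2 x 2 identity matrix.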

Setting your Contrasts

Contrasts are set with the contrasts= argument to the options() command. From the GUI, you can set them for a particular analysis in the "Options" tab of the Anova dialog, but it's not clear how to enforce that in an lm model, nor how to make them persistent. Here's an example of a call to options() to set contrasts; this is the setting I recommend.
options(contrasts = rep ("contr.treatment", 2))
Notice that we need to enter two contrast settings. The first handles unordered categorical variables, the second, ordered. The default settings are "contr.treatment" for unordered variables, and "contr.poly" for ordered ones. Notice also that we pass the contrast settings to options() as the names of functions (that is, with quotes), not as the functions themselves.
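If you only want the setting for part of a session, one approach is to save the old value and put it back afterwards. This is a sketch; the names old and current are mine.

```r
# options() returns the previous values, so they can be put back later.
old <- options(contrasts = rep("contr.treatment", 2))

current <- getOption("contrasts")   # first entry: unordered, second: ordered
print(current)

# ... fit models here ...

options(old)                        # restore whatever was set before
```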

Setting your Contrasts Permanently

The default settings come back every time you start a new R session. However there is a mechanism by which your preferences can be re-set each time you start up. If you have a function named .First (exactly that name, with the dot and the capital "F"), R will run that function whenever you start up. So create a function named .First with the options() command above in it.
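A minimal sketch of such a function follows. A common place to keep the definition is a .Rprofile file in your home or working directory, but where you put it is up to you.

```r
# R calls .First() with no arguments at the start of a session, if it exists.
.First <- function() {
  options(contrasts = rep("contr.treatment", 2))
}
```

You can call .First() by hand once to confirm that it does what you expect.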

Can I Mix Contrasts, Or Require a Variable To Have A Particular Contrast Applied?

Yes, you can. If a vector (or data frame column) has a "contrasts" attribute, then that attribute will take precedence over the setting in options. So in the Gun example, we might do this:
> options(contrasts = rep ("contr.treatment", 2))      # Set default contrasts
> Gun <- Gun                                           # Make local copy of data
> # Attach contrasts to "Physique" by creating an attribute; the 3 is the number of levels
> attr (Gun$Physique, "contrasts") <- contr.poly (3)
> lm (rounds ~ Physique + Method, data = Gun)
Call:
lm(formula = rounds ~ Physique + Method, data = Gun)

Coefficients:
(Intercept)   Physique.L   Physique.Q     MethodM2  
   23.58889     -1.15494     -0.06124     -8.51111  
Here the "Physique" variable is using polynomial contrasts but the "Method" variable is using treatment contrasts.
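lm() also accepts a contrasts= argument, which sets contrasts for one fit without touching the data or the global option. The sketch below uses warpbreaks, a data set that ships with R, swapped in here so the example runs without MEMSS.

```r
# warpbreaks: breaks (numeric), wool (2 levels), tension (3 levels: L, M, H).
fit <- lm(breaks ~ wool + tension, data = warpbreaks,
          contrasts = list(wool = "contr.treatment", tension = "contr.poly"))

# wool gets a treatment-style label, tension gets .L and .Q labels.
print(coef(fit))
```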

How Do I Create My Own Contrasts?

Any function that creates a matrix of the proper size can be used to create contrasts. I must say, though, that this question hasn't come up much.
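You can also skip the function and attach a hand-made matrix directly with contrasts()<-. The factor f and the two comparisons below are hypothetical, chosen only to show the shape: one row per level, k-1 columns.

```r
# Hypothetical 3-level factor.
f <- factor(c("a", "b", "c", "a", "b", "c"))

# Hand-made contrasts: one column per comparison, one row per level.
my_contr <- cbind(b_vs_a  = c(-1,  1, 0),
                  c_vs_ab = c(-1, -1, 2))

contrasts(f) <- my_contr   # attach to the factor
print(contrasts(f))
```

The column names of the matrix become the suffixes on the coefficient labels in a fitted model.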

How Can I Change The Baseline Level?

For treatment contrasts, the baseline level is whichever level appears first in the output from the levels() command. By default this will be whichever level's name appears first in alphabetical order. To change that, see handling factor variables. For sum (anova) contrasts, the last level is the one that gets no coefficient of its own; its effect is minus the sum of the others. Again this order can be changed. For Helmert and polynomial contrasts there really isn't a baseline in that sense.
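One way to move a level to the front is relevel(), which works on unordered factors. This is a sketch with a made-up factor, not the Gun data.

```r
physique <- factor(c("Slight", "Average", "Heavy", "Slight"))
print(levels(physique))                          # alphabetical by default

physique2 <- relevel(physique, ref = "Slight")   # move "Slight" to the front
print(levels(physique2))
```

The other levels keep their existing order behind the new first level.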

And There I Have It?

And there you have it.
