The "contrasts" set in your R environment determine how categorical variables are handled in your models. The most common scheme in regression is called "treatment contrasts": with treatment contrasts, the first level of the categorical variable is assigned the value 0, and then the other levels measure the change from the first level.
Why do we need this? Because with k categories, we really need only k-1 pieces of information to represent any of the values. If the values are "Red," "White," and "Blue," for example, we might have a column named "Red," containing 1's for Red items and 0's for non-Red items, and a column named "White," containing 1's for White items and 0's for non-White items. That's all we need -- if we see an item with 0's for both Red and White, it must be Blue. So we need one constraint on our contrasts, or, to put it another way, we need k-1 columns to represent a categorical variable with k levels.
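As a minimal sketch of the k-1 column idea, the snippet below builds the indicator columns for a made-up Red/White/Blue factor. The data and the variable names `color` and `X` are hypothetical; treatment contrasts are requested explicitly so the result does not depend on the current options() setting.

```r
# Hypothetical toy data, one item per row
color <- factor(c("Red", "White", "Blue", "Red"))
levels(color)   # levels are sorted alphabetically, so "Blue" comes first

# Ask for treatment contrasts explicitly rather than relying on options()
X <- model.matrix(~ color, contrasts.arg = list(color = "contr.treatment"))
print(X)        # an intercept plus k-1 = 2 indicator columns
```

The Blue item is the row with 0's in both the colorRed and colorWhite columns, which is the situation described above.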
As an example, consider the solder data set in the rpart library. First, look at the levels associated with each of the categorical variables.
    > library (rpart)           # make available
    > sapply (solder, levels)   # see ?solder for more info
    $Opening
    [1] "L" "M" "S"

    $Solder
    [1] "Thick" "Thin"

    $Mask
    [1] "A1.5" "A3"   "A6"   "B3"   "B6"

    $PadType
     [1] "D4" "D6" "D7" "L4" "L6" "L7" "L8" "L9" "W4" "W9"

    $Panel
    [1] "1" "2" "3"

    $skips
    NULL

There are three levels of Opening, two of Solder, five of Mask, 10 of PadType, and three of Panel. skips, being continuous, has no levels. Notice that within each categorical variable the levels are sorted alphabetically. When you have treatment contrasts set -- which is the default -- a simple linear model, just using Opening and Solder, produces output like this. (Since the response variable is an integer, perhaps this model isn't 100% appropriate, but hey -- this is just an example, right?)
    > lm (skips ~ Opening + Solder, data = solder)

    Call:
    lm(formula = skips ~ Opening + Solder, data = solder)

    Coefficients:
    (Intercept)     OpeningM     OpeningS   SolderThin
         -1.092        2.037        9.953        5.251
The interpretation is clear here. Opening L is the baseline; you can think of it as having a coefficient of 0. Notice that Opening M has a predicted average number of "skips" that is 2.04 greater than Opening L's, and Opening S has a predicted number that is 9.95 greater than the one for Opening L. Both coefficients compare levels to the baseline, and the baseline is the first level on the list of levels.
Similarly Solder Thick is set to be the baseline, and the difference -- the contrast -- between Solder Thin and Thick is estimated as 5.25. The coefficient labels consist of the column name and the level name, pasted together; the baseline level isn't listed at all.
The intercept here is -1.092. This is the estimated average number of skips when Opening = L and Solder = Thick (that is, when each variable is at its baseline). Now we have enough information to get the estimated average for every combination of Opening and Solder.
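One way to tabulate those estimated averages is sketched below. The coefficients are copied by hand from the printed treatment-contrast fit, so the results are only as precise as that printout; the names `intercept`, `opening_eff`, `solder_eff`, and `cell_means` are made up for this illustration.

```r
# Coefficients copied from the treatment-contrast output above
intercept   <- -1.092
opening_eff <- c(L = 0, M = 2.037, S = 9.953)   # baseline L contributes 0
solder_eff  <- c(Thick = 0, Thin = 5.251)       # baseline Thick contributes 0

# Rows are Opening levels, columns are Solder levels
cell_means <- intercept + outer(opening_eff, solder_eff, "+")
print(cell_means)
```

Each cell is the intercept plus the coefficient for that Opening level plus the coefficient for that Solder level, with the baseline levels adding nothing.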
Treatment contrasts are not the only option. Here is the same model fit with "sum" contrasts, contr.sum.

    > options(contrasts = rep("contr.sum", 2))   # Set contrasts -- see below
    > contr.sum (3)   # This shows what the Opening contrasts measure, by column
      [,1] [,2]
    1    1    0
    2    0    1
    3   -1   -1
    > contr.sum (2)   # This is what the Solder contrast measures.
      [,1]
    1    1
    2   -1
    > lm (skips ~ Opening + Solder, data = solder)

    Call:
    lm(formula = skips ~ Opening + Solder, data = solder)

    Coefficients:
    (Intercept)     Opening1     Opening2      Solder1
          5.530       -3.997       -1.960       -2.626

This looks different, but it really isn't -- it's just a question of scoring. All the residuals from this model are identical to those from the other. So are the predictions. For example, the first column of the contrast matrix (above) is (1, 0, -1), meaning it measures the difference between Opening L and S. That effect is estimated as -4. So if an observation has Opening L, it gets -4, and if it has S, it gets +4. The second column is (0, 1, -1), and the estimated difference between M and S is -1.96. If an observation has Opening M, it gets -1.96, and if it has S, it gets +1.96. (An observation with Opening S picks up both terms, for a total of about +5.96.) Both of these contrasts compare a level to the baseline, but unlike the case with treatment contrasts, the baseline is the last level here. With Solder, observations with Thick get -2.626 and those with Thin get +2.626. (Again, the signs come from the column vectors of -1's, 0's, and 1's returned by the contr.sum() function.)

Thus the estimated average skips when Opening = L and Solder = Thick is 5.530 + (-3.997) + (-2.626) = -1.093, which matches the earlier -1.092 up to rounding in the printed coefficients. Notice that the labels for the Opening effects have changed (Opening1 and Opening2 rather than OpeningM and OpeningS). They don't correspond to differences from 0, but to differences between a pair of levels.
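The bookkeeping for sum contrasts can be checked with a short sketch: multiply each contrast matrix by its printed coefficients to get one effect per level, then add the intercept. The numbers are copied by hand from the output above, and the variable names are hypothetical.

```r
# Contrast matrices, labelled with the factor levels
C_opening <- contr.sum(c("L", "M", "S"))
C_solder  <- contr.sum(c("Thick", "Thin"))

# Coefficients copied from the sum-contrast output above
b0        <- 5.530
b_opening <- c(-3.997, -1.960)
b_solder  <- c(-2.626)

# Per-level effects: each row of the contrast matrix times the coefficients
opening_eff <- drop(C_opening %*% b_opening)
solder_eff  <- drop(C_solder %*% b_solder)
print(opening_eff)
print(solder_eff)

# Estimated averages for every Opening/Solder combination
cell_means <- b0 + outer(opening_eff, solder_eff, "+")
print(cell_means)
```

The per-level effects within each factor sum to zero, which is the constraint that gives contr.sum its name, and the L/Thick cell agrees with the treatment-contrast intercept up to rounding.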
For ordered factors, the default is polynomial contrasts, contr.poly.

    > contr.poly (3)
                    .L         .Q
    [1,] -7.071068e-01  0.4082483
    [2,] -7.850462e-17 -0.8164966
    [3,]  7.071068e-01  0.4082483

As you can see, the first column computes (1/sqrt(2)) * (3rd level) - (1/sqrt(2)) * (1st level); the middle entry, -7.85e-17, is zero up to floating-point error. This contrast measures the rate of change of the set of coefficients; the 1/sqrt(2) part just makes sure that the vector of contrasts has length 1. If the levels weren't ordered, this wouldn't seem like a great move. The second column measures how quadratic the effects are, by comparing the middle one to the average of #1 and #3. Again, the specific values are chosen so that the middle is -2 * the others and the whole vector has length 1. Polynomial contrasts are orthogonal, which is good, but to me they're harder to interpret than treatment contrasts.
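The length-1 and orthogonality claims are easy to check numerically. This is only a sketch for the three-level case; `P` is a name chosen here for illustration.

```r
P <- contr.poly(3)

# Cross-products of the columns: an identity matrix means the columns
# are orthogonal to each other and each has length 1
print(round(crossprod(P), 10))

# Each column also sums to zero
print(round(colSums(P), 10))

# The columns are scaled versions of (-1, 0, 1) and (1, -2, 1)
print(all.equal(P[, ".L"], c(-1, 0, 1) / sqrt(2)))
print(all.equal(P[, ".Q"], c(1, -2, 1) / sqrt(6)))
```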
    options(contrasts = rep ("contr.treatment", 2))

Notice that we need to enter two contrast settings. The first handles unordered categorical variables, the second, ordered. The default settings are "contr.treatment" for unordered variables, and "contr.poly" for ordered ones. Notice also that we pass the contrast settings to options() as the names of functions (that is, with quotes), not as the functions themselves.
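To see the two slots at work, the sketch below sets a different scheme for each one and asks contrasts() what an unordered and an ordered factor would use. The factors `f_unordered` and `f_ordered` are made up for illustration, and the previous setting is restored at the end.

```r
# options() returns the previous settings, so they can be restored afterwards
old <- options(contrasts = c("contr.sum", "contr.poly"))

f_unordered <- factor(c("a", "b", "c"))
f_ordered   <- factor(c("a", "b", "c"), ordered = TRUE)

cu <- contrasts(f_unordered)   # picked up from the first slot (contr.sum)
co <- contrasts(f_ordered)     # picked up from the second slot (contr.poly)
print(cu)
print(co)

options(old)                   # put the previous contrasts back
```

Saving the return value of options() and passing it back is a convenient way to avoid leaving a non-default contrast setting active for later models.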
    > options(contrasts = rep ("contr.treatment", 2))   # Set default contrasts
    > solder <- solder                                  # Make local copy of data
    > # Attach contrasts to "Opening" by creating an attribute; the 3 tells R there are 3 levels
    > attr (solder$Opening, "contrasts") <- contr.poly (3)
    > lm (skips ~ Opening + Solder, data = solder)

    Call:
    lm(formula = skips ~ Opening + Solder, data = solder)

    Coefficients:
    (Intercept)    Opening.L    Opening.Q   SolderThin
          2.904        7.038        2.400        5.251

Here the Opening variable is using polynomial contrasts but the Solder variable is using treatment contrasts.
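Two other standard ways to get per-variable contrasts are the `contrasts<-` replacement function and the `contrasts` argument of lm(). The sketch below uses a small made-up data frame (`toy`, with hypothetical columns `y` and `g`) rather than the solder data, and simply checks that both routes give the same coefficients.

```r
# Hypothetical toy data: a three-level factor and a numeric response
toy <- data.frame(
  y = c(1, 2, 4, 3, 5, 9),
  g = factor(c("L", "M", "S", "L", "M", "S"))
)

# Route 1: attach the contrast matrix to the factor itself
toy_attr <- toy
contrasts(toy_attr$g) <- contr.poly(3)
fit_attr <- lm(y ~ g, data = toy_attr)

# Route 2: leave the data alone and name the scheme in the lm() call
fit_arg <- lm(y ~ g, data = toy, contrasts = list(g = "contr.poly"))

print(coef(fit_attr))
print(coef(fit_arg))
print(all.equal(unname(coef(fit_attr)), unname(coef(fit_arg))))
```

The second route leaves both the data and the global options untouched, which can be handy when only one model needs a non-default scheme.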