> seq(1, 10, by = 1)
[1]  1  2  3  4  5  6  7  8  9 10
or, since these are all integers, you can just use the (somewhat less powerful) colon operator.
> x <- 1:10
If you operate on that vector with, say, the sin() function, you get a vector of 10 sines:
> sin(x)
[1]  0.8414710  0.9092974  0.1411200 -0.7568025 -0.9589243
[6] -0.2794155  0.6569866  0.9893582  0.4121185 -0.5440211
(Of course, the [1] and [6] indicate that the leftmost values are, respectively, the first and sixth in the set of ten.) Here's an example of a logical vector: the question "is sin(x) bigger than 0?"
> sin(x) > 0
[1]  TRUE  TRUE  TRUE FALSE FALSE FALSE  TRUE  TRUE  TRUE FALSE
The answer, of course, depends on the value of x. The first three values of x are smaller than pi, so their sines are positive; the next three are between pi and 2 pi, so their sines are negative, and so on. Again the logical operator ">" has operated on a vector and returned a vector.
What if I want to count the number of x's for which sin(x) is > 0?
> sum(sin(x) > 0)
[1] 6
The sum() function converts the logical T's and F's to 1's and 0's, respectively; then adding those up amounts to counting the T's.
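The same conversion of TRUE and FALSE to 1 and 0 works with other summary functions. The following is a minimal sketch, assuming x is still 1:10; the variable names (positive, n_pos, and so on) are made up for illustration.

```r
x <- 1:10
positive <- sin(x) > 0        # logical vector of length 10

n_pos    <- sum(positive)     # number of TRUEs
frac_pos <- mean(positive)    # proportion of TRUEs (mean of 1's and 0's)
any_pos  <- any(positive)     # is at least one element TRUE?
all_pos  <- all(positive)     # are all elements TRUE?

print(c(n_pos, frac_pos))
print(c(any_pos, all_pos))
```

Here mean() gives the fraction of x's with a positive sine, because averaging 1's and 0's is the same as dividing the count of TRUEs by the length.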
Subscripting
R has several ways to subscript (that is, extract specific elements from a vector). The most common way is directly, using the square bracket operator:
> x[4]
[1] 4
In this example, the user has said "give me the fourth element," and R has said, "you get a vector whose first (and only) element is 4."
Here's a similar question: "what are the second and fifth elements of x?"
> x[c(2, 5)]
[1] 2 5
Here the c(), of course, constructs the vector (2,5) to be used as the index; then we extract the second and fifth entries of x.
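Any vector of positive integers can serve as the index, not just one built with c(). A short sketch, assuming x is 1:10 as above; the result names are illustrative only.

```r
x <- 1:10

first_three <- x[1:3]          # a range built with the colon operator
reordered   <- x[c(5, 2)]      # results come back in the order of the index
repeated    <- x[c(1, 1, 3)]   # an index may repeat a position

print(first_three)
print(reordered)
print(repeated)
```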
Logical subscripting
We can use a logical vector, of the same length as the data, as an index, and R will pull out the elements of the data vector for which the corresponding indices are TRUE. For example, consider a new x vector consisting of 2, 4, 6, 8, and 10. Let's use a logical subscript to extract the second and fifth entries.
> x <- c(2, 4, 6, 8, 10)
> x[c(F, T, F, F, T)]
[1]  4 10
This says "give me the second and fifth elements of x, not the first, third or fourth." Which of these x's have sines that are positive?
> sin(x) > 0
[1]  TRUE FALSE FALSE  TRUE FALSE
What are the x values whose sines are positive?
> x[sin(x) > 0]
[1] 2 8
Here the sin(x) > 0 on the inside of the square brackets produces a logical vector of length 5 with two T's; then those two T's, in the first and fourth position, give us the first and fourth elements of x.
Here's a similar question: "what are the indices of the elements of x for which the sine is > 0?"
> (1:length(x))[sin(x) > 0]
[1] 1 4
In this example, the (1:length(x)) produces the vector (1,2,3,4,5); then from that vector we extract the first and fourth items (since sin(x) > 0 for the first and fourth elements of x).
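Base R also provides which(), which returns the positions of the TRUE values directly. The sketch below compares it with the construction above, assuming x is c(2, 4, 6, 8, 10); the names idx_manual and idx_which are illustrative.

```r
x <- c(2, 4, 6, 8, 10)

# The construction from the text: index 1..length(x) by a logical vector
idx_manual <- (1:length(x))[sin(x) > 0]

# which() returns the positions of the TRUE elements
idx_which <- which(sin(x) > 0)

print(idx_manual)
print(idx_which)
```

Both approaches give the same answer here. One reason to prefer which() is that 1:length(x) produces the vector (1, 0) when x is empty, which is rarely what is wanted.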
Negative subscripting
Another handy thing is a negative subscript; this says "give me everything except these values." For example:
> x[-c(2, 5)] # Give me everything except elements 2 and 5
[1] 2 6 8 # We could also have used x[c(-2, -5)]
You can't mix positive and negative subscripts. A "0" subscript returns an empty (zero-length) vector, which is helpful in one specific example.
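These rules can be checked directly. A small sketch, assuming x is c(2, 4, 6, 8, 10); the tryCatch() wrapper is only there so that the error from mixing signs is caught and the script can continue.

```r
x <- c(2, 4, 6, 8, 10)

dropped <- x[-c(2, 5)]     # everything except elements 2 and 5
same    <- x[c(-2, -5)]    # equivalent way to write the index
empty   <- x[0]            # a zero subscript gives a zero-length vector

# Mixing positive and negative subscripts is an error; catch it
mixed <- tryCatch(x[c(-1, 2)], error = function(e) "error")

print(dropped)
print(length(empty))
print(mixed)
```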
More on vector operations
Just as the sin() function operates on each element in a vector, so do many other R functions. In fact, functions that operate on a vector and produce a scalar are fairly unusual. Important examples include sum(), which gives the sum of a vector; mean(); median(); var() and sd(), which give the sample variance and standard deviation; prod(), which gives the product; and length(), which tells you how many elements the vector has.
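As a quick illustration of these summary functions, here is a sketch applied to x = c(2, 4, 6, 8, 10); the list name stats is illustrative only.

```r
x <- c(2, 4, 6, 8, 10)

stats <- list(
  total   = sum(x),      # 2 + 4 + 6 + 8 + 10
  average = mean(x),
  middle  = median(x),
  s2      = var(x),      # sample variance (divides by n - 1)
  s       = sd(x),       # square root of the sample variance
  product = prod(x),
  n       = length(x)
)

str(stats)
```

Each call takes the whole vector and returns a single number, in contrast to sin(x), which returns one value per element.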
Most functions operate element-wise. These include the simple arithmetic operators. For example, remember that x is c(2,4,6,8,10):
> x + 2
[1]  4  6  8 10 12
> x^2
[1]   4  16  36  64 100
> 2^x
[1]    4   16   64  256 1024
(Notice the different formatting. This is just R's way of making sure that all the elements in a vector can be represented by character strings of the same length. Don't worry about it.)
Let's create another vector called, say, y:
> y <- c(10, 20, 30, 40, 50) # or y <- 10 * 1:5
> x + y
[1] 12 24 36 48 60 # Each x is added to the corresponding y
> x * y
[1]  20  80 180 320 500 # Each x is multiplied by the corresponding y
What if the lengths don't match up? Then R recycles elements from the shorter one. You get a warning message if the length of the shorter one doesn't evenly divide the length of the longer one. For example:
> x[1:4] + y[1:2]
[1] 12 24 16 28 # 2 + 10, 4 + 20, 6 + 10, 8 + 20; no warning
> x + y[1:2]
[1] 12 24 16 28 20 # last one is 10 (x[5]) + 10 (y[1])
Warning message:
In x + y[1:2] :
  longer object length is not a multiple of shorter object length
This vectorization gives R much of its power. You will rarely need to write an explicit loop in R. One major exception is when the i-th element of a vector depends explicitly on the (i-1)th, as in some simulations.
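To make that exception concrete, here is a minimal sketch of a recursion in which each element depends on the previous one. The shocks vector and the coefficient phi are hypothetical values chosen for illustration; in a real simulation the shocks would typically be random draws.

```r
shocks <- c(1, -1, 2, 0, 1)   # hypothetical inputs, fixed so the result is reproducible
phi    <- 0.5                 # hypothetical weight on the previous value

n <- length(shocks)
z <- numeric(n)               # pre-allocate the result vector
z[1] <- shocks[1]

# z[i] needs z[i - 1], so the elements must be computed in order
for (i in 2:n) {
  z[i] <- phi * z[i - 1] + shocks[i]
}

print(z)
```

Pre-allocating z with numeric(n) before the loop avoids growing the vector one element at a time.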