Functions in R

Writing Functions in R

All of the work in R is done by functions, down to the lowest level. For example, there's a function named "+" that does addition. There are more than 2000 functions built into R, many of which never get called by the user directly but serve to help out other functions.
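To see that even an operator is an ordinary function, you can call "+" by name. This is a small sketch using only base R; the object name total is just for illustration.

```r
# Quote the operator's name to call it like any other function.
"+"(2, 3)                 # same as 2 + 3, gives 5

# Because it is a function, it can be passed to other functions.
sapply(1:3, "+", 10)      # adds 10 to each element: 11 12 13

# do.call() calls a function given its name and a list of arguments.
total <- do.call("+", list(2, 3))
total                     # 5
```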

Editing functions in Windows

The default editor in R for Windows is Notepad, and the main command is fix(). When you enter fix() and give the name of an existing function, R shows you the code for that function in a Notepad window, where you can edit it freely. (If the function didn't exist before, you'll get an empty function template.) In particular, you can call other functions from inside yours. The syntax inside a function is exactly the same as the syntax at the command line. If you need to put two statements on a line (I don't really recommend this), separate them with a semicolon. Enclose groups of statements in braces, like this:

for (i in 1:10) {        # do 10 times: note open brace
    res <- my.func (i)   # Call my.func, pass the value of i on this loop
    cat ("i is ", i, " res is", res, "\n") # print to screen
    if (i == 2) {
        do.special.thing ()
        do.another.thing ()
    }                    # end "if"
} # end "for"

Always try to do things with vectors rather than loops where you can. Loops are comparatively slow in R.
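Here is a minimal sketch of the difference, squaring the numbers 1 to 10 both ways. The names x, squares.loop and squares.vec are made up for this example.

```r
x <- 1:10

# Loop version: fill in the result one element at a time.
squares.loop <- numeric(length(x))
for (i in seq_along(x)) {
    squares.loop[i] <- x[i]^2
}

# Vectorized version: one expression works on the whole vector at once.
squares.vec <- x^2

identical(squares.loop, squares.vec)   # TRUE: same answer, less code
```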

You can use a different editor if you like. Specify your editor with options(editor=).

Exiting Notepad

When you leave Notepad, say "Yes" to the question "Do you want to save changes?" (unless you want to discard your changes). Don't "Save As...", just "Save"; R will update your function automatically. Suppose your function is called b, and suppose there's an error in it when you quit the editor. You'll get a message that might look something like this:

> fix (b)
Error in edit(name, file, title, editor) : 
  unexpected '}' occurred on line 4
 use a command like
 x <- edit()
 to recover

If you get this, enter b <- edit (), just as the message suggests. The error messages are not always the most illuminating, but the cause of an error is almost always a misplaced parenthesis, bracket, brace, or comma.

Writing Functions

When you write an R function there are four things you should keep in mind: the arguments, the code, the side effects, and the return value.

Arguments

The arguments (or parameters) are the pieces of information you pass to the function. They can be of different sorts (lists, numeric vectors, data frames, and so on). Specify them between the parentheses at the top of the function:

function (p1, p2 = 5, p3 = "My String")
{ # function body begins here

This function takes three arguments. Default values are specified for the second and third; if values aren't passed, the values 5 and "My String" will be used for p2 and p3, respectively. There's no default for p1; if no value is passed then the function will fail as soon as a reference is made to p1. In your function you can check whether an argument was passed with the missing() function.
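As a sketch of how defaults and missing() work together, here is a complete function with the same argument list. The name show.args and the messages it builds are made up for illustration.

```r
# show.args is a hypothetical function used only for this example.
show.args <- function (p1, p2 = 5, p3 = "My String") {
    if (missing (p1)) {               # was p1 supplied by the caller?
        return ("p1 was not supplied")
    }
    paste ("p1 is", p1, "p2 is", p2, "p3 is", p3)
}

show.args ()                  # p1 is missing, so the check catches it
show.args (1)                 # defaults are used for p2 and p3
show.args (1, p3 = "Other")   # override only p3
```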

Two things to remember about arguments:

(i) p1, p2, and p3 are "local" variables. They live only inside the function and they "hide" any p1, p2 or p3 living outside the function. (You can get at the hidden ones with the get() function, if you really have to, but this will rarely be useful).

(ii) Arguments are passed "by value." That means that no matter what you do to p1, p2, and p3 inside the function, the original items are unaffected. For example:

> a <- 1 # Create an object named "a"
#
# This function takes a value and makes it into a 2.
#
> change <- function (x) { x <- 2 }
> change(a) # run the function, passing it a
> a
[1] 1      # a itself is unchanged.
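Point (i) can be illustrated the same way. In this sketch the argument p1 hides an object of the same name outside the function, and get() reaches the hidden one. The name hide.demo is made up, and looking in parent.env(environment()) is one way (an assumption on my part, not the only way) to say "the place where the function was defined", which is your workspace when you type this at the prompt.

```r
p1 <- "global value"          # an object living outside the function

# hide.demo is a hypothetical function used only for this example.
hide.demo <- function (p1) {
    # Fetch the hidden p1 from the environment where the function was defined.
    outside <- get ("p1", envir = parent.env (environment ()))
    c (inside = p1, outside = outside)
}

res <- hide.demo ("local value")
res     # inside is "local value", outside is "global value"
p1      # the outside p1 is still "global value"
```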

Argument matching

Arguments are "matched" (that is, incoming parameters are assigned to the local variables in your list of arguments) in two ways, first by name and then by position. So consider the function "print.args()":

> print.args <- function (p1, p2, p3) {
cat ("p1 is", p1, ", p2 is", p2, ", p3 is", p3, "\n") }

Technically no braces are required for a one-line function but I always put them in. When you call this function, R matches up the arguments you pass by name first, then it fills the rest by position. For example:

> print.args(p1 = 1, p2 = 2, p3 = 3) # Give them by name
p1 is 1 , p2 is 2 , p3 is 3
> print.args(p3 = 1, p1 = 2, p2 = 3) # Order not important
p1 is 2 , p2 is 3 , p3 is 1
> print.args(3, 1, 2)                # Give them by position
p1 is 3 , p2 is 1 , p3 is 2
> print.args(3, 1, p1 = 2)           # p1 by name, others by position
p1 is 2 , p2 is 3 , p3 is 1
> print.args(1, 2, 3, 4)             # Too many!
Error in print.args(1, 2, 3, 4) : unused argument(s) (4)

Using "..."

The special construct "..." can be used as an argument to signify "anything else." You might, for example, take the parameters passed in this way and hand them off to some function you yourself are calling. This is not something you'll need very often. If you need to figure out what's in the list, assign list(...) somewhere and check that list out.
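A minimal sketch of both uses: capturing the extra arguments with list(...) and handing them on to another function, here mean(). The wrapper name describe is made up for illustration.

```r
# describe is a hypothetical wrapper used only for this example.
describe <- function (x, ...) {
    extras <- list (...)                          # capture whatever else was passed
    cat ("extra arguments:", names (extras), "\n")
    mean (x, ...)                                 # hand them on to mean()
}

# na.rm is not an argument of describe(); it travels through "..." to mean().
describe (c (1, 2, NA, 4), na.rm = TRUE)
```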

Side Effects

By "side effects" we mean anything that the function does besides producing a return value. Your function can create or remove objects, but I don't recommend that. The major acceptable side effect is producing graphics.

Return value

Some functions print stuff out, but most return something. In R a function has exactly one return value. If you need to return more than one thing you'll need to make a list (or possibly a matrix, data frame, or vector, depending). lm(), for example, produces a list.

You return something either by using the return() statement explicitly, or implicitly: R will take the value of the last expression evaluated in the function to be the return value. As a result, the last line of your function should usually not be an assignment statement: the assigned value is still returned, but invisibly, so nothing prints when you call the function at the prompt. Here's an example of a return value:

# some code including the header is above this point....
if (total < 0)                      # there's a problem
    return (-1)                     # return some value
next.thing <- continue.processing() 
remarks <- "Everything worked"     
return (list (Out = next.thing, Comment = remarks))

The output from this function will be a list with two components, named Out and Comment.
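Here is a complete, runnable sketch of the same pattern, with an early explicit return and a list as the final value. The function name summarize.vec and the component names Mean and Range are made up for illustration.

```r
# summarize.vec is a hypothetical function used only for this example.
summarize.vec <- function (x) {
    if (length (x) == 0)
        return (NULL)                           # explicit early return
    list (Mean = mean (x), Range = range (x))   # last expression is the return value
}

out <- summarize.vec (c (2, 4, 9))
out$Mean     # 5
out$Range    # 2 9
```

Pull the components out of the list by name with $, as shown in the last two lines.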

The output of the function gets assigned if you ask:

> sin.of.2 <- sin (2) # number is returned and assigned; nothing prints out

Otherwise it will print to the screen. If you don't want to return anything you can certainly return(NULL), but that NULL will still print out. If you really don't want anything to print, use invisible(): this says "if I assign the output, go ahead and assign it, but if I don't assign it, don't let it print." So, for example, consider this function:

> log.invis <- function (x) { invisible (log(x))}
> log.invis (3)      # Nothing happens!
> a <- log.invis (3) # But it gets assigned
> a
[1] 1.098612
