You will need to first download and install both R and RStudio; both are free to use.
In RStudio, the main windows for typing commands will be on the left side - the console and scripts boxes.
To begin a new script file, click the button at the top left that shows a white page with a green plus sign, then choose R Script from the drop-down menu.
The > symbol in the console indicates that RStudio is ready for commands; R code goes after this symbol, and pressing the enter/return key runs that line.
For example, try running 2 + 2 in the console after the >.
To work with more complex queries, R allows the user to call functions. A function runs a pre-defined set of steps from a single short line of code, and takes input from the user through arguments.
The structure of working with functions in R is:
functionname(argument1 = value, argument2 = othervalue, ...)
For example:
read.csv(file = "dungeness_crab1.csv", header = TRUE)
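As a small illustration of this structure, the sketch below calls two base R functions, round() and seq(), with named arguments. The numbers are made up for the example.

```r
# round() takes the number to round (x) and how many decimal places to keep (digits)
rounded <- round(x = 3.14159, digits = 2)
rounded

# seq() builds a sequence; from, to, and by are its arguments
steps <- seq(from = 0, to = 10, by = 5)
steps

# Arguments can also be matched by position, in the order the function defines them
round(3.14159, 2)
```

Naming the arguments is optional, but it usually makes a script easier to read later.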
You can also include comments in your R Script file by using the # sign:
# this is a comment and will not be run as code
If you are typing code in your script file, you'll need to manually choose any lines you want to run (as compared to pressing enter/return when coding in the console); use the button titled "run" after highlighting which specific lines of code you want R to evaluate.
Variables, much like x and y in algebra, are containers for data we define and name. In R, variables can hold many different types of data.
numeric | A number with a decimal. Examples: 25.32, 30.0, 222.8
integer | A number without a decimal. Examples: 1, 5, 8503 (typed in R with an L suffix, such as 5L; a plain 5 is stored as numeric)
logical | Evaluated with logical operators. Examples: TRUE, FALSE
character | Not evaluated as a number. Examples: "Welcome", "fifty-five", "55"
data.frame | Tabular data arranged in rows and columns
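To check which of these types a value has, base R provides class(). The sketch below uses made-up values, and the small data frame named trees is hypothetical.

```r
# class() reports what type of data a value or variable holds
class(25.32)        # "numeric"
class(5L)           # "integer" (the L suffix marks a whole number as an integer)
class(5)            # "numeric" (without the L, R stores 5 as a decimal number)
class(TRUE)         # "logical"
class("55")         # "character" (quotes make this text, not a number)

# A small made-up data frame, built from two columns of equal length
trees <- data.frame(species = c("fir", "oak"), height_m = c(76.8, 21.3))
class(trees)        # "data.frame"
```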
Tips for naming variables: choose descriptive names (a name like data isn't very useful when you have more than a few lines of code).
The <- operator is how we assign values to variables. R will evaluate whatever is on the right side of the arrow first, and assign that to the object defined on the left side.
For example:
# I am assigning the value 55 to a variable named temperature_C
temperature_C <- 55
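A slightly longer sketch of the same idea shows the right side being evaluated before the assignment happens. The variable temperature_F and the conversion are added here only for illustration.

```r
temperature_C <- 55

# The right side is evaluated first (55 * 9 / 5 + 32), then stored on the left
temperature_F <- temperature_C * 9 / 5 + 32

# Typing a variable's name prints its current value
temperature_F

# Assigning again replaces the old value
temperature_C <- 60
temperature_C
```

Note that temperature_F keeps the value it was given earlier; changing temperature_C afterwards does not update it.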
Exercise: create a variable named canopy_height and assign the value 76.8 to it.
Exercise: create a variable named my_friend and assign a name to it. Check what kind of variable it is by running class(my_friend).
For numeric and integer variables, we can perform mathematical and statistical operations with the variable.
For example:
2.2 * canopy_height
To read in tabular data, we can use the function read.csv(). This expression tells R to read a .csv file defined in the arguments, and accepts more optional arguments such as whether the file has a header (header = TRUE).
Run ?functionname in the console to get help with a function.
The read.csv() function simply tells R to read the .csv file included in the arguments. When we run just that function, R will display the contents of the file. If we want to use the data and include it in our R project, we need to assign the data to a variable.
For example:
forestA_2020 <- read.csv(file = "filename.csv", header = TRUE)
Now we can look at the first few lines of our dataframe variable (and any headers) by calling head(). In this example: head(forestA_2020).
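To try this workflow without a data file on hand, the sketch below first writes a tiny made-up .csv to a temporary location and then reads it back. The file contents, the column names, and the variable forest_example are all hypothetical.

```r
# Made-up data standing in for a real file; the column names are hypothetical
example_file <- tempfile(fileext = ".csv")
writeLines(c("treenumber,height_m",
             "1,76.8",
             "2,54.1",
             "3,61.0"),
           con = example_file)

# Read the file and keep the result by assigning it to a variable
forest_example <- read.csv(file = example_file, header = TRUE)

# Look at the first rows, then check the size and column types
head(forest_example)
nrow(forest_example)
str(forest_example)
```

With your own data, replace example_file with the path to your .csv file in quotes.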
Summary statistics functions
mean(data) | Returns the mean value of data
sd(data) | Returns the standard deviation of data
median(data) | Returns the median value of data
length(data) | Returns the number of elements (length) of data
summary(data) | Returns the minimum, first quartile, median, mean, third quartile, and maximum of data
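These functions can be tried on a small vector. In the sketch below, the values in heights are made up, and the na.rm argument is shown as one common way to handle missing values.

```r
heights <- c(2, 4, 4, 4, 5, 5, 7, 9)

mean(heights)     # 5
median(heights)   # 4.5
length(heights)   # 8
sd(heights)       # sample standard deviation, about 2.14
summary(heights)

# Missing values (NA) make mean() return NA unless they are removed
heights_with_gap <- c(heights, NA)
mean(heights_with_gap)                 # NA
mean(heights_with_gap, na.rm = TRUE)   # 5
```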
We have been working with "base R" functions, which are included with R itself. There are many different packages that we can download as expansion packs for more complex and specific tools, including statistical analysis, publication graphics, and data manipulation.
To use packages, we need two functions: install.packages() and library(). install.packages() only needs to be run once on your machine, when you first download the package. As long as R is still installed on your computer, you don't need to run this again. In contrast, every time you open RStudio again, you'll need to load the package into your current project/script using library().
For example:
install.packages("somepackagename")
library(somepackagename)
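One common pattern, sketched below, is to install a package only when it is missing and then load it. The package "tools" is used as a stand-in because it ships with R, so the example runs without downloading anything; the variable package_name is hypothetical.

```r
# "tools" ships with R, so it stands in here for any package name
package_name <- "tools"

# Install only if the package is not already available on this machine
if (!requireNamespace(package_name, quietly = TRUE)) {
  install.packages(package_name)
}

# Load the package for the current session; character.only = TRUE lets
# library() accept the name stored in a variable
library(package_name, character.only = TRUE)

# Functions from the package can now be called directly
file_ext("dungeness_crab1.csv")
```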
Another useful function in base R is the c() function, which combines a series of values into a vector. A vector is a way of structuring data (an ordered series of values of the same type), not a variable class itself.
Example:
vector1 <- c(2, 3, 1, 6, 4, 2, 3, 7)
There are often multiple elements within vectors. We can extract specific elements using square brackets [ ] and defining the position we are looking for.
For example:
vector1[3] returns the value 1, which is the third element in our vector. In R, indexing for vectors begins at element number 1.
We can get multiple elements by using the c() function:
vector1[c(1, 5, 6)]
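A few more ways of picking elements out of the same vector are sketched below. The range, negative, and logical forms go slightly beyond what this guide has covered, so treat them as optional extras.

```r
vector1 <- c(2, 3, 1, 6, 4, 2, 3, 7)

vector1[c(1, 5, 6)]     # elements 1, 5, and 6: 2 4 2
vector1[2:4]            # a run of positions, 2 through 4: 3 1 6
vector1[-1]             # a negative position drops that element
vector1[vector1 > 3]    # a logical test keeps matching values: 6 4 7
length(vector1)         # 8
```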
We can think of dataframes in R as stacks of vectors, so we can use the same operators to subset our data.
Now, instead of one number in our square brackets, we'll need to use two - one for the row position and one for the column position.
For example:
dataframe[row, column]
To select all rows and a specific column: dataframe[ , 1]
All columns and a specific row: dataframe[1, ]
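The row and column positions can be tried on a small made-up data frame. The name forest_example and its columns are hypothetical.

```r
# Made-up data frame; column names are hypothetical
forest_example <- data.frame(
  treenumber = c(1, 2, 3, 4),
  height_m   = c(76.8, 54.1, 61.0, 48.3)
)

forest_example[2, 2]           # row 2, column 2: 54.1
forest_example[ , 2]           # every row of column 2, returned as a vector
forest_example[1, ]            # every column of row 1, returned as a one-row data frame
forest_example[c(1, 3), ]      # rows 1 and 3
forest_example[ , "height_m"]  # a column can also be picked by name
```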
For dataframes with header names, we can also call on specific columns using the header.
For example:
forest$treenumber
From here, we can use summarizing statistics:
max(forest$treenumber)
mean(forest$treenumber)
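Putting the $ operator and the summary functions together, a minimal sketch on made-up data might look like the following. The data frame forest_example, its columns, and the 60 m cutoff are all hypothetical.

```r
forest_example <- data.frame(
  treenumber = c(1, 2, 3, 4),
  height_m   = c(76.8, 54.1, 61.0, 48.3)
)

# $ pulls out one column as a vector
forest_example$height_m

max(forest_example$height_m)    # 76.8
mean(forest_example$height_m)

# Keep only the rows where a column meets a condition
tall_trees <- forest_example[forest_example$height_m > 60, ]
tall_trees
```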
Most of the exercises and instructions used in this guide have been obtained from these resources:
An Introduction to R, Alex Douglas, Deon Roos, Francesca Mancini, Ana Couto & David Lusseau.
Programming with R, lesson from Software Carpentry.
An Introduction to R, W. N. Venables, D. M. Smith and the R Core Team.