This workshop is designed for people with very little to no previous knowledge of R. It assumes that each student will have a computer and will code at the same time as the instructor does. The design of the workshop is inspired by Software Carpentry.

You will need to first download and install both R and RStudio; both are free to use.

- Open your browser and go to https://cran.r-project.org/.
- In the top box, click the link which best describes your operating system. For example, on a PC, click "Download R for Windows."
- Follow instructions for downloading R for your specific system.
- Then go to https://posit.co/download/rstudio-desktop/ to download RStudio.
- Once both are installed, open RStudio using its icon or by clicking on the program in your computer's menu.

In RStudio, the main panes for typing commands are on the left side: the console and the script editor.

To begin a new script file, click the button at the top left that shows a white page with a green plus sign, then choose R Script from the drop-down menu.

The `>` symbol in the console indicates that RStudio is ready for commands; R code goes after this symbol, and pressing the enter/return key runs that line.

For example, try running `2 + 2` in the console after the `>`.

R can handle much more complex queries than simple calculations!

To work with these more complex queries, R allows the user to call `functions` to run a pre-defined set of steps using a shorter line of code with input from the user through `arguments`.

The structure of working with functions in R is:

`functionname(argument1 = value, argument2 = othervalue, ...)`

For example:

`read.csv(file = "dungeness_crab1.csv", header = TRUE)`
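As a small illustration of the same pattern, the sketch below calls the base R function `round()`, chosen here only because it needs no data file. It shows that arguments can be matched by name or by position.

```r
# Named arguments: round 3.14159 to 2 decimal places
rounded_named <- round(x = 3.14159, digits = 2)

# Positional arguments: the same call, matched by order
rounded_positional <- round(3.14159, 2)

rounded_named                          # 3.14
rounded_named == rounded_positional    # TRUE
```

Naming the arguments takes more typing but makes a script easier to read later.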

You can also include comments in your R Script file by using the `#` sign:

`# this is a comment and will not be run as code`

If you are typing code in your script file, pressing enter/return only starts a new line, so you need to choose which lines to run: highlight the lines you want R to evaluate and click the "Run" button (or press Ctrl+Enter; Cmd+Return on a Mac).

`Variables`, much like x and y in algebra, are named containers for data that we define. In R, variables can hold many different types of data.

| Type | Description |
|---|---|
| `numeric` | A number with a decimal. Examples: 25.32, 30.0, 222.8 |
| `integer` | A whole number. Examples: 1L, 5L, 8503L (the `L` suffix tells R to store the value as an integer; a plain `5` is stored as numeric) |
| `logical` | Evaluated with logical operators. Examples: TRUE, FALSE |
| `character` | Text, not evaluated as a number. Examples: "Welcome", "fifty-five", "55" |
| `data.frame` | Tabular data arranged in rows and columns |
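To see these types in practice, the sketch below creates one variable of each kind and checks it with `class()`. The variable names and values are made up for illustration.

```r
weight_kg  <- 25.32         # numeric
tree_count <- 5L            # integer (note the L suffix)
is_alive   <- TRUE          # logical
site_name  <- "fifty-five"  # character
plots <- data.frame(plot_id = c(1L, 2L), canopy_cover = c(0.4, 0.7))

class(weight_kg)    # "numeric"
class(tree_count)   # "integer"
class(5)            # "numeric": without the L, R stores a number as numeric
class(is_alive)     # "logical"
class(site_name)    # "character"
class(plots)        # "data.frame"
```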

Tips for naming variables:

- Don't start variable names with a number or symbol
- Variable names are case-sensitive
- Don't use spaces in the name (try an underscore or period)
- Be descriptive! (something like `data` isn't very useful when you have more than a few lines of code)

The `<-` operator is how we assign values to variables. R will evaluate whatever is on the right side of the arrow first, and assign that to the object defined on the left side.

For example:

`# I am assigning the value 55 to a variable named temperature_C`

`temperature_C <- 55`
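The "right side first" rule matters most when the right side is a calculation. A minimal sketch follows; `temperature_F` is a name introduced here for illustration.

```r
temperature_C <- 55

# The right side is evaluated first (55 * 9 / 5 + 32), then the result is stored
temperature_F <- temperature_C * 9 / 5 + 32
temperature_F   # 131

# A variable can appear on both sides; the old value is used, then replaced
temperature_C <- temperature_C - 5
temperature_C   # 50
```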

Exercise: create a variable named canopy_height and assign the value 76.8 to it.

Exercise: create a variable named my_friend and assign a name to it. Check what kind of variable it is by running `class(my_friend)`.

For numeric and integer variables, we can perform mathematical and statistical operations with the variable.

For example:

`2.2 * canopy_height`
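A few more operations, sketched with made-up variables. The last lines show why the type matters: a number stored as text has to be converted before R can do math with it.

```r
stem_count       <- 12L    # integer
mean_diameter_cm <- 30.5   # numeric

stem_count * 2             # 24
mean_diameter_cm / 100     # 0.305
sqrt(stem_count + 4)       # 4

# "55" in quotes is character data, so convert it before doing math
plot_label <- "55"
as.numeric(plot_label) + 1   # 56
```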

To read in tabular data, we can use the function `read.csv()`. This function tells R to read the .csv file named in its arguments, and it accepts further optional arguments, such as whether the file has a header (`header = TRUE`).

For any built-in function in R, we can look up which arguments are required vs. optional, more information about the use of the function, and some examples of use by running `?functionname` to get help.
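For instance, the following sketch opens the help page for `read.csv()`. It also uses `formals()`, a base R function not covered in this workshop, as one way to list a function's argument names without leaving the console.

```r
# Open the help page for read.csv (both forms are equivalent)
?read.csv
help("read.csv")

# List the argument names of read.csv without opening the help page
arg_names <- names(formals(read.csv))
arg_names
```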

The `read.csv()` function simply tells R to read the .csv file named in its arguments. When we run just that function, R displays the contents of the file but does not keep them. If we want to use the data later in our R project, we need to assign the data to a variable.

For example:

`forestA_2020 <- read.csv(file = "filename.csv", header = TRUE)`

Now we can look at the first few lines of our dataframe variable (and any headers) by calling `head()`. In this example: `head(forestA_2020)`.
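To try this without a data file on hand, the sketch below first writes a tiny made-up table to a temporary .csv file and then reads it back. The file contents, column names, and the variable name `forest_demo` are all invented for illustration.

```r
# Write a small made-up table to a temporary .csv file
csv_path <- tempfile(fileext = ".csv")
writeLines(c("tree_id,species,height_m",
             "1,fir,20.5",
             "2,cedar,31.0",
             "3,fir,18.2"),
           csv_path)

# Read the file and keep the result in a variable
forest_demo <- read.csv(file = csv_path, header = TRUE)

head(forest_demo)    # the first rows (up to 6) plus the column names
nrow(forest_demo)    # 3
names(forest_demo)   # "tree_id" "species" "height_m"
```

With your own data, replace `csv_path` with the name of your file in quotes.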

**Summary statistics functions**

| Function | Description |
|---|---|
| `mean(data)` | Returns the mean value of `data` |
| `sd(data)` | Returns the standard deviation of `data` |
| `median(data)` | Returns the median value of `data` |
| `length(data)` | Returns the number of elements (length) of `data` |
| `summary(data)` | Returns the minimum, first quartile, median, mean, third quartile, and maximum of `data` |
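A quick sketch of these functions on a small set of made-up values (`heights_m` is a hypothetical name):

```r
heights_m <- c(2, 4, 4, 6)

mean(heights_m)      # 4
median(heights_m)    # 4
length(heights_m)    # 4
sd(heights_m)        # the sample standard deviation
summary(heights_m)   # Min., 1st Qu., Median, Mean, 3rd Qu., Max.
```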

We have been working with "base R" functions, which are included when you install R. There are many additional `packages` that we can download as expansion packs for more complex and specific tools, including statistical analysis, publication graphics, and data manipulation.

To use a package, we need two functions: `install.packages()` and `library()`. `install.packages()` only needs to be run once on your machine, when you first download the package. As long as R is still installed on your computer, you don't need to run this again. In contrast, every time you open RStudio, you'll need to load the package into your current session using `library()`.

For example:

`install.packages("somepackagename")`

`library(somepackagename)`
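One common pattern, sketched below, is to install a package only when it is missing, so a script can be rerun without downloading it again. `stats` ships with R and is used here only as a stand-in for whichever package you need; `requireNamespace()` is a base R function not covered elsewhere in this workshop.

```r
# "stats" ships with R and is only a stand-in package name for this sketch
pkg <- "stats"

# Install only if the package is not already available on this machine
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

# Load the package for the current session
# (character.only = TRUE is needed because the name is stored in a variable)
library(pkg, character.only = TRUE)
```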

Another useful function in base R is `c()`, which combines a series of values into a `vector`. A vector is a way of structuring data (an ordered series of values of the same type), not a data type itself.

Example:

`vector1 <- c(2, 3, 1, 6, 4, 2, 3, 7)`

There are often multiple elements within vectors. We can extract specific elements using square brackets `[ ]` and defining the position we are looking for.

For example:

`vector1[3]` returns the value `1`, which is the third element in our vector. In R, indexing for vectors begins at element number 1.

We can get multiple elements by using the `c()` function:

`vector1[c(1, 5, 6)]`
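Putting the indexing examples together, with the values each call returns. The colon operator on the third line is an addition not covered above: `2:4` is shorthand for `c(2, 3, 4)`.

```r
vector1 <- c(2, 3, 1, 6, 4, 2, 3, 7)

vector1[3]            # 1
vector1[c(1, 5, 6)]   # 2 4 2
vector1[2:4]          # 3 1 6
length(vector1)       # 8
```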

We can think of dataframes in R as stacks of vectors, so we can use the same operators to subset our data.

Now, instead of one number in our square brackets, we'll need to use two - one for the row position and one for the column position.

For example:

`dataframe[row, column]`

To select all rows and a specific column: `dataframe[ , 1]`

All columns and a specific row: `dataframe[1, ]`
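Here is a minimal sketch of row and column indexing on a small made-up dataframe (`forest_demo` and its columns are hypothetical):

```r
forest_demo <- data.frame(
  treenumber = c(101, 102, 103),
  height_m   = c(20.5, 31.0, 18.2)
)

forest_demo[2, 2]    # 31 (row 2, column 2)
forest_demo[ , 1]    # 101 102 103 (all rows, column 1)
forest_demo[1, ]     # the whole first row, returned as a one-row dataframe
```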

For dataframes with header names, we can also call on specific columns using the header.

For example:

`forest$treenumber`

From here, we can use summarizing statistics:

`max(forest$treenumber)`

`mean(forest$treenumber)`
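The same idea on a small made-up dataframe, so the results can be checked by hand. The `height_m` column and its values are invented for illustration.

```r
forest <- data.frame(
  treenumber = c(101, 102, 103),
  height_m   = c(20, 31, 18)
)

forest$height_m          # 20 31 18
max(forest$height_m)     # 31
mean(forest$height_m)    # 23

# Selecting a column by name inside square brackets gives the same vector
identical(forest$height_m, forest[ , "height_m"])   # TRUE
```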

Most of the exercises and instructions used in this guide have been obtained from these resources:

An Introduction to R, Alex Douglas, Deon Roos, Francesca Mancini, Ana Couto & David Lusseau.

Programming with R, lesson from Software Carpentry.

An Introduction to R, W. N. Venables, D. M. Smith and the R Core Team.

You can ask me!

