# The Basics of R

#### Introduction

Now that we have R installed, let’s start scratching the surface by learning about its basic functionality. In this tutorial, you will be learning:

• How to use R as a calculator

• How to create objects

• A couple of different data types in R

• What functions are and how to use them

• How to convert one data type to another

• How to create a vector

• How to load a package

#### Basic Syntax Rules

Before we start writing any code, it’s important to understand some basic aspects of how R’s code works:

• R is case-sensitive. `a` has a different meaning than `A` within R, so you’ll want to use consistent capitalization throughout.

• We execute commands through functions, which are written as a name followed by parentheses. For example, `mean()` is a function that tells R to calculate the mean of a set of numbers.

• You can add a comment to a line of code by placing a pound sign (`#`) before the start of the comment; this is useful for documenting specific lines of code. For example, consider this line of code: `2+3 # This adds two and three`. The command `2+3` will be executed by R, but what follows the pound sign (`# This adds two and three`) will be ignored.
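To make these rules concrete, here is a minimal sketch. The object names `score`, `Score`, and `root` are made up for illustration, and the sketch peeks ahead at the assignment operator (`<-`), which is explained later in this tutorial.

```r
# R is case-sensitive, so these are two separate objects.
score <- 10
Score <- 20
score
Score

# A function call is a name followed by parentheses.
root <- sqrt(16)  # R ignores everything after the pound sign.
root
```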

By default, R will use your “home directory” as the “working directory.” (For me, that might be `C:\Users\Rodrigo` in Windows and `/Users/rodrigo` in macOS.)

To make it easier to organize your files, I strongly recommend creating a new “project” for each assignment. You can create a project by going to `File`, `New Project`. When you open that project, it will set the working directory to the folder you created for that specific assignment or exercise.

You can manually set your working directory in RStudio by using the `Files` tab in the bottom-right pane, going to the desired folder, clicking on the `More` icon at the top of the Files tab, and selecting `Set As Working Directory`.

Working directories are important because they will keep your files in a predictable place. For example, if you have set your working directory (or created a project), you’ll know where to look for a file when we create it using R.
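If you prefer code to menus, base R’s `getwd()` and `setwd()` functions report and change the working directory. The sketch below is only an illustration: it switches to R’s temporary folder (`tempdir()`), which stands in for your own assignment folder, and then switches back. The object name `original_dir` is made up for this example.

```r
# Remember where we started so we can return afterward.
original_dir <- getwd()
original_dir

# Switch to another folder; tempdir() stands in for your assignment folder.
setwd(tempdir())
getwd()

# Return to the original working directory.
setwd(original_dir)
```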

#### Using R to do Math

R is an extremely powerful calculator. To perform any mathematical operation, just type the expression into your code chunk (or straight into the console).

In R, you’ll use `+` and `-` for addition and subtraction. For multiplication and division, you’ll use `*` and `/`, respectively. You’ll use `^` for exponents and parentheses (`(` and `)`) to organize the order of operations. (Remember, Please Excuse My Dear Aunt Sally?)

For example, this is how we would compute two plus three:

``````
2 + 3
``````
``````
## [1] 5
``````

If we wanted to calculate the result of four plus two, times eight, we would write:

``````
(4 + 2) * 8
``````
``````
## [1] 48
``````
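
Here is a short sketch of the remaining operators and of why parentheses matter. The numbers are arbitrary, and the result of each line is noted in its comment.

```r
# Exponents: two raised to the third power.
2^3            # 8

# Division and subtraction.
10 / 4         # 2.5
10 - 4         # 6

# Without parentheses, multiplication happens before addition.
4 + 2 * 8      # 20
(4 + 2) * 8    # 48
```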

#### Creating Objects in R

One nice thing about working with R is that it allows us to store information in objects that we can refer back to later on.

We can do this with the following syntax: `object <- operation`.

What this tells R is: perform an operation and assign the output into a named object. (The left arrow, `<-`, is called the assignment operator.)

Let’s show this off by assigning the result of `4 + 2` to an object called `a`. Then, we can just call `a` whenever we want the result of that operation.

``````
a <- 4 + 2
a
``````
``````
## [1] 6
``````
##### Using objects in operations

If we want to multiply the result of that operation by `8`, we can just do:

``````
a <- 4 + 2
a * 8
``````
``````
## [1] 48
``````
##### Doing Math With Variables

If we wanted to subtract my favorite number (`rodrigos_fave`) from yours (`your_fave`), here’s how we’d do that:

``````
rodrigos_fave <- 36
your_fave <- 7 # Replace 7 with your favorite number
your_fave - rodrigos_fave
``````
``````
## [1] -29
``````

#### Data Types in R

R has a handful of different data types. We’ll cover these types as they come up but we’ll start with two very important ones. The first type is `numeric` (`num`) and it refers to real or decimal numbers. The second is `character` (`chr`) and it covers text (strings). It’s important to understand that if an object is stored as a string, you cannot perform a mathematical operation directly on it.

Here is a variation on the example above. Notice how the number stored in the object `your_fave` is wrapped in quotation marks (`"`). The quotation marks make it a text value (i.e., character data type).

``````
rodrigos_fave <- 36
your_fave <- "7"
your_fave - rodrigos_fave
``````
``````
## Error in your_fave - rodrigos_fave: non-numeric argument to binary operator
``````

Unsurprisingly, R will give us an error because we are trying to perform a mathematical operation (subtraction) using a non-numeric object.

However, we can often translate between data types (e.g., from numeric to character and vice versa). We just need to make sure we do so explicitly before running an operation. Before we convert between types, let’s take a step back and learn about “functions.”

#### Functions

One of the great things about R is that it gives us a multitude of functions we can use to perform myriad operations. You can think of a function as a wrapper (a single command) that executes a series of commands that allow you to do things to your data.

Some of these functions come with R, others can be installed via optional packages (more on that shortly). You can also create your own functions at any point to help you complete repetitive tasks. (We won’t create functions in this tutorial, though.)

For now, there are two parts of a function that you need to remember. The first is the function name (what we use to call the function) and the second is the set of arguments (what options the function should use, what objects it should be applied on/with, etc.).

Functions are expressed by the function name followed by a parenthetical that includes the arguments you want to supply it with (i.e., `function_name(arguments)`).
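
As a quick illustration of a name followed by arguments, the sketch below calls base R’s `round()` function. The value `3.14159` and the object name `rounded` are just examples; `digits` is one of the arguments `round()` accepts.

```r
# First argument: the number to round.
# Second argument (named): how many decimal places to keep.
rounded <- round(3.14159, digits = 2)
rounded
```

Naming an argument (`digits = 2`) is optional when you supply arguments in the order the function expects, but it tends to make the code easier to read.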

##### Checking the data type of an object

For example, we can check the data type of an object by using the `str()` function. The `str()` function requires us to specify one argument: the name of the object we want to check.

Let’s do that for the two objects we defined:

``````
rodrigos_fave <- 36
str(rodrigos_fave)
``````
``````
##  num 36
``````
``````
your_fave <- "7"
str(your_fave)
``````
``````
##  chr "7"
``````

Our code chunks gave us two lines of output, one for each `str()` call we ran. The first line tells us `rodrigos_fave` is `num` (numeric). The second tells us `your_fave` is `chr` (character).

##### Using the help system

RStudio has a `Help` tab, usually in the bottom-right pane. After you click on it, you can get help for any loaded function by clicking the search box marked with a magnifying glass. Just type the name of the function and the help page will describe it, list all the arguments it accepts, and provide some examples.

R’s help system is extremely useful, though it can be a bit daunting at first because of all the information it provides you. You’ll also find yourself Googling a lot and ending up on websites like Stack Overflow and Quora, in addition to different developer blogs.
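
You can also reach the help system from the console rather than the `Help` tab. The sketch below lists the two standard ways to open a help page as comments (so that nothing pops up unexpectedly) and then uses base R’s `args()` function for a quick look at a function’s arguments. The choice of `mean()` and `round()` is just for illustration.

```r
# In the console, either of these opens the help page for mean():
#   ?mean
#   help("mean")

# args() prints the arguments a function accepts.
args(round)
```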

#### Changing Between Types

One nifty way to move between numeric and character types is to use the `as.numeric()` and `as.character()` functions.

Here’s how we’d turn the value of `your_fave` from `chr` to `num`:

``````
your_fave <- "7"
as.numeric(your_fave)
``````
``````
## [1] 7
``````

Notice how the quotation marks are now gone, suggesting it is now being treated as a number.

##### Using multiple functions in one operation

We can confirm the conversion succeeded by wrapping the `as.numeric()` function within the `str()` function to effectively perform two operations in one sequence:

``````
your_fave <- "7"
str(as.numeric(your_fave))
``````
``````
##  num 7
``````

It is important to note here that our original `your_fave` object remains of `chr` type. That is because we never re-assigned the result of the operation back onto the original object.

##### Re-assigning objects into themselves

If we want to permanently change it back into a `num` object, we’ll need to recreate the object. In this block of code, we’ll assign the result of the `as.numeric()` operation back into our original object of `your_fave`. We’ll then double-check it by using the `str()` function.

``````
your_fave <- "7"
your_fave <- as.numeric(your_fave)
str(your_fave)
``````
``````
##  num 7
``````
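
Two related points are worth a quick sketch: the conversion also works in the other direction with `as.character()`, and text that doesn’t look like a number can’t be converted. The value `"seven"` and the object names `fave_as_text` and `not_a_number` are made up for illustration.

```r
# Numeric to character: the result is wrapped in quotation marks.
rodrigos_fave <- 36
fave_as_text <- as.character(rodrigos_fave)
str(fave_as_text)

# Text that isn't a number becomes NA (a missing value), and R prints a warning.
not_a_number <- as.numeric("seven")
not_a_number
```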

#### Vectors

R also has different data structures for its objects. We’ll cover different structures throughout the course as the need arises. One very important data structure to know about now is the vector.

Vectors allow us to store multiple values of the same data type into a single object. (We can’t mix numbers and text within a single vector. If there’s a single `chr` element in the vector, R will automatically make all the elements `chr`.)

We can create a vector using the `c()` (concatenate) function. With `c()`, each argument will be a different element that we’re adding to that object. (Each argument is separated by a comma.)

For example, here is how we would create a vector made up of the following five numbers: `1`, `5`, `7`, `22`, and `5`.

``````
c(1, 5, 7, 22, 5)
``````
``````
## [1]  1  5  7 22  5
``````
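
To see the same-type rule in action, here is a small sketch (the object name `mixed` and its values are made up). Because one element is text, R converts every element to `chr`:

```r
# One character element forces the whole vector to character.
mixed <- c(1, "five", 7)
str(mixed)
```

In the output, the numbers come back wrapped in quotation marks, which is the sign that they are now being treated as text.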
##### Performing operations on vectors

Vectors are useful for a number of different operations. To illustrate, I’ll start by storing that vector into an object (`rodrigos_vector`):

``````
rodrigos_vector <- c(1, 5, 7, 22, 5)
str(rodrigos_vector)
``````
``````
##  num [1:5] 1 5 7 22 5
``````

First, note that this is a numeric vector, as shown by the `num`. Second, we can see that there are five elements in our vector (`[1:5]`). This is helpful because we can pick out specific elements within a vector by subscripting. (More on that later.) It also explains the `[1]` in some of the earlier output: we were actually getting a single-element vector as the result of the mathematical operations we ran earlier on, and `[1]` marks the position of the first element printed on that line.
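
As a brief preview of subscripting (covered properly later), square brackets pick out elements by position. This is only a sketch, and it recreates the vector so that it can run on its own:

```r
rodrigos_vector <- c(1, 5, 7, 22, 5)

# The fourth element of the vector.
rodrigos_vector[4]

# The first and last elements, using a vector of positions.
rodrigos_vector[c(1, 5)]
```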

Now, let’s divide `rodrigos_vector` by 2:

``````
rodrigos_vector / 2
``````
``````
## [1]  0.5  2.5  3.5 11.0  2.5
``````

As we can see in our output, each element in `rodrigos_vector` was divided by two.

##### Applying functions to vectors

Even more useful than that is the fact that I can pair a vector with functions like `mean()` and `max()` to take the mean from that sequence of numbers and identify the highest number within it.

For example, here’s how we can take the mean of `rodrigos_vector`:

``````
rodrigos_vector <- c(1, 5, 7, 22, 5)
mean(rodrigos_vector)
``````
``````
## [1] 8
``````

Our output gives us a single number (specifically, a vector comprised of a single element) that represents the mean of the original `rodrigos_vector`.
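
The same pattern works for `max()`, mentioned above, and for other base R summary functions such as `min()`, `sum()`, and `length()`. A short sketch, with each result noted in its comment:

```r
rodrigos_vector <- c(1, 5, 7, 22, 5)

max(rodrigos_vector)     # highest value: 22
min(rodrigos_vector)     # lowest value: 1
sum(rodrigos_vector)     # total of all elements: 40
length(rodrigos_vector)  # number of elements: 5
```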

#### Packages

##### Installing packages

One of the great things about R is that it is modular and has a huge community supporting it. What this means is that anyone can add functionality to R and share that functionality with the rest of the world. We call those modules packages, which will give us access to new functions.

We are going to make extensive use of a small set of packages throughout this book. To use a package, we first need to install it. To install a package, click on `Tools` and `Install Packages`. (Unless I tell you otherwise, keep CRAN selected for the repository, don’t change the install directory, and make sure “install dependencies” is checked.)

One key package that we will be using frequently is called `tidyverse`, which is actually a metapackage that contains a number of very helpful packages, such as `readr` (which helps R intelligently read CSV files), `dplyr` (which helps us organize and slice up data), and `ggplot2` (which helps us create some data visualizations).
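
If you would rather install from the console than from the menu, base R’s `install.packages()` function does the same job. In this sketch the install line is commented out so that nothing is downloaded by accident, and the object name `tidyverse_installed` is just for illustration.

```r
# Run this once (without the leading "#") to download and install from CRAN:
# install.packages("tidyverse")

# Check whether the package is already installed (TRUE or FALSE).
tidyverse_installed <- requireNamespace("tidyverse", quietly = TRUE)
tidyverse_installed
```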

Installing a package is just the first step, though. You’ll always need to load a package before using one of its functions, and that includes every time you restart RStudio. (Saving your workspace restores your objects, but you should not count on it to reload your packages.)

You can load a package by using the `library()` function, supplying the name of the package you want to load as an argument (i.e., putting it within the parentheses).

For example, we can now load the `tidyverse` package (and, consequently, `readr`) by doing the following:

``````
library(tidyverse)
``````
``````
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──

## ✓ ggplot2 3.3.5     ✓ purrr   0.3.4
## ✓ tibble  3.1.3     ✓ dplyr   1.0.7
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   2.0.1     ✓ forcats 0.5.1

## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
``````

Oftentimes, nothing will appear to have happened when you execute the code to load a package, because many packages load silently. Sometimes, R will print a message, as is the case here. This message is fine: it’s just listing the different `tidyverse` packages being loaded (e.g., `ggplot2`, `tibble`, etc.) and telling us that there are a couple of small naming conflicts. We don’t need to worry about these for now.

##### Package::function() notation

Going forward, I may refer to a function as follows: `readr::read_csv()`.

What this means is that we’ll use the `read_csv()` function that is part of the `readr` package.

Put another way, unless `readr` is loaded (by itself or through `tidyverse`), you won’t be able to call `read_csv()` by its bare name.

This notation is also important because, sometimes, two different packages will use the same name for a function that does different things. (After all, anyone can create a package and do so without knowing the function names that others have used.) So, if I were to use the `read_csv()` function and I only loaded the `readr` package, R would reason that I want to use the function from `readr`. However, if I had also loaded a package called `another_csv_reader` that also had a function named `read_csv()`, R would simply call the function from the most recently loaded package.

If I want to explicitly tell R which package to take the function from, I would use the `package_name::function()` notation in my code.
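
Here is a minimal sketch of the notation. To keep it runnable without installing anything, it uses `median()` from the `stats` package, which ships with R, rather than `readr::read_csv()`.

```r
rodrigos_vector <- c(1, 5, 7, 22, 5)

# Explicitly request median() from the stats package.
stats::median(rodrigos_vector)

# In a fresh R session, the bare name finds the same function.
median(rodrigos_vector)
```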