3  Operators & Data Types

Learning Objectives

  • Be able to use R as a calculator.
  • Be able to compare values in R.
  • Know how R handles different data types (numbers, strings, and logicals).

Suggested Readings

3.1 R as a calculator

You can do a ton of things with R, but at its core it’s basically a fancy calculator. Let’s get started with some basic arithmetic!

3.1.1 Doing basic math

R handles simple arithmetic using the following arithmetic operators:

operation        operator   example input   example output
addition         +          10 + 2          12
subtraction      -          9 - 3           6
multiplication   *          5 * 5           25
division         /          9 / 3           3
power            ^          5 ^ 2           25

The first four basic operators (+, -, *, /) are pretty straightforward and behave as expected:

7 + 5 # Addition
#> [1] 12
7 - 5 # Subtraction
#> [1] 2
7 * 5 # Multiplication
#> [1] 35
7 / 5 # Division
#> [1] 1.4

Not a lot of surprises. (You can ignore the [1] you see in the returned values; it is just the index of the first value printed on that line, and here there is only one value to return.)

Powers (i.e. \(x^n\)) are represented using the ^ symbol. For example, to calculate \(5^4\) in R, we would type:

5^4
#> [1] 625
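
The ^ operator is not limited to small whole-number exponents. As a quick sketch (the numbers are chosen only for illustration), a fractional exponent takes a root and a negative exponent takes a reciprocal:

```r
2^10    # 2 multiplied by itself 10 times
#> [1] 1024
25^0.5  # An exponent of 0.5 is a square root
#> [1] 5
2^-1    # A negative exponent gives the reciprocal, 1 / 2
#> [1] 0.5
```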

3.1.2 Slightly more tricky math

There are two other operators that are not typically as well-known as the first five but are quite common in programming:

operation          operator   example input   example output
integer division   %/%        4 %/% 3         1
modulus            %%         8 %% 3          2

3.1.2.1 Integer division

Integer division is division in which the remainder is discarded. Note the difference between regular (/) and integer (%/%) division:

4 / 3 # Regular division
#> [1] 1.333333
4 %/% 3 # Integer division
#> [1] 1

With integer division, 3 can only go into 4 once, so 4 %/% 3 returns 1.

With integer division, dividing a positive number by a larger positive number will always produce 0 (because the larger number cannot go into the smaller number even once):

4 %/% 5 # Will return 0
#> [1] 0

3.1.2.2 The Modulus operator

The modulus (aka “mod” operator) returns the remainder after doing integer division. For example:

17 %% 3
#> [1] 2

This returns 2 because 17 / 3 is equal to 5 with a remainder of 2. The modulus returns any remainder, including decimals:

3.1415 %% 3
#> [1] 0.1415

If you mod a number by itself, you’ll get 0 (because there’s no remainder):

17 %% 17 # Will return 0
#> [1] 0

Finally, if you mod a number by a larger number, you’ll get the smaller number back since it’s the remainder:

17 %% 20 # Will return 17
#> [1] 17
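
The two operators fit together: multiply the integer quotient by the divisor, add the remainder, and you get back the number you started with. Here is a quick check using 17 and 3 from the earlier example. It assumes you are comfortable storing values in variables with <-, and the variable names are only for illustration:

```r
n <- 17
d <- 3

quotient  <- n %/% d  # How many whole times 3 goes into 17
remainder <- n %% d   # What is left over

quotient
#> [1] 5
remainder
#> [1] 2
quotient * d + remainder == n
#> [1] TRUE
```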

3.1.3 Tricks with %% and %/%

The %% and %/% operators can be really handy. Here are a few tricks.

3.1.3.1 Odds and evens with n %% 2

You can tell if an integer n is even or odd by using n %% 2. If the result is 0, n must be even (because 2 goes evenly into even numbers with no remainder). If n is odd, you’ll get a remainder of 1. Here’s an example:

10 %% 2 # Even
#> [1] 0
11 %% 2 # Odd
#> [1] 1

This trick also works with negative numbers!

-42 %% 2 # Even
#> [1] 0
-43 %% 2 # Odd
#> [1] 1
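
One way to turn this trick into a yes/no answer is to compare the remainder to 0 using ==. This is only a sketch: the == operator is covered later in this chapter, and n is a placeholder name.

```r
n <- 42

n %% 2 == 0  # Is n even?
#> [1] TRUE
n %% 2 == 1  # Is n odd?
#> [1] FALSE
```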

3.1.4 Number “chopping” with 10s

When you use the mod operator %% on a positive number with a power of 10 (1, 10, 100, and so on), it “chops” the number and returns everything to the right of the “chop” point:

123456 %% 1 # Chops to the right of the *ones* digit
#> [1] 0
123456 %% 10 # Chops to the right of the *tens* digit
#> [1] 6
123456 %% 100 # Chops to the right of the *hundreds* digit
#> [1] 56

Integer division %/% works the same way, except it returns everything to the left of the “chop” point:

123456 %/% 1 # Chops to the right of the *ones* digit
#> [1] 123456
123456 %/% 10 # Chops to the right of the *tens* digit
#> [1] 12345
123456 %/% 100 # Chops to the right of the *hundreds* digit
#> [1] 1234

This trick works with non-integers too!

3.1415 %% 1
#> [1] 0.1415
3.1415 %/% 1
#> [1] 3

But be careful - this “trick” only works with positive numbers:

-123.456 %% 10
#> [1] 6.544
-123.456 %/% 10
#> [1] -13
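
The reason is that %/% always rounds down (toward negative infinity) rather than toward zero, and %% returns whatever is left over relative to that rounded-down quotient. Here is a short sketch of the arithmetic behind the result above, using the base R function floor():

```r
floor(-123.456 / 10)   # -12.3456 rounds *down* to -13, not toward zero
#> [1] -13
-13 * 10               # The multiple of 10 that %/% settled on
#> [1] -130
-123.456 - (-130)      # The remainder is measured up from -130
#> [1] 6.544
```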

Here are some mental notes to help remember how this works:

  • %% returns everything to the right (<chop> ->)
  • %/% returns everything to the left (<- <chop>)
  • The “chop” point is always just to the right of the chopping digit:
Example          “Chop” point   “Chop” point description
1234 %% 1        1234 |         Right of the 1’s digit
1234 %% 10       123 | 4        Right of the 10’s digit
1234 %% 100      12 | 34        Right of the 100’s digit
1234 %% 1000     1 | 234        Right of the 1,000’s digit
1234 %% 10000    | 1234         Right of the 10,000’s digit
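
Combining the two operators lets you pull out a single digit or split a quantity into parts. The following is a small sketch with made-up values; the second example shows that the same idea also works with divisors that are not powers of 10:

```r
# Pull out the hundreds digit of 1234:
# keep everything left of the hundreds chop point, then keep the last digit
(1234 %/% 100) %% 10
#> [1] 2

# Split 135 minutes into whole hours and leftover minutes
135 %/% 60  # Hours
#> [1] 2
135 %% 60   # Minutes
#> [1] 15
```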

3.2 Comparing things in R

Other than simple arithmetic, another common programming task is to compare different values to see if one is greater than, less than, or equal to the other. R handles comparisons with relational and logical operators.

3.3 Comparing two things

To compare two things, use the following relational operators:

  • Less than: <
  • Less than or equal to: <=
  • Greater than or equal to: >=
  • Greater than: >
  • Equal: ==
  • Not equal: !=

The less than operator < can be used to test whether one number is smaller than another number:

2 < 5
#> [1] TRUE

If the two values are equal, the < operator will return FALSE, while the <= operator will return TRUE:

2 < 2
#> [1] FALSE
2 <= 2
#> [1] TRUE

The “greater than” (>) and “greater than or equal to” (>=) operators work the same way but in reverse:

2 > 5
#> [1] FALSE
2 > 2
#> [1] FALSE
2 >= 2
#> [1] TRUE

To assess whether two values are equal, we have to use a double equal sign (==):

(2 + 2) == 4
#> [1] TRUE
(2 + 2) == 5
#> [1] FALSE

To assess whether two values are not equal, we have to use an exclamation point followed by an equal sign (!=):

(2 + 2) != 4
#> [1] FALSE
(2 + 2) != 5
#> [1] TRUE

It’s worth noting that you can also apply equality operations to “strings,” which is the general word to describe character values (i.e. not numbers). For example, R understands that a "penguin" is a "penguin" so you get this:

"penguin" == "penguin"
#> [1] TRUE

However, R is very particular about what counts as equality. For two pieces of text to be equal, they must be precisely the same:

"penguin" == "PENGUIN"        # FALSE because the case is different
#> [1] FALSE
"penguin" == "p e n g u i n"  # FALSE because the spacing is different
#> [1] FALSE
"penguin" == "penguin "       # FALSE because there's an extra space on the second string
#> [1] FALSE
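
If you want a more forgiving comparison, one option is to clean up the text before comparing it. The sketch below uses two base R functions: tolower() converts text to lower case, and trimws() removes leading and trailing spaces. It is one possible approach, not the only one:

```r
tolower("PENGUIN") == "penguin"   # Ignore differences in case
#> [1] TRUE
trimws("penguin ") == "penguin"   # Ignore extra space at the ends
#> [1] TRUE
```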

3.4 Making multiple comparisons

To make a more complex comparison of more than just two things, use the following logical operators:

  • And: &
  • Or: |
  • Not: !

And:

A logical expression x & y is TRUE only if both x and y are TRUE.

(2 == 2) & (2 == 3) # FALSE because the second comparison is not TRUE
#> [1] FALSE
(2 == 2) & (3 == 3) # TRUE because both comparisons are TRUE
#> [1] TRUE

Or:

A logical expression x | y is TRUE if either x or y is TRUE (or both).

(2 == 2) | (2 == 3) # TRUE because the first comparison is TRUE
#> [1] TRUE

Not:

The ! operator behaves like the word “not” in everyday language. If a statement is “not true”, then it must be “false”. Perhaps the simplest example is:

!TRUE
#> [1] FALSE

It is good practice to include parentheses to clarify the statement or comparison being made. Consider the following example:

!3 == 5
#> [1] TRUE

This returns TRUE, but it’s a bit confusing. Reading from left to right, you start by saying “not 3”…what does that mean?

What is really going on here is R first evaluates whether 3 is equal to 5 (3 == 5), and then returns the “not” (!) of that. A better version of the same thing would be:

!(3 == 5)
#> [1] TRUE
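
Putting the three logical operators together, a common use is checking whether a number falls inside or outside a range. In this sketch, x and the range limits 1 and 10 are arbitrary choices for illustration:

```r
x <- 7

(x >= 1) & (x <= 10)     # Is x between 1 and 10?
#> [1] TRUE
(x < 1) | (x > 10)       # Is x outside that range?
#> [1] FALSE
!((x >= 1) & (x <= 10))  # "Not between 1 and 10" gives the same answer
#> [1] FALSE
```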

3.4.1 Order of operations

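R follows the typical BEDMAS order of operations. That is, R evaluates statements in this order¹:

  1. Brackets
  2. Exponents
  3. Division
  4. Multiplication
  5. Addition
  6. Subtraction

Division and multiplication share the same precedence and are evaluated from left to right, as are addition and subtraction.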

For example, if I type:

1 + 2 * 4
#> [1] 9

R first computes 2 * 4 and then adds 1. If what you actually wanted was for R to first add 1 and 2, then you should put parentheses around 1 + 2:

(1 + 2) * 4
#> [1] 12

A helpful rule of thumb to remember is that brackets always come first. So, if you’re ever unsure about what order R will do things in, an easy solution is to enclose the thing you want it to do first in brackets.
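
Two cases are worth trying for yourself. Division and multiplication have equal precedence, so R works through them from left to right, and an exponent is applied before a leading minus sign. A short sketch:

```r
8 / 2 * 4    # Left to right: (8 / 2) * 4, not 8 / (2 * 4)
#> [1] 16
-2^2         # The exponent is applied first: -(2^2)
#> [1] -4
(-2)^2       # Brackets force the minus sign to be applied first
#> [1] 4
```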

Note that for logical operators, the order of precedence is ! > & > |.

For example, consider the following statement:

TRUE | FALSE & FALSE
#> [1] TRUE

This returns TRUE because the & statement (FALSE & FALSE) is evaluated first, so the whole statement simplifies to TRUE | FALSE, which returns TRUE. If you put parentheses around the | statement, it would evaluate first and the whole statement would return FALSE:

(TRUE | FALSE) & FALSE
#> [1] FALSE

Similarly, consider the following statement:

! TRUE | TRUE
#> [1] TRUE

This returns TRUE because the ! statement is evaluated first (! TRUE is FALSE), and the simplified statement FALSE | TRUE returns TRUE. Again, if you put parentheses around the | statement the whole statement becomes FALSE:

! (TRUE | TRUE)
#> [1] FALSE
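
When relational and logical operators appear in the same statement, R evaluates the relational comparisons before !, &, and |. In the sketch below, the first two statements are equivalent, with the second making the grouping explicit, while the third groups things differently and gives a different answer. The variable names and values are made up for illustration:

```r
age <- 3
has_ticket <- FALSE

age >= 18 & has_ticket | age < 5           # & is evaluated before |
#> [1] TRUE
((age >= 18) & has_ticket) | (age < 5)     # Same grouping, made explicit
#> [1] TRUE
(age >= 18) & (has_ticket | (age < 5))     # Different grouping, different answer
#> [1] FALSE
```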

3.5 Data types

Every programming language has the ability to store data of different types. R recognizes several important basic data types (there are others, but these cover most cases):

Type        Description                                 Example
double      Number with a decimal place (aka “float”)   3.14, 1.61803398875
integer     Number without a decimal place              1L, 42L
character   Text in quotes (aka “string”)               "this is some text", "3.14"
logical     True or False (for comparing things)        TRUE, FALSE

If you want to check which type a value is, you can use the function typeof(). For example:

typeof("hello")
#> [1] "character"

3.5.1 Numeric types

Numbers in R have the numeric data type, which is also the default computational type. There are two types of numbers:

  • Integers
  • Non-integers (aka “double” or “float”)

The difference is that integers don’t have decimal values. A non-integer in R has the type “double”:

typeof(3.14)
#> [1] "double"

By default, R assumes all numbers have a decimal place, even if it looks like an integer:

typeof(3)
#> [1] "double"

In this case, R assumes that 3 is really 3.0. To make sure R knows you really do mean to create an integer, you have to add an L to the end of the number²:

typeof(3L)
#> [1] "integer"
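
To see how the two numeric types interact, here is a small sketch using typeof() along with the base R function as.integer(), which converts a value to an integer:

```r
typeof(3L + 2L)    # Integer plus integer stays an integer
#> [1] "integer"
typeof(3L + 2)     # Mixing in a double gives a double
#> [1] "double"
typeof(3L / 2L)    # Regular division always gives a double
#> [1] "double"
as.integer(3.9)    # Converting drops the decimal part (no rounding)
#> [1] 3
```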

3.5.2 Character types

A character value is used to represent string values in R. Anything put between single quotes ('') or double quotes ("") will be stored as a character. For example:

typeof('3')
#> [1] "character"

Notice that even though the value looks like a number, because it is inside quotes R interprets it as a character. If you mistakenly thought it was a number, R will gladly return an error when you try to do a numerical operation with it:

'3' + 7
#> Error in "3" + 7: non-numeric argument to binary operator
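
If the text really does contain a number, you can convert it before doing math. Here is a minimal sketch using the base R function as.numeric():

```r
as.numeric('3') + 7
#> [1] 10
typeof(as.numeric('3'))
#> [1] "double"
```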

It doesn’t matter if you use single or double quotes to create a character. The only time it does matter is if the character contains a quote symbol itself. For example, if you wanted to type the word "don't", you should use double quotes so that R knows the single quote is part of the character:

typeof("don't")
#> [1] "character"

If you use single quotes, you’ll get an error because R reads 'don' as a complete character value and does not know what to do with the t' that follows:

typeof('don't')
#> Error: <text>:1:13: unexpected symbol
#> 1: typeof('don't
#>                 ^
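
Another option, if you prefer single quotes, is to “escape” the inner quote with a backslash so that R treats it as part of the text. A short sketch:

```r
typeof('don\'t')
#> [1] "character"
'don\'t' == "don't"  # Both ways of typing it produce the same text
#> [1] TRUE
```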

We will go into much more detail about working with character values later on in Week 7.

3.5.3 Logical types

Logical data only have two values: TRUE or FALSE. Note that these are not in quotes and are in all caps.

typeof(TRUE)
#> [1] "logical"
typeof(FALSE)
#> [1] "logical"

R uses these two special values to help answer questions about logical statements. For example, let’s compare whether 1 is greater than 2:

1 > 2
#> [1] FALSE

R returns the value FALSE because 1 is not greater than 2. If I flip the question to whether 1 is less than 2, I’ll get TRUE:

1 < 2
#> [1] TRUE
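
The result of a comparison is itself a logical value, so you can store it and check its type like any other value. In this sketch, is_smaller is just an illustrative name:

```r
is_smaller <- 1 < 2

is_smaller
#> [1] TRUE
typeof(is_smaller)
#> [1] "logical"
```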

3.5.4 Special values

In addition to the four main data types mentioned, there are a few additional “special” values: Inf, NaN, NA, and NULL.

Infinity: Inf corresponds to a value that is infinitely large (or, with -Inf, infinitely large in the negative direction). The easiest way to get Inf is to divide a positive number by 0:

1/0
#> [1] Inf

Not a Number: NaN is short for “not a number”, and it’s basically a reserved keyword that means “there isn’t a mathematically defined number for this.” For example:

0/0
#> [1] NaN

Not available: NA indicates that the value that is “supposed” to be stored here is missing. We’ll see these much more when we start getting into data structures like vectors and data frames.

No value: NULL asserts that the variable genuinely has no value whatsoever, or does not even exist.
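
Because these values are special, ordinary comparisons with == do not always behave as you might expect. Base R provides dedicated functions for checking each one, sketched below:

```r
-1 / 0              # Dividing a negative number by 0 gives -Inf
#> [1] -Inf
is.infinite(1 / 0)
#> [1] TRUE
is.nan(0 / 0)
#> [1] TRUE
NA == NA            # Comparing a missing value returns another missing value
#> [1] NA
is.na(NA)           # Use is.na() to test for missing values instead
#> [1] TRUE
is.null(NULL)
#> [1] TRUE
```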

Page sources

Some content on this page has been modified from other courses, including:


  1. For a more precise statement, see the operator precedence for R.

  2. Why L? Well, it’s a bit complicated, but R supports complex numbers which are denoted by i, so i was already taken. A quick answer is that R uses 32-bit long integers, so L for “long”.