library(rray)
The idea for rray sprung from frustration with the following example:
x <- matrix(1:6, nrow = 3)
x
#> [,1] [,2]
#> [1,] 1 4
#> [2,] 2 5
#> [3,] 3 6
x + 1
#> [,1] [,2]
#> [1,] 2 5
#> [2,] 3 6
#> [3,] 4 7
x + matrix(1)
#> Error in x + matrix(1): non-conformable arrays
To someone who has worked with matrices in R before, this error message is probably nothing new. It’s stating that because x
has dimensions (3, 2)
and is being added to an object with dimensions (1, 1)
, which is a matrix does not have exactly the same dimensions, the operation cannot be completed. My frustration lies with the fact that I had this “feeling” that I knew the answer to this operation. It shouldn’t be an error, it should have the same answer as x + 1
. Why does this work, when the other doesn’t?
In one sentence, the answer is that R recycles, but I want it to broadcast.
This vignette’s goal is to introduce the concept of broadcasting. Broadcasting is the idea of repeating dimensions of one object to match the dimensions of another, so that an operation can be applied between the two inputs. Later you will learn two rules which formalize this idea.
For this first example, I expected this result:
x
#> [,1] [,2]
#> [1,] 1 4
#> [2,] 2 5
#> [3,] 3 6
x + 1
#> [,1] [,2]
#> [1,] 2 5
#> [2,] 3 6
#> [3,] 4 7
y <- rray(1, c(1, 1))
y
#> <rray<dbl>[,1][1]>
#> [,1]
#> [1,] 1
x + y
#> <rray<dbl>[,2][3]>
#> [,1] [,2]
#> [1,] 2 5
#> [2,] 3 6
#> [3,] 4 7
You’ll come to learn that rray implements broadcasting throughout the package, which is what allows this to work. It solves this motivating example, but is generally useful in variety of ways.
So how does this happen? The mental model to understand broadcasting in this case is to alter the dimensions of y
to match the dimensions of x
by duplicating the rows. This is known as broadcasting y. After this alteration is done, then the addition is performed.
y_broadcast <- rray_broadcast(y, c(3, 2))
y_broadcast
#> <rray<dbl>[,2][3]>
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 1 1
#> [3,] 1 1
x + y_broadcast
#> <rray<dbl>[,2][3]>
#> [,1] [,2]
#> [1,] 2 5
#> [2,] 3 6
#> [3,] 4 7
But how did we know (programmatically) how to alter y
to match the dimensions of x
here? That is where the broadcasting rules come in.
Broadcasting works by finding a set of common dimensions from the inputs, and then manipulating each of the inputs to that common shape. There are two rules for finding the common dimensions:
In the next section you will see an example of these rules in action, but an explanation of “recycling” might also be useful. In this case, I am talking about the tidyverse recycling rules. When comparing two objects together, the recycling rules are:
There are slightly different from base R, where partial recycling is also occasionally used, which means that, for example, a length 2 vector could be recycled up to length 4 without issue.
x + y
Let’s revisit the original example, but use the broadcasting rules to understand how to get the result.
When adding these two matrices together, it’s useful to write out the dimensions explicitly, using the general notation of:
(rows, cols) | object
(3, 2) | x
(1, 1) | y
-------|-------
(3, 2) | result
To understand how we get from x + y -> result
, compare dimensions vertically (one at a time), and apply the broadcasting rules.
1st dimension:
x
has 3
rowsy
has 1
rows.2nd dimension:
x
has 2
columny
has 1
columnThe common dimensions have now been determined, (3, 2)
. At this point, the actual “broadcasting” step is applied, which is the transformation of x
and y
to have the shape (3, 2)
. Then the addition is performed. This can be done manually with rray_broadcast()
, and the common dimensions can be determined with rray_dim_common()
.
dim <- rray_dim_common(x, y)
dim
#> [1] 3 2
x_broadcast <- rray_broadcast(x, dim)
x_broadcast
#> [,1] [,2]
#> [1,] 1 4
#> [2,] 2 5
#> [3,] 3 6
y_broadcast <- rray_broadcast(y, dim)
y_broadcast
#> <rray<dbl>[,2][3]>
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 1 1
#> [3,] 1 1
x_broadcast + y_broadcast
#> <rray<dbl>[,2][3]>
#> [,1] [,2]
#> [1,] 2 5
#> [2,] 3 6
#> [3,] 4 7
rray applies these rules automatically when either object is an rray
, as y
is.
To get this behavior with base R objects, the underlying engine that powers +
has been exposed: rray_add()
. If you are designing a package and don’t know what your inputs are going to be, this might be a good way to ensure you are manipulating arrays and rrays consistently.
If y
was a vector, not a matrix, what would have changed?
z <- rray(c(1L, 2L, 3L))
z
#> <rray<int>[3]>
#> [1] 1 2 3
x + z
#> <rray<int>[,2][3]>
#> [,1] [,2]
#> [1,] 2 5
#> [2,] 4 7
#> [3,] 6 9
This operation is still valid, and represents an important broadcasting idea because it allows lower dimensional objects (1D) to be combined with higher dimensional objects (2D) in a formalized way. To start, write out the dimensions:
(3, 2) | x
(3) | z
To find the common dimensions, Rule 1 is first applied because there are “implicit” dimensions. z
has the 1st dimension (the rows), but is missing the 2nd dimension (the columns). When this occurs, add 1
s to the right hand side until the dimensionalities match.
(3, 2) | x
(3, 1) | z
Now, apply the second rule regarding recycling:
1st dimension:
x
has 3
rows.z
has 3
rows.2nd dimension:
x
has 2
columnsy
has 1
columnThe common dimensions are then (3, 2)
, and again we can manually manipulate the objects to visualize what is happening. For z
, you might find it helpful to visualize the increase in dimensionality and the recycling of dimensions separately.
Broadcasting becomes more interesting when higher dimensional objects are used. Additionally, at this point only 1 dimension has been changing (the columns), and only 1 object at a time needed altering. This example will require both inputs to be changed.
a <- rray(1:3, c(1, 3))
a
#> <rray<int>[,3][1]>
#> [,1] [,2] [,3]
#> [1,] 1 2 3
b <- rray(1:8, c(4, 1, 2))
b
#> <rray<int>[,1,2][4]>
#> , , 1
#>
#> [,1]
#> [1,] 1
#> [2,] 2
#> [3,] 3
#> [4,] 4
#>
#> , , 2
#>
#> [,1]
#> [1,] 5
#> [2,] 6
#> [3,] 7
#> [4,] 8
Perhaps surprisingly, these inputs can be added together using broadcasting.
a + b
#> <rray<int>[,3,2][4]>
#> , , 1
#>
#> [,1] [,2] [,3]
#> [1,] 2 3 4
#> [2,] 3 4 5
#> [3,] 4 5 6
#> [4,] 5 6 7
#>
#> , , 2
#>
#> [,1] [,2] [,3]
#> [1,] 6 7 8
#> [2,] 7 8 9
#> [3,] 8 9 10
#> [4,] 9 10 11
To understand this, write out dimensions and follow the rules, as usual.
(1, 3) | a
(4, 1, 2) | b
Implicit dimensions are present for a
, so make them explicit with 1
s using the first rule.
(1, 3, 1) | a
(4, 1, 2) | b
And then apply the rest of the recycling rules:
1st dimension:
a
has 1
row.b
has 4
rows.2nd dimension:
a
has 3
columns.b
has 1
column.3rd dimension:
a
has 1
element in the 3rd dimension.b
has 2
elements in the 3rd dimension.So the common dimensions are (4, 3, 2)
. Generally, once you internalize the rules enough, it’s easier to write them compactly as:
(1, 3, ) | a
(4, 1, 2) | b
--------- | ------
(4, 3, 2) | result
Here are the incremental changes made to a
to go from (1, 3)
to (4, 3, 2)
.
a
#> <rray<int>[,3][1]>
#> [,1] [,2] [,3]
#> [1,] 1 2 3
rray_broadcast(a, c(1, 3, 1))
#> <rray<int>[,3,1][1]>
#> , , 1
#>
#> [,1] [,2] [,3]
#> [1,] 1 2 3
rray_broadcast(a, c(4, 3, 1))
#> <rray<int>[,3,1][4]>
#> , , 1
#>
#> [,1] [,2] [,3]
#> [1,] 1 2 3
#> [2,] 1 2 3
#> [3,] 1 2 3
#> [4,] 1 2 3
rray_broadcast(a, c(4, 3, 2))
#> <rray<int>[,3,2][4]>
#> , , 1
#>
#> [,1] [,2] [,3]
#> [1,] 1 2 3
#> [2,] 1 2 3
#> [3,] 1 2 3
#> [4,] 1 2 3
#>
#> , , 2
#>
#> [,1] [,2] [,3]
#> [1,] 1 2 3
#> [2,] 1 2 3
#> [3,] 1 2 3
#> [4,] 1 2 3
And here are the changes made to b
to go from (4, 1, 2)
to (4, 3, 2)
.
b
#> <rray<int>[,1,2][4]>
#> , , 1
#>
#> [,1]
#> [1,] 1
#> [2,] 2
#> [3,] 3
#> [4,] 4
#>
#> , , 2
#>
#> [,1]
#> [1,] 5
#> [2,] 6
#> [3,] 7
#> [4,] 8
rray_broadcast(b, c(4, 3, 2))
#> <rray<int>[,3,2][4]>
#> , , 1
#>
#> [,1] [,2] [,3]
#> [1,] 1 1 1
#> [2,] 2 2 2
#> [3,] 3 3 3
#> [4,] 4 4 4
#>
#> , , 2
#>
#> [,1] [,2] [,3]
#> [1,] 5 5 5
#> [2,] 6 6 6
#> [3,] 7 7 7
#> [4,] 8 8 8
Altogether now!
dim <- rray_dim_common(a, b)
a_broadcast <- rray_broadcast(a, dim)
b_broadcast <- rray_broadcast(b, dim)
a_broadcast + b_broadcast
#> <rray<int>[,3,2][4]>
#> , , 1
#>
#> [,1] [,2] [,3]
#> [1,] 2 3 4
#> [2,] 3 4 5
#> [3,] 4 5 6
#> [4,] 5 6 7
#>
#> , , 2
#>
#> [,1] [,2] [,3]
#> [1,] 6 7 8
#> [2,] 7 8 9
#> [3,] 8 9 10
#> [4,] 9 10 11
The only exception to the strict behavior of base R is when an array is combined with a 1D vector. You saw this in the very first example with x + 1
. “Scalar” operations work as one might expect:
x
#> [,1] [,2]
#> [1,] 1 4
#> [2,] 2 5
#> [3,] 3 6
x * 2
#> [,1] [,2]
#> [1,] 2 8
#> [2,] 4 10
#> [3,] 6 12
But when a vector and an array are combined, the vector is recycled to fit the dimensions of the array. This is very different from broadcasting.
vec_1 <- c(1, 2, 3)
x
#> [,1] [,2]
#> [1,] 1 4
#> [2,] 2 5
#> [3,] 3 6
x + vec_1
#> [,1] [,2]
#> [1,] 2 5
#> [2,] 4 7
#> [3,] 6 9
# Equivalent to:
x + rep(vec_1, 2)
#> [,1] [,2]
#> [1,] 2 5
#> [2,] 4 7
#> [3,] 6 9
This example seems intuitive and harmless, but even partial recycling is allowed, which can be quite confusing and dangerous.
This is confusingly identical to the following, where vec_2
is repeatedly used to construct something that can be added to x
.
recycled <- matrix(NA, 3, 2)
recycled[1, 1] <- vec_2[1]
recycled[2, 1] <- vec_2[2]
recycled[3, 1] <- vec_2[1]
recycled[1, 2] <- vec_2[2]
recycled[2, 2] <- vec_2[1]
recycled[3, 2] <- vec_2[2]
recycled
#> [,1] [,2]
#> [1,] 1 2
#> [2,] 2 1
#> [3,] 1 2
x + recycled
#> [,1] [,2]
#> [1,] 2 6
#> [2,] 4 6
#> [3,] 4 8
rray never performs partial recycling when it is going through the steps of broadcasting, and will error if this example is attempted.
The most important thing to remember here is that base R’s recycling is not broadcasting even if it looks like it at first glance with simple cases!
In the Python world, NumPy is used for broadcasting operations. The underlying C++ library used for rray is heavily based on the NumPy principles. However, one interesting thing to note is that when applying broadcasting rules for NumPy, implicit dimensions are prepended to the left hand side as 1
s. This is used consistently in the NumPy world, even down to how they are printed. However, R often uses the concept of appended implicit dimensions the right hand side, and prints with this concept in mind.
For a simple example of R doing this, try coercing a vector to a matrix:
The dimensions went from (3)
to (3, 1)
. NumPy actually does the opposite, and would convert (3)
to (1, 3)
. This is fine, because they use it consistently there, but can be confusing when comparing broadcasting of rray to NumPy, and is just something to keep in mind.