  • rray_unique(): the unique values.

  • rray_unique_loc(): the locations of the unique values.

  • rray_unique_count(): the number of unique values.

rray_unique(x, axis)

rray_unique_loc(x, axis)

rray_unique_count(x, axis)

Arguments

x

A vector, matrix, array, or rray.

axis

A single integer. The axis to index x by.

Value

  • rray_unique(): an array of the same type as x containing only unique values. The dimensions of the return value are the same as those of x except along axis, which may be smaller than the original dimension size if any duplicate entries were removed.

  • rray_unique_loc(): an integer vector, giving locations of the unique values.

  • rray_unique_count(): an integer vector of length 1, giving the number of unique values.

Details

The family of unique functions works in the following manner:

  1. x is split into pieces using the axis as the dimension to index along.

  2. Each of those pieces is flattened to 1D.

  3. The uniqueness test is done between those flattened pieces and the final output is restored from that result.

As an example, if x has dimensions of (2, 3, 2) and axis = 2, then you can think of x as being broken into x[, 1], x[, 2] and x[, 3]. Each of those three pieces is then flattened, and a vctrs unique function is called on the list of those flattened inputs.
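The three steps above can be sketched in base R. This is only an illustration of the idea on a plain array, not rray's implementation: unique_loc_sketch() is a hypothetical helper, and base duplicated() stands in for the vctrs function mentioned above.

```r
# Hypothetical helper (not part of rray): locations of the unique pieces of a
# base R array along `axis`.
unique_loc_sketch <- function(x, axis) {
  d <- dim(x)
  # Steps 1 and 2: move `axis` to the front and flatten all other dimensions,
  # so that row i of `flat` is the flattened piece for index i along `axis`.
  perm <- c(axis, seq_along(d)[-axis])
  flat <- matrix(aperm(x, perm), nrow = d[axis])
  pieces <- lapply(seq_len(nrow(flat)), function(i) flat[i, ])
  # Step 3: compare the flattened pieces and keep the first of each.
  which(!duplicated(pieces))
}

# A (2, 3, 2) array whose third column slice repeats the first.
x <- array(0, c(2, 3, 2))
x[, 1, ] <- c(1, 2, 3, 4)
x[, 2, ] <- c(5, 6, 7, 8)
x[, 3, ] <- x[, 1, ]

loc <- unique_loc_sketch(x, 2)

# Restore the result by subsetting along the axis; only axis 2 shrinks.
x_unique <- x[, loc, , drop = FALSE]
dim(x_unique)
```

Here the third slice along axis 2 is dropped, so the result keeps the original size on axes 1 and 3 and has size 2 on axis 2.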

The result of calling rray_unique() always has the same dimensions as x, except along axis, which may be smaller than the original axis size if any duplicate entries are removed.

Unlike the duplicate functions, the unique functions take only a single axis argument, rather than axes. The reason for this is that if the unique functions were defined in any other way, they would allow for ragged arrays, which are not defined in rray.
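One way to picture the ragged-array problem, as a base R sketch under the assumption that "any other way" would mean taking unique values independently within each slice: different slices can keep different numbers of values, so the results no longer fit in a rectangular array.

```r
# A 3 x 2 matrix: column 1 has 2 distinct values, column 2 has 3.
m <- matrix(c(1, 1, 2,
              1, 2, 3), ncol = 2)

# Taking unique values column by column gives pieces of unequal length,
# which cannot be bound back into a regular matrix.
per_column <- lapply(seq_len(ncol(m)), function(j) unique(m[, j]))
lengths(per_column)
```

Comparing whole slices along a single axis avoids this, because every slice that is kept has the same shape.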

When duplicates are detected, the first occurrence is the one kept in the result.
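The same first-occurrence rule can be seen with base R's duplicated() on a small named matrix. This is an analogy in base R, not a call into rray; the names m and first_kept are illustrative only.

```r
# Rows r1 and r2 are identical; r3 is different.
m <- matrix(c(1, 2,
              1, 2,
              3, 4),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("r1", "r2", "r3"), NULL))

# duplicated() flags later repeats, so negating it keeps first occurrences.
first_kept <- m[!duplicated(m), , drop = FALSE]
rownames(first_kept)
```

The surviving row carries the name of the first occurrence (r1, not r2), which matches the r1 label kept in the rray_unique() output in the Examples section.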

See also

rray_duplicate_any() for functions that work with the dual of unique values: duplicated values.

vctrs::vec_unique() for functions that detect unique values among any type of vector object.

Examples

x_dup_rows <- rray(c(1, 1, 3, 3, 2, 2, 4, 4), c(2, 2, 2))
x_dup_rows <- rray_set_row_names(x_dup_rows, c("r1", "r2"))
x_dup_rows <- rray_set_col_names(x_dup_rows, c("c1", "c2"))

# Duplicate rows
# `x_dup_rows[1] == x_dup_rows[2]`
rray_unique(x_dup_rows, 1)
#> <rray<dbl>[,2,2][1]>
#> , , 1
#>
#>    c1 c2
#> r1  1  3
#>
#> , , 2
#>
#>    c1 c2
#> r1  2  4
#>

# Duplicate cols
# `x_dup_cols[, 1] == x_dup_cols[, 2]`
x_dup_cols <- rray_transpose(x_dup_rows, c(2, 1, 3))
rray_unique(x_dup_cols, 2)
#> <rray<dbl>[,1,2][2]>
#> , , 1
#>
#>    r1
#> c1  1
#> c2  3
#>
#> , , 2
#>
#>    r1
#> c1  2
#> c2  4
#>

# Duplicate 3rd dim
# `x_dup_layers[, , 1] == x_dup_layers[, , 2]`
x_dup_layers <- rray_transpose(x_dup_rows, c(2, 3, 1))
rray_unique(x_dup_layers, 3)
#> <rray<dbl>[,2,1][2]>
#> , , r1
#>
#>    [,1] [,2]
#> c1    1    2
#> c2    3    4
#>

# rray_unique_loc() returns an
# integer vector you can use
# to subset out the unique values along
# the axis you are interested in
x_dup_cols[, rray_unique_loc(x_dup_cols, 2L)]
#> <rray<dbl>[,1,2][2]>
#> , , 1
#>
#>    r1
#> c1  1
#> c2  3
#>
#> , , 2
#>
#>    r1
#> c1  2
#> c2  4
#>

# Only 1 unique column
rray_unique_count(x_dup_cols, 2L)
#> [1] 1

# But 2 unique rows
rray_unique_count(x_dup_cols, 1L)
#> [1] 2