coalesce fails when argument is a 1d-array #5557

Closed
FrancoisGuillem opened this issue Oct 13, 2020 · 2 comments
Labels
feature a feature request or enhancement funs 😆

FrancoisGuillem commented Oct 13, 2020

I have a table with a column that was created by Python code through the reticulate package. When I try to use coalesce on that column with the latest version of dplyr, I get the following error:

Can't coalesce matrices or arrays.

With a previous version of dplyr, I didn't get that error. After some investigation, I found that the error occurs because the column is of class array and has the "dim" attribute. It fails because of the following test in the coalesce code:

if (!is_null(attr(x, "dim"))) {
  abort("Can't coalesce matrices or arrays.")
}

Code to reproduce the issue:

library(dplyr)

x <- array(1:10)
coalesce(x, 0) # Fails: "Can't coalesce matrices or arrays."

# If we remove the "dim" attribute, it works
attr(x, "dim") <- NULL
coalesce(x, 0)

@courtiol (Contributor)

@FrancoisGuillem, since {dplyr} v1.0, the package uses {vctrs} internally (a package that handles operations on vectors).
The consequence is that {dplyr} has become stricter about variable classes and types.
Here, attr(x, "dim") <- NULL turns the array into a vector, and since {vctrs} can work with vectors, the call succeeds.
If you don't turn your input into a vector, it does not work because {dplyr} is not designed to work with arrays (and I don't see why it should).
In your case, you should simply do:

coalesce(as.numeric(x), 0)
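
For reference, here is a minimal sketch of a few base R ways to drop the "dim" attribute before passing a value to coalesce(). It uses base R functions only; the variable names v1, v2 and v3 are illustrative and do not come from the thread.

```r
# A 1-D array: an integer vector carrying a "dim" attribute
x <- array(1:10)

# Three ways to get a plain vector back (base R only)
v1 <- as.vector(x)    # drops "dim", keeps the integer type
v2 <- as.numeric(x)   # drops "dim", converts to double
v3 <- x
dim(v3) <- NULL       # removes "dim" from the copy v3

is.null(dim(v1))      # TRUE
typeof(v1)            # "integer"
typeof(v2)            # "double"
```

as.vector() may be the better fit when the column should keep its integer type, whereas as.numeric() also changes the type to double.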

FrancoisGuillem commented Oct 19, 2020

Hello @courtiol,

The code of coalesce is:

function (...) 
{
   if (missing(..1)) {
       abort("At least one argument must be supplied.")
   }
   values <- list2(...)
   values <- vec_cast_common(!!!values)
   values <- vec_recycle_common(!!!values)
   x <- values[[1]]
   values <- values[-1]
   if (!is_null(attr(x, "dim"))) {
       abort("Can't coalesce matrices or arrays.")
   }
   if (is.data.frame(x)) {
       df_coalesce(x, values)
   }
   else {
       vec_coalesce(x, values)
   }
}

If I skip the failing test and run the vec_coalesce(...) line directly, it works:

x <- array(1:10)
values <- list(rep(0, 10))
dplyr:::vec_coalesce(x, values)
## [1]  1  2  3  4  5  6  7  8  9 10

A 1D array is the same thing as a vector except that it has the "dim" attribute, so I still think coalesce should work in this specific case. I think that checking whether there is more than one dimension should solve the issue:

if (!is_null(attr(x, "dim")) && length(attr(x, "dim")) > 1) {
  abort("Can't coalesce matrices or arrays.")
}
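
The proposed condition can be tried outside of dplyr. The sketch below wraps it in a hypothetical helper, has_multiple_dims(), which is not part of dplyr and is named here only for illustration. It reads the attribute with attr() as the original test does, rather than calling dim(), because dim() on a data frame reports rows and columns and would wrongly reject data frames, which coalesce handles separately.

```r
# Hypothetical helper (not part of dplyr): TRUE only when the object
# carries a "dim" attribute describing two or more dimensions.
has_multiple_dims <- function(x) {
  # attr() returns NULL when there is no "dim" attribute, and
  # length(NULL) is 0, so no separate NULL check is needed
  length(attr(x, "dim")) > 1
}

has_multiple_dims(1:10)                      # FALSE: plain vector
has_multiple_dims(array(1:10))               # FALSE: 1-D array
has_multiple_dims(matrix(1:6, nrow = 2))     # TRUE: matrix
has_multiple_dims(array(1:24, c(2, 3, 4)))   # TRUE: 3-D array
has_multiple_dims(data.frame(a = 1:3))       # FALSE: no "dim" attribute
```

Since length(NULL) is 0, the single length comparison gives the same result as the two-part condition above.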

More generally, from a "philosophical" point of view, arrays and matrices are simply vectors with a "dim" attribute, so I don't see why this kind of function should not work on them. Many R functions that accept vectors also accept arrays and matrices.
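
As a small base R illustration of that point (it says nothing about dplyr's behavior), the sketch below shows that a matrix stores its data as an ordinary vector plus a "dim" attribute, and that an element-wise base function such as ifelse() fills the missing values while keeping the shape. The names m and filled are illustrative.

```r
m <- matrix(c(1, NA, 3, NA), nrow = 2)

# The underlying data is an ordinary double vector plus a "dim" attribute
names(attributes(m))   # "dim"
as.vector(m)           # 1 NA 3 NA

# ifelse() works element-wise and takes its shape from the test,
# here the logical matrix is.na(m)
filled <- ifelse(is.na(m), 0, m)
dim(filled)            # 2 2
```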

@hadley hadley added feature a feature request or enhancement funs 😆 labels Nov 16, 2020
dlindelof added a commit to dlindelof/dplyr that referenced this issue Jun 18, 2021
coalesce() will work with arrays provided their dimension is not greater than 1.
romainfrancois pushed a commit that referenced this issue Jun 29, 2021
* coalesce() support 1-D arrays (#5557)

coalesce() will work with arrays provided their dimension is not greater than 1.

* Update NEWS.md