
I have a dataframe with some numeric, some integer and some factor columns. I'm trying to transform the dataframe to square just the numeric columns but the solutions in this thread aren't working in this use case:

square <- function(x){return(x^2)}
Numerics <- function(df){return(Filter(is.numeric,df))}
SquareD <- function(df){
  Numerics(df) <- apply(Numerics(df),2,square)
  return(df)
}

Now, on trying to run SquareD(iris), I get ' Error in Numerics(df) <- apply(Numerics(df), 2, square) : could not find function "Numerics<-" '.

How can I get this to work? Something like

iris[sapply(iris,is.numeric)] <- apply(iris[sapply(iris,is.numeric)],2,square)

does actually work but it's long and clunky. I would much rather have something short (wrapped in one function) that I could reuse instead. But

Numerics <- function(df){return(df[sapply(df,is.numeric)])}
SquareD <- function(df){
  Numerics(df) <- apply(Numerics(df),2,square)
  return(df)
}

still doesn't work. Stuff that doesn't use a newly defined function, but is still somewhat short, like

SquareD <- function(df){
  Filter(is.numeric,df) <- apply(Filter(is.numeric,df),2,square)
  return(df)
}

doesn't work either (nor does, e.g., dplyr::select_if(df,is.numeric) in place of Filter(is.numeric,df) above).

Note: I want to do this as shown above, i.e. with a method that would work for both replacement and selection (hence why I'm trying using the methods for selection suggested in that thread) and is short without having to rewrite somewhat lengthy code (like with the sapply). For example, I might want to replace the numerical columns of ANOTHER dataset with squared values from numerical columns of iris. That kind of application.

I know that for replacement alone I could use dplyr::mutate_if, but I don't want that. Rather, I'm looking to understand why the selection methods don't work here and how one can adapt them so that they do. I also want to do it in one line (or with a predefined function that is finally executed in one line) as above. Finally, no libraries other than dplyr, please.

Mobeus Zoom

2 Answers


If you want to do this in base R, you can use:

SquareD <- function(df){
   cols <- sapply(df, is.numeric)
   df[cols] <- lapply(df[cols], square)
   return(df)
}

SquareD(iris)

#  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#1        26.01       12.25         1.96        0.04  setosa
#2        24.01        9.00         1.96        0.04  setosa
#3        22.09       10.24         1.69        0.04  setosa
#4        21.16        9.61         2.25        0.04  setosa
#5        25.00       12.96         1.96        0.04  setosa
#6        29.16       15.21         2.89        0.16  setosa
#....

In dplyr,

library(dplyr)
iris %>% mutate(across(where(is.numeric), square))

Or, in older versions of dplyr (before 1.0.0, which introduced across()):

iris %>% mutate_if(is.numeric, square)
Ronak Shah
  • As I said, I want to do it in one line, or with a function executable in one line (so not your base solution), and with a method that works for selection/subsetting as well as replacement (so not your dplyr). Or to understand why this isn't possible. What I'm looking for is basically a replacement for writing `iris[,1:4]` which would work for any numeric columns. That's why I don't understand why `Filter(is.numeric,df) <-` doesn't do it, for instance – Mobeus Zoom Jun 06 '20 at 09:02
  • @MobeusZoom Sorry, I don't understand. Why not `dplyr`? `Filter(is.numeric,df) <-` will not work because there is no function called `Filter<-`. – Ronak Shah Jun 06 '20 at 09:04
  • `Filter` is a function; `Filter<-` is not. The two are different things. – Ronak Shah Jun 06 '20 at 09:05
  • So why doesn't it work? I thought `Filter` would operate on dataframe `df` and select only the numeric columns; then these would be replaced by whatever comes out of the right-hand side. But it doesn't happen like that. Whereas if you wrote `iris[,1:4] <-`, of course that would work – Mobeus Zoom Jun 06 '20 at 09:06
  • Because there is no `<-` method defined for `Filter`. You cannot use something which is not defined. `iris[,1:4] <-` works because `[<-` is defined. Check ``?`[<-` `` – Ronak Shah Jun 06 '20 at 09:07
  • Ok thanks. Cheers for explaining. I already knew about mutate_if. Looking for a way of doing it which reapplies a selection method, to this task. – Mobeus Zoom Jun 06 '20 at 09:11
  • For example, I might want to replace the numerical columns of ANOTHER dataset with squared values from numerical columns of iris. That kind of application – Mobeus Zoom Jun 06 '20 at 09:16
  • @MobeusZoom I don't understand what you mean by reapplies a selection method? If you do `iris <- iris %>% mutate_if(is.numeric, square)` it updates the data or use `iris %<>% mutate_if(is.numeric, square)` if you don't want to assign it back. Please update your post with a specific example and show expected output for it so that is easy to understand. – Ronak Shah Jun 06 '20 at 09:19
  • I've updated the post with this potential application: I might want to replace the numerical columns of ANOTHER dataset (identical dimensions) with squared values from numerical columns of iris – Mobeus Zoom Jun 06 '20 at 09:21
  • Please add an example with expected output. – Ronak Shah Jun 06 '20 at 09:22
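
The point made in the comments above (that `Filter(is.numeric, df) <- ...` fails because no `Filter<-` function exists) suggests one possible way to get the syntax the question asks for: define a replacement function yourself. The following is only a minimal sketch in base R, not something proposed by either answerer; the name `Numerics<-` and its body are assumptions made for illustration. R rewrites `Numerics(df) <- value` into a call to `` `Numerics<-`(df, value = value) `` and assigns the result back to `df`, which is why the last argument has to be called `value`.

```r
square <- function(x) x^2

# Getter: keep only the numeric columns (same idea as Numerics() in the question).
Numerics <- function(df) df[vapply(df, is.numeric, logical(1))]

# Hypothetical replacement function (not part of base R or dplyr).
# `Numerics(df) <- value` is evaluated as df <- `Numerics<-`(df, value = value).
`Numerics<-` <- function(df, value) {
  cols <- vapply(df, is.numeric, logical(1))
  df[cols] <- value   # delegates to `[<-` for data frames
  df
}

SquareD <- function(df) {
  # lapply() keeps each column as its own vector; apply() would go through a matrix.
  Numerics(df) <- lapply(Numerics(df), square)
  df
}

head(SquareD(iris))
```

With both functions defined, the same one-liner also covers the "another dataset" case mentioned in the question, for example `other <- iris; Numerics(other) <- lapply(Numerics(iris), square)`, assuming `other` has the same number of numeric columns and rows as `iris`.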

Applied to a data.frame:

# Note: the result contains only the squared numeric columns (Species is dropped).
iris_sqrd <- data.frame(Map(function(x) x^2, iris[sapply(iris, is.numeric)]))

As a function:

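# Function: squares the numeric columns and returns only those columns.
# df[cols] (no comma) keeps a data frame even when there is a single numeric column.
square_df <- function(df){data.frame(Map(function(x) x^2, df[sapply(df, is.numeric)]))}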
# Application:
iris_sqrd <- square_df(iris)
hello_friend