I'd like to create a function called g that contains other functions, among them f1 and f2. Each of f1 and f2 returns a data frame, and I would like g to return the two data frames obtained from f1 and f2. Here is the code that I ran:

g <- function(n,a,b,c,d,e) {

    f1 <- function(n,a,b,c,d,e) {
        X <- a*matrix(sample(0:1,n,replace = T),nrow=n,ncol=1)
        Y <- (b*c-d)*matrix(sample(1:10,n,replace = T),nrow=n,ncol=1)
        Z <- (a*e)*matrix(sample(0:12,n,replace = T),nrow=n,ncol=1)
        data1 <- as.data.frame(cbind(X,Y,Z))
        colnames(data1) <- c("X","Y","Z")
        return(data1)
    }
    f1(n,a,b,c,d,e)

    varpredict <- lm(Y ~ 0 + X + Z, data=f1(n,a,b,c,d,e))$fitted.values

    h <- function(){
        olsreg <- lm(Y ~ 0 + X + Z, data=f1(n,a,b,c,d,e))
        P <-  olsreg$residuals^2
        return(P)
    }

    h()

    G <- rep(0,n)
    f2 <- function(n,a,b){
        for (i in 1:n) {
          G[i] <- varpredict[i]-a
        }
        X <- matrix(sample(0:1,n,replace = T),nrow=n,ncol=1)+h()
        Y <- b*matrix(sample(1:10,n,replace = T),nrow=n,ncol=1)
        Z <- (a*b)*matrix(sample(0:12,n,replace = T),nrow=n,ncol=1)-G
        data2 <- as.data.frame(cbind(X,Y,Z))
        colnames(data2) <- c("X","Y","Z")
        return(data2)
    }
    f2(n,a,b)       

    return(list(data1,data2))
}

To run the function g I did this:

n=100
a=0.3
b=0.5
c=0.3
d=-1.32
e=c*d

my_function <- g(n,a,b,c,d,e)

But I received the following error message:

Error in g(n, a, b, c, d, e) : object 'data1' not found

Why am I getting this error?

Brydenr

1 Answer

First off, when you call your functions f1 and f2 you need to store their return value somewhere.
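
As a minimal sketch of that first point (the names make_df, local_df, wrapper_discarding and wrapper_storing are made up for illustration and are not part of your code): a variable created inside a function disappears when that function returns, so the caller only gets the value if it assigns the result to a name of its own.

```r
# Hypothetical helper: builds a data frame in a local variable and returns it.
make_df <- function () {
    local_df <- data.frame(X = 1 : 3)   # exists only while make_df is running
    local_df
}

# Calls make_df but does not store the result, as in the original g.
wrapper_discarding <- function () {
    make_df()                               # the returned value is thrown away
    exists("local_df", inherits = FALSE)    # is `local_df` defined in this function?
}

# Calls make_df and keeps the result under a name of its own.
wrapper_storing <- function () {
    stored <- make_df()
    stored
}

wrapper_discarding()   # FALSE: the name never reaches the caller
wrapper_storing()      # the data frame
```

This is the same situation as `return(list(data1, data2))` in your g: data1 only ever existed inside f1.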

Secondly, it’s unclear why you want f1 inside g: it doesn’t seem to share any state with g, so it can be defined independently alongside g instead of inside it. That said, if it’s only ever used inside g then this is to some extent a matter of style.
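
To show what I mean by sharing state, here is a small sketch with hypothetical names (outer_fn, uses_state, standalone, shift). A function defined inside another one can read the outer function's variables, and they are looked up when the inner function is called, not when it is defined. A function that takes everything it needs as arguments gains nothing from being nested.

```r
outer_fn <- function (offset) {
    # Nested on purpose: reads `shift`, which belongs to outer_fn.
    uses_state <- function (x) x + shift

    # Assigned after uses_state is defined but before it is called,
    # which is fine because the lookup happens at call time.
    shift <- offset * 2
    uses_state(1)
}

# Receives everything as arguments, so it can live outside outer_fn.
standalone <- function (x, shift) x + shift

outer_fn(5)          # 11
standalone(1, 10)    # 11
```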

Here’s how I’d write your code:

g <- function (n, a, b, c, d, e) {
    f1 <- function (n, a, b, c, d, e) {
        X <- a * matrix(sample(0 : 1, n, replace = TRUE), nrow = n)
        Y <- (b * c - d) * matrix(sample(1 : 10, n, replace = TRUE), nrow = n)
        Z <- (a * e) * matrix(sample(0 : 12, n, replace = TRUE), nrow = n)
        data.frame(X, Y, Z)
    }

    f2 <- function(n, a, b) {
        G <- varpredict - a
        X <- matrix(sample(0 : 1, n, replace = TRUE), nrow = n) + square_res
        Y <- b * matrix(sample(1 : 10, n, replace = TRUE), nrow = n)
        Z <- (a * b) * matrix(sample(0 : 12, n, replace = TRUE), nrow = n) - G
        data.frame(X, Y, Z)
    }

    model <- lm(Y ~ 0 + X + Z, data = f1(n, a, b, c, d, e))
    varpredict <- model$fitted.values
    square_res <- model$residuals ^ 2
    list(f1(n, a, b, c, d, e), f2(n, a, b))
}

I’ve cleaned up the code a bit, and I hope you’ll agree that it is more readable this way. It is also more efficient, since it fits the initial model once instead of refitting it every time the fitted values or residuals are needed. Apart from that I’ve followed a few simple rules: use consistent spacing; don’t use gratuitous abbreviations (write TRUE, not T); use the appropriate functions to avoid redundant code (e.g. data.frame creates and names the data frame in one step); and don’t use return where it isn’t needed.
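
As a quick check of the data frame point, this sketch builds the same three columns both ways. The names long_form and short_form are only for illustration, and n is kept small so the result is easy to inspect.

```r
n <- 5
X <- matrix(sample(0 : 1, n, replace = TRUE), nrow = n)
Y <- matrix(sample(1 : 10, n, replace = TRUE), nrow = n)
Z <- matrix(sample(0 : 12, n, replace = TRUE), nrow = n)

# Original approach: bind the columns, convert, then name them by hand.
long_form <- as.data.frame(cbind(X, Y, Z))
colnames(long_form) <- c("X", "Y", "Z")

# Shorter approach: data.frame takes the column names from the variable names.
short_form <- data.frame(X, Y, Z)

names(short_form)                                    # "X" "Y" "Z"
all(as.matrix(long_form) == as.matrix(short_form))   # TRUE
```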

But fundamentally I still have no good idea what it does since the variable names don’t provide any useful information. The most impactful improvement for readability is therefore the choice of better variable names.

Konrad Rudolph
  • Thank you so much Konrad for your suggestion. Your code is perfect. I just have one more question. In my code I need to name the columns of each of my data frames. Do you know how I can do that with your code? – Antoine Dedewanou Mar 05 '20 at 23:32
  • @AntoineDedewanou The columns in the code will have automatic names `X`, `Y` and `Z`. If you want to name them differently, use named arguments: `data.frame(your = X, names = Y, here = Z)`. – Konrad Rudolph Mar 06 '20 at 09:25
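
A short sketch of the named-argument form from the comment above. The column names treatment, outcome and covariate are placeholders chosen for illustration, not names from the original code.

```r
X <- c(0, 1, 1)
Y <- c(4, 7, 2)
Z <- c(10, 3, 12)

# Each argument name becomes the corresponding column name.
renamed <- data.frame(treatment = X, outcome = Y, covariate = Z)

names(renamed)   # "treatment" "outcome" "covariate"
```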