
Say I have a data frame which is a cross product of the sequence of 1 to 20 with itself:

a <- seq(1,20,1)
combis <- expand.grid(a,a)
colnames(combis) <- c("DaysBack","DaysForward")

So the data looks like:

DaysBack  DaysForward
1         1
2         1
...
19        20
20        20

I want to apply a function which takes the days back, and days forward, and returns several values, and then adds these as columns to this data frame. So my function would look something like:

## operation to apply on each row
do_something <- function(days_back, days_forward)
{
    # logic to work out some values
    ...
    # return those values
    c(value_1, value_2, value_3)
}

And then to add this to the original data frame, so "combis" should for example look like:

DaysBack  DaysForward  Value1  Value2  Value3
1         1            5       6       7
2         1            4       2       3
...
19        20           1       9       3
20        20           2       6       8

How do I do this and get back a data frame?

EDIT:

My do_something function currently operates on two values, days_back and days_forward. It uses these in the context of another dataframe called pod, which (for this example) looks something like:

Date          Price
2016-01-01    3.1
2016-01-02    3.33
...
2016-04-12    2.12

Now say I pass in days_back=1 and days_forward=2. For each row I find the price 1 day back and the price 2 days forward, and add the difference as a column called Diff. I do this by adding lead/lag columns as appropriate (I found shift code for this in "What's the opposite function to lag for an R vector/dataframe?"), so I'm not doing any looping. Once I have the differences per row, I calculate the mean and standard deviation of Diff and return these two values, i.e. for the combination days_back=1 and days_forward=2 I get one mean and one sd of the diff. I want this for all combinations of days_back and days_forward, each ranging from 1 to 20. In the example data I gave when I first asked the question, mean_diff would correspond to Value1 and sd_diff to Value2.

So to be clear, currently my do_something operates directly on two values and not on two sets of column vectors. I'm sure it can be re-written to operate on two vectors, but then again I have the same issue in that I don't know how to return this data so that in the end I get a data frame that looks like what I showed above as my target output.
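To make the description above concrete, here is a rough sketch of what such a scalar do_something could look like. The `pod` data below is made up, the `shift` helper is a hypothetical stand-in for the lead/lag code mentioned above, and defining Diff as "forward price minus back price" is an assumption, not necessarily the exact calculation.

```r
# Hypothetical stand-in for `pod`: one price per day (values are made up)
set.seed(1)
pod <- data.frame(
  Date  = seq(as.Date("2016-01-01"), by = "day", length.out = 100),
  Price = round(runif(100, 2, 4), 2)
)

# Hypothetical helper: shift a vector by n positions, padding with NA.
# n > 0 looks forward (lead), n < 0 looks back (lag).
shift <- function(x, n) {
  len <- length(x)
  if (n >= 0) {
    c(x[seq_len(len - n) + n], rep(NA, n))
  } else {
    c(rep(NA, -n), x[seq_len(len + n)])
  }
}

# Takes two single values and returns a named vector of two summaries
do_something <- function(days_back, days_forward, prices = pod$Price) {
  # Assumed definition of Diff: price days_forward ahead minus price days_back behind
  diff <- shift(prices, days_forward) - shift(prices, -days_back)
  c(mean_diff = mean(diff, na.rm = TRUE), sd_diff = sd(diff, na.rm = TRUE))
}

do_something(1, 2)
```

Rows near the start and end of the series have no price to look back or forward to, so they come out as NA and are dropped by `na.rm = TRUE`.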

Thanks

user555265

1 Answer


Something like this:

# data
d <- matrix(1,3,2)
# function
foo <- function(x,y) {
  m <- cbind(a=x+1,b=y+2) # calculations
  m # return
} 
# execute the function
res <- foo(d[,1],d[,2])    
# add results to data.frame/matrix
cbind(d,res)

Edit: As you asked in the comments I use your data:

a <- seq(1,20,1)
combis <- expand.grid(a,a)
colnames(combis) <- c("DaysBack","DaysForward")
# function
do_something <- function(x,y) cbind(a=x+1,b=y+2) 
# results
m <- cbind(combis,do_something(combis$DaysBack,combis$DaysForward))
head(m)
DaysBack DaysForward a b
1        1           2 3
2        1           3 3
3        1           4 3
4        1           5 3
5        1           6 3
6        1           7 3
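
If the function really has to take one pair of values at a time, one possible approach is to call it once per row with `mapply` and bind the transposed result. This is only a sketch: the body of `do_something` below is a hypothetical placeholder, and the names `Value1` and `Value2` are taken from the target output in the question.

```r
combis <- expand.grid(DaysBack = 1:20, DaysForward = 1:20)

# Hypothetical placeholder: takes two single numbers, returns a named vector
do_something <- function(days_back, days_forward) {
  c(Value1 = days_back + days_forward, Value2 = days_back * days_forward)
}

# mapply calls do_something once per row; each call becomes one column,
# so the result is a 2 x 400 matrix with row names Value1 and Value2
vals <- mapply(do_something, combis$DaysBack, combis$DaysForward)

# Transpose so that each row lines up with a row of combis, then bind
result <- cbind(combis, t(vals))
head(result)
```

Because one argument to `cbind` is a data frame, the result is a data frame too, with the returned names as column names.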
Roman
  • You could skip the `m <- ...` and `return` bits and just have `cbind(a=x+1,b=y+2)` as the body of `foo()`. – thelatemail Apr 13 '16 at 10:02
  • With reference to my example, would the line be `m <- cbind(do_something(x,y))`? Bear in mind that in my do_something function days_back and days_forward are individual values rather than vectors (whereas in your `foo` x and y are the column vectors). – user555265 Apr 13 '16 at 10:13
  • @thelatemail Perfectly right, I only wanted to show that it is possible to save and return values within a function. – Roman Apr 13 '16 at 11:05
  • @user555265 Why shouldn't `days_back` and `days_forward` be vectors? They are, and you can calculate with them. See my edits. – Roman Apr 13 '16 at 11:09
  • @Jimbou The way that I use `days_back` and `days_forward` in `do_something` is that I have another data frame which we'll call `pod`. For each row in `pod` I look `days_back` rows back and `days_forward` rows forward, and calculate a difference which we'll call `diff`. I then take the mean and sd of this `diff` and return those. If I pass in a vector then I need some loop over the vector to do this operation, and then it effectively comes back to my original question, because I don't know how to get this loop operation to return the data in the appropriate form. – user555265 Apr 13 '16 at 12:20
  • @user555265 OK, this is too confusing for me. Please edit your question regarding your needs. Despite what you mentioned, it should be no problem doing such an analysis with a further data.frame or vector. – Roman Apr 13 '16 at 12:41
  • @Jimbou I have added an edit to my original question, hope that it makes it clearer – user555265 Apr 13 '16 at 12:58
  • I had to use `list` instead of `cbind` in the `foo` function to avoid losing the factors. Otherwise you will end up with just meaningless integers. I am talking about data frames here and not matrices. – Elmex80s Mar 18 '20 at 17:22
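
The point in the last comment can be illustrated with a small sketch. The data and the two function names below are hypothetical; the only difference between them is whether the results are combined with `cbind` (which builds a matrix and reduces a factor to its integer codes) or with `list`.

```r
# Hypothetical data with a factor column
d <- data.frame(grp = factor(c("low", "high", "low")), n = c(1, 2, 3))

# cbind on plain vectors builds a matrix, so the factor becomes integer codes
foo_cbind <- function(x, y) cbind(a = x, b = y + 2)

# list keeps each element's own type
foo_list <- function(x, y) list(a = x, b = y + 2)

res_m <- cbind(d, foo_cbind(d$grp, d$n))
res_l <- cbind(d, foo_list(d$grp, d$n))

is.factor(res_m$a)  # factor information is gone here
is.factor(res_l$a)  # factor is preserved here
```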