
Let me start with a synthetic data set that shows the issue:

Do <- rep(c(0,2,4,6,8,10,15,20,30,40,45,50,55,60,65,70,80,85,90,92,94,96,98,100), each=16,times=16)
Cl <- rep(c("K", "Y","M","C"), each= 384, times=4)
In <- rep(c("A", "S"), each=3072)
Sa <- rep(c(1,2), each=1536, times=2)
Data <- rnorm(6144)
DataFrame <- cbind.data.frame(Do,Cl,In,Sa,Data); head(DataFrame)
rm(Do,Cl,In,Sa,Data)
attach(DataFrame)

Next, I split the 'DataFrame' object into multiple lists to avoid unpredictable recycling. Basically, I place each data subset in its own list so that recycling is predictable; this produced the correct output in my simulator.

DFSplit <- split(DataFrame[ , "Data"], list(Do, Cl, In, Sa))

The 'DFSplit' object has 384 lists:

length(names(DFSplit))
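
As a sanity check on the split, here is a minimal sketch that rebuilds the same design without attach() and confirms the group count and group size. The seed, the use of data.frame(), and passing the grouping columns to split() as a data frame are assumptions made for this illustration, not part of the code above.

```r
# Rebuild the synthetic design (seed is an assumption, added for reproducibility)
set.seed(42)
Do <- rep(c(0,2,4,6,8,10,15,20,30,40,45,50,55,60,65,70,80,85,90,92,94,96,98,100),
          each = 16, times = 16)
Cl <- rep(c("K", "Y", "M", "C"), each = 384, times = 4)
In <- rep(c("A", "S"), each = 3072)
Sa <- rep(c(1, 2), each = 1536, times = 2)
DataFrame <- data.frame(Do, Cl, In, Sa, Data = rnorm(6144),
                        stringsAsFactors = FALSE)

# Split the Data column by every combination of the four grouping columns
DFSplit <- split(DataFrame$Data, DataFrame[c("Do", "Cl", "In", "Sa")])

length(DFSplit)           # number of groups
table(lengths(DFSplit))   # how many groups have each size
head(names(DFSplit))      # names look like "<Do>.<Cl>.<In>.<Sa>"
```

If every combination is present, the split should contain 24 x 4 x 2 x 2 groups of 16 values each.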

Then I created the function 'ids' to identify the lists' names:

ids <- function(Do, Cl, In, Sa){
    grep( paste( "^" , Do, "\\.",
                Cl, "\\.",
                In,
                "\\.", Sa,sep=""),
         names(DFSplit), value = TRUE)}

mapply(ids, Do, Cl, In, Sa, SIMPLIFY = FALSE)

I understand that each of the arguments to 'ids' has length 6144. mapply returns 6144 elements: the 384 list names, each repeated 16 times. How can I change the ids function so that mapply doesn't repeat the same name 16 times? As an ugly and costly workaround I used unique; I need a better, more fundamental solution.

unique(mapply(ids, Do, Cl, In, Sa, SIMPLIFY = FALSE))

I also created a function to operate on the 'DFSplit' lists. It has the same issue as the previous function. The thing is, it accepts the previous function as an input.

dG <- function(Do,Cl, In, Sa){
    dg <- 100*
                (1-10^-( DFSplit[[ids(Do,  Cl, In, Sa)]] - DFSplit[[ids(0, Cl, In, Sa)]])) /
                (1-10^-( DFSplit[[ids(100, Cl, In, Sa)]] - DFSplit[[ids(0, Cl, In, Sa)]])) - Do
    dg}

mapply(dG, Do, Cl, In, Sa, SIMPLIFY = FALSE)

What I am trying to do, so far unsuccessfully, is to apply the dG function inside each of the 384 lists. I acknowledge that the dG function also needs to be modified, and I don't know how. I want the input to the dG function to be the names of the 384 lists, each containing 16 numbers. I want the output to be 384 lists with dG applied.

Please feel free to suggest a different solution altogether. The important thing is that I need to apply the 'dG' function to the data set.

Ragy Isaac
  • What is it that you want to accomplish? Your solution looks rather complex... – Paul Hiemstra Mar 16 '13 at 13:08
  • Hi Paul, I have been working on this for a week without success. I am trying to apply the 'dG' function to my data:'DataFrame'. Unfortunately I have not been successful since the arguments' lengths differ and recycling produced incorrect calculation. So I separated the file into lists, identified the name of each list and then applied the function dG. – Ragy Isaac Mar 16 '13 at 13:18
  • I can't really figure out what you're actually trying to accomplish from this code. But I'm surprised that you're surprised that your result is of length 6144. Each argument that you pass to the function `dG` is of length 6144. – joran Mar 16 '13 at 13:32
  • Hi joran: Let me break down the first part of the function 'dG' piece by piece using only one list name: ids(2, "C", "A", 1). This produces a vector with sixteen members. If I use all 384 list names, I should get 384 lists, each with a vector containing 16 members. Yes, my data set contains 6144 values, but I am using the 384 list names. – Ragy Isaac Mar 16 '13 at 14:03
  • I can't vote yet, I need a reputation of at least 15 points – Ragy Isaac Mar 16 '13 at 14:21
  • Again, please describe what it is you try to do, and this looks like a job for ddply in the plyr package. And in general, if variables belong together, put them into columns of a data.frame instead of separate vectors. – Paul Hiemstra Mar 16 '13 at 14:53
  • "but I am using the 384 list names": No, you are not. There is nothing in DFSplit named Do, for instance. You attached DataFrame, so any reference to Do will find the 6144 length variable, as with the others. Hence your mapply call returns something of that length, as it should. – joran Mar 16 '13 at 15:09
  • Some general advice - stop using `attach()`, the characters you save on typing are not as important as the clarity you are losing in what you're trying to do. – Chase Mar 16 '13 at 18:03
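
The point made in the comments about `attach()` can be illustrated with a tiny sketch. The toy data frame `df` and its values are hypothetical and much smaller than the data in the question; the idea is only that a column has one entry per row, while the split list has one entry per distinct key, and that `with()` reaches the columns without attaching anything.

```r
# Hypothetical toy data: 3 distinct Do values, 2 rows each
df <- data.frame(Do = rep(c(0, 50, 100), each = 2), Data = 1:6)

groups <- split(df$Data, df$Do)

length(df$Do)    # one entry per row of the data frame
length(groups)   # one entry per distinct value of Do
names(groups)    # the keys of the split list

# with() evaluates the expression inside df, so no attach() is needed
group_means <- with(df, tapply(Data, Do, mean))
group_means
```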

1 Answer


Please take a closer look at what you are giving mapply. Each object is of length 6144.

  > length(Do)
  [1] 6144
  > length(Cl)
  [1] 6144
  > length(In)
  [1] 6144
  > length(Sa)
  [1] 6144
  > 

You are giving mapply 6144 tuples and asking it to iterate over each.
It is giving you back a list of 6144 elements.

It is doing exactly what you are telling it to do.
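
To make the length behaviour concrete, here is a rough sketch. It assumes the same 6144-row design as the question, rebuilt with `expand.grid()` for brevity; `grid`, `make_name`, and `keys` are names introduced only for this illustration. mapply returns one result per tuple it is handed, so handing it only the distinct key combinations yields one result per group.

```r
doses <- c(0,2,4,6,8,10,15,20,30,40,45,50,55,60,65,70,80,85,90,92,94,96,98,100)

# Same combinations as the question's design: 16 replicates per key
grid <- expand.grid(rep = 1:16, Do = doses, Cl = c("K", "Y", "M", "C"),
                    Sa = 1:2, In = c("A", "S"), stringsAsFactors = FALSE)

# Hypothetical helper: builds a name in the "<Do>.<Cl>.<In>.<Sa>" pattern
make_name <- function(Do, Cl, In, Sa) paste(Do, Cl, In, Sa, sep = ".")

# One tuple per row in, one result per row out
all_rows <- mapply(make_name, grid$Do, grid$Cl, grid$In, grid$Sa,
                   SIMPLIFY = FALSE)
length(all_rows)

# Deduplicate the key columns first, then iterate over distinct keys only
keys <- unique(grid[c("Do", "Cl", "In", "Sa")])
per_group <- mapply(make_name, keys$Do, keys$Cl, keys$In, keys$Sa,
                    SIMPLIFY = FALSE)
length(per_group)
```

Deduplicating the inputs, rather than calling unique() on the output, avoids computing each result 16 times.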


Also, just copying and pasting your code yields a list 6144 long, each element containing 16 elements.

  .
  .
  [[6141]]
   [1]  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  1.421085e-14
  [12]  0.000000e+00  0.000000e+00  0.000000e+00 -1.421085e-14  0.000000e+00

  [[6142]]
   [1]  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  1.421085e-14
  [12]  0.000000e+00  0.000000e+00  0.000000e+00 -1.421085e-14  0.000000e+00

  [[6143]]
   [1]  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  1.421085e-14
  [12]  0.000000e+00  0.000000e+00  0.000000e+00 -1.421085e-14  0.000000e+00

  [[6144]]
   [1]  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  1.421085e-14
  [12]  0.000000e+00  0.000000e+00  0.000000e+00 -1.421085e-14  0.000000e+00

Therefore, not 6144 of 1 element as you describe.
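
If the goal is one dG result per group rather than one per row, something along these lines might work. This is only a sketch under assumptions: the data is rebuilt with `expand.grid()` and an arbitrary seed, the formula is copied from the question's dG, and `key_name`, `dG_one`, `keys`, and `dGSplit` are hypothetical names that do not appear in the original code.

```r
set.seed(1)  # assumption: any seed, for reproducibility only
doses <- c(0,2,4,6,8,10,15,20,30,40,45,50,55,60,65,70,80,85,90,92,94,96,98,100)
DataFrame <- expand.grid(rep = 1:16, Do = doses, Cl = c("K", "Y", "M", "C"),
                         Sa = 1:2, In = c("A", "S"), stringsAsFactors = FALSE)
DataFrame$Data <- rnorm(nrow(DataFrame))

# One list element per (Do, Cl, In, Sa) combination, 16 values each
DFSplit <- split(DataFrame$Data, DataFrame[c("Do", "Cl", "In", "Sa")])

# Build the element name directly instead of searching with grep()
key_name <- function(Do, Cl, In, Sa) paste(Do, Cl, In, Sa, sep = ".")

# The question's formula, applied to one group at a time
dG_one <- function(Do, Cl, In, Sa) {
  x    <- DFSplit[[key_name(Do,  Cl, In, Sa)]]
  x0   <- DFSplit[[key_name(0,   Cl, In, Sa)]]   # Do = 0 reference group
  x100 <- DFSplit[[key_name(100, Cl, In, Sa)]]   # Do = 100 reference group
  100 * (1 - 10^-(x - x0)) / (1 - 10^-(x100 - x0)) - Do
}

# Iterate over the distinct keys only, not over all rows
keys <- unique(DataFrame[c("Do", "Cl", "In", "Sa")])
dGSplit <- mapply(dG_one, keys$Do, keys$Cl, keys$In, keys$Sa, SIMPLIFY = FALSE)
names(dGSplit) <- key_name(keys$Do, keys$Cl, keys$In, keys$Sa)

length(dGSplit)          # one element per group
str(dGSplit[["2.C.A.1"]])
```

Each element of `dGSplit` holds 16 numbers, and the names match those of `DFSplit`, so results can be looked up by the same keys.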

You received two very good pieces of advice, one from @Arun and one from @Paul Hiemstra.

Perhaps you can try describing what it is you are attempting to accomplish, so folks here can better assist you. Also, please don't forget to look back on your previous questions and upvote and thank those who have given you helpful answers.

Ricardo Saporta
  • Thank you very much Paul, you are correct, I modified my post to fine tune my question – Ragy Isaac Mar 16 '13 at 15:40
  • Hi @RagyIsaac. To be perfectly honest, you might be better off starting a new question from scratch (you can link back to this one, or better yet, just delete it). Try phrasing it as: "(1) This is the result I want (2) This is what I am starting with (3) This is what I have tried (4) This is the problem I have encountered when trying it that way. (5, optional) These are other ideas I have tried, or would like to try but cannot quite figure out" The goal is to give just enough information so that others can help you, without having to give everyone homework just to understand the problem. – Ricardo Saporta Mar 16 '13 at 15:53
  • Hi Ricardo, I tried to delete it but I could not. I did re-post it. Thanks for your help. – Ragy Isaac Mar 16 '13 at 18:04