After much searching, I do not believe this question has been asked, and I figured someone here could find a faster work-around than I can.

I have a factor variable where each observation contains 29 nested numbers:

    dataframe$variable <- [1] -5.04849486  -4.17954852  -7.00735591 -3.93680666 -3.36135959  -5.89856992  -3.28262102  -4.95040133  -4.52287533  -6.55458896 -6.08302617  -5.45365319  -5.35542788  -6.49870823  -9.08663504 -10.83126787 -10.83991976 -10.96286352 -11.47911528 -11.45937234 -10.96341187 -12.33917811 -13.49123764 -13.09288624 -12.53887413 -12.66352061 -14.43587376 -14.96183082 -15.73399282
    [2] -6.69620919  -7.40672798  -8.3530468   -7.91598217  -7.83297636  -8.37460146  -8.73916205  -9.20744225  -9.3282853   -9.50299118  -9.92630917 -10.24373155 -10.49520522 -10.41014364 -10.25805992 -10.52294616 -11.27053953 -11.85528256 -12.62743692 -13.35299167 -13.25576965 -13.56397075 -13.70361862 -13.97438053 -14.24976232 -14.10028664 -14.06066972 -14.57621329 -15.45692947
    [3] -3.86805776  -2.57038981  -4.88910112  -3.82336021  -1.51641245  -4.19533412  -3.52909675  -3.86380061  -4.77176809  -4.84617525  -6.59760906  -7.02974036  -6.16868245  -6.74446232  -7.4624311  -7.93993982  -9.27617985 -10.12415032 -10.498118   -10.72502719 -10.71480081 -10.58232787 -11.24845809 -11.24984636 -10.72254205 -11.23331293 -12.7042161  -13.16813511 -14.49287153
etc. 

Now I need to perform some basic computations between each of the observations in the variable (e.g. diff <- [1] - [2]). So I do not want to unlist each number in the observation. They need to function as a single unit.

How can I convert each level of "variable" into a numeric vector so that I can compute the difference between observations? Add commas between each number? Force convert?

Edit: Asking for how to convert the data structure, not perform the calculation. I already have that part written.

MeC
  • do you want [1]-[2] then [1]-[3] or [2]-[3]??? – Onyambu Sep 17 '18 at 18:13
  • I already have a loop that will do the calculation. I just need each observation to be in a numeric structure so that the loop will perform. – MeC Sep 17 '18 at 18:20
  • can you provide a small sample of your data, so we can see the data structure. The approach to solve this question depends a lot on that structure. you can use the `dput` example here: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Matt L. Sep 17 '18 at 21:06

1 Answer

It's hard to see the exact data structure that you have (e.g. a list of 3 vectors per case?), but this approach, or a small modification of it, may work:

    library(tidyverse)
    dataframe <-
      dataframe %>%
      # extract each number as a string (one character vector per row)
      mutate(variable_sep = str_extract_all(variable, "[-0-9\\.]+")) %>%
      # convert each character vector to a numeric vector
      mutate(variable_numeric = map(variable_sep, ~as.numeric(.x)))

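The above solution uses "tidyverse" syntax. Here is the same thing without the pipe, using base R's `lapply` for the conversion (note that `str_extract_all` still comes from the `stringr` package):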

    library(stringr)  # str_extract_all() comes from stringr, not base R
    dataframe$variable_sep <- str_extract_all(dataframe$variable, "[-0-9\\.]+")
    dataframe$variable_numeric <- lapply(dataframe$variable_sep, FUN = as.numeric)

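If you would rather avoid the `stringr` dependency entirely, here is a minimal sketch of the same idea using only base R (`regmatches` plus `gregexpr`). The toy data frame and the three short strings are made up for illustration, and I am assuming each factor level is a single string of whitespace-separated numbers; adjust the pattern if your real data looks different.

```r
# Hypothetical toy data: a factor whose levels are space-separated numbers
dataframe <- data.frame(
  variable = factor(c("-5.05 -4.18 -7.01",
                      "-6.70 -7.41 -8.35",
                      "-3.87 -2.57 -4.89"))
)

# Factors store integer codes internally, so go through character first
chr <- as.character(dataframe$variable)

# Extract every (optionally negative) decimal number from each string
pieces <- regmatches(chr, gregexpr("-?[0-9]+\\.?[0-9]*", chr))

# One numeric vector per observation, stored as a list column
dataframe$variable_numeric <- lapply(pieces, as.numeric)

# Each observation now behaves as a single unit in arithmetic
diff_1_2 <- dataframe$variable_numeric[[1]] - dataframe$variable_numeric[[2]]
```

Use `[[i]]` rather than `[i]` when pulling out an observation: `[[i]]` gives you the numeric vector itself, while `[i]` gives a one-element list that arithmetic will not accept.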
Matt L.
  • what is the ~ notation before as.numeric? and what is x supposed to be? – MeC Sep 17 '18 at 20:46
  • It looks like your data will need to be extracted as a "list of vectors", and then you can apply a function to these vectors. My example uses `purrr::map`, which is basically a version of `lapply`. I'll add a base R version, which may be more understandable; a tutorial for `purrr` can be found here: https://jennybc.github.io/purrr-tutorial/ – Matt L. Sep 17 '18 at 21:13
  • the ~formula notation basically is saying to "apply this function (as.numeric)" to each element (.x) in the list of vectors. – Matt L. Sep 17 '18 at 21:18
  • note that `lapply` is just automating the "loop" process so that you don't need to write a loop. – Matt L. Sep 17 '18 at 21:24
  • that's great. thank you very much! And your 'lapply' syntax actually allowed me to take a for loop out elsewhere in my script. This has kept me puzzling all afternoon so thank you for your solution! – MeC Sep 17 '18 at 21:33
  • fantastic! it looks like your data structure may be fairly complex, and I recommend you read through the `purrr` tutorial I linked to above. Even if you prefer the base R version, it was great in helping me understand how to use lapply – Matt L. Sep 17 '18 at 21:40
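
To make the `~` question from the comments concrete, here is a small sketch showing that the formula shorthand, an explicit anonymous function, and plain `lapply` should all give the same result. The input list is a made-up stand-in for what `str_extract_all` returns, and the variable names are only for illustration.

```r
library(purrr)

# Hypothetical stand-in for the output of str_extract_all():
# a list with one character vector per observation
variable_sep <- list(c("-5.05", "-4.18"), c("-6.70", "-7.41"))

# purrr formula shorthand: `.x` stands for the current list element
via_formula <- map(variable_sep, ~ as.numeric(.x))

# The same thing written as an explicit anonymous function
via_function <- map(variable_sep, function(x) as.numeric(x))

# Base R equivalent, no purrr needed
via_lapply <- lapply(variable_sep, as.numeric)

# Compare the three results
identical(via_formula, via_function) && identical(via_function, via_lapply)
```

So `~ as.numeric(.x)` is just a compact way to write `function(x) as.numeric(x)`; which one you use is a matter of taste.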