I'm filtering my data by too many types to do it line by line. Is it possible to write a custom function that does it all at once, and if so, how?

I'm trying to build norms from my data, so I need the mean and standard deviation under many different conditions; there are about 27*6 conditions I may need to consider.

I'm currently using many `select()` calls to filter the data by the condition codes embedded in the column names, like this:

se_fn_F1 <- se_fn %>% select(ParticipantName,contains('F1'))
se_fn_F2 <- se_fn %>% select(ParticipantName,contains('F2'))
se_fn_F3 <- se_fn %>% select(ParticipantName,contains('F3'))
se_fn_F4 <- se_fn %>% select(ParticipantName,contains('F4'))
se_fn_B1 <- se_fn %>% select(ParticipantName,contains('B1'))
se_fn_B2 <- se_fn %>% select(ParticipantName,contains('B2'))
se_fn_B3 <- se_fn %>% select(ParticipantName,contains('B3'))
se_fn_B4 <- se_fn %>% select(ParticipantName,contains('B4'))
se_fn_B5 <- se_fn %>% select(ParticipantName,contains('B5'))
se_fn_B6 <- se_fn %>% select(ParticipantName,contains('B6'))

Obviously this is not a good way to do it, but I don't know how to do it with a custom function. Any advice would be appreciated.

I realized there is some information I should have provided. The original variable names look like "Single10_F2_FixationTime", which encodes the three categories the variable belongs to.
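
For illustration, a name of this form can be split on its underscores to recover the three parts. This is only a sketch: the second column name and the labels `set`, `code`, and `measure` are placeholders assumed here, not taken from the real data.

```r
# Hypothetical column names following the pattern described above
cols <- c("Single10_F2_FixationTime", "Single10_B1_FixationTime")

# Split each name on "_" and bind the pieces into a 3-column character matrix
parts <- do.call(rbind, strsplit(cols, "_", fixed = TRUE))

# Labels are assumptions chosen for this sketch
colnames(parts) <- c("set", "code", "measure")
parts
```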

計柏毅
  • Sounds like you should convert your data to long format with `tidyr::gather` or `reshape2::melt` and do grouped operations. Saving all those separate data frames seems like a terrible idea. – Gregor Thomas Apr 08 '19 at 15:19
  • It's hard to help without seeing what you are doing next... – Gregor Thomas Apr 08 '19 at 15:20
  • I've tried to clarify my question, hoping that would make this more understandable – 計柏毅 Apr 08 '19 at 15:30
  • Creating all these variables in your workspace is just going to make them ultimately harder to work with. It's easier to store related values in lists in R. But it depends on why you think you need all these variables. – MrFlick Apr 08 '19 at 15:51
  • I realize this is a problem too, but I can't think of a better way to do it. The categorical information in this data is all stored in the column names, while each row represents a subject. Maybe there is some way to use that to split them into a 27*6 data frame? – 計柏毅 Apr 08 '19 at 16:12
  • My first comment is a better way to do that. Take the information out of column names, put it in a single column instead. `gather` or `melt` do this for you. [Here's the R-FAQ on that](https://stackoverflow.com/q/2185252/903061) (suggested dupe). – Gregor Thomas Apr 08 '19 at 17:14
  • I am going to close as a dupe. Without sample data, I don't think anything can be done to answer this question in a better way. If you have trouble and need more help, please make your question reproducible by sharing an illustrative subset of your data. See tips on reproducibility [at this R-FAQ](https://stackoverflow.com/q/5963269/903061). Using `dput()` to share 5-10 rows of just a few columns (e.g., the contains 'F1' and the contains 'B2' groups) would be plenty. – Gregor Thomas Apr 08 '19 at 17:18
  • Sorry for the badly asked question. I got the result I needed with `reshape2::melt` and `separate()`. Thanks for the help. – 計柏毅 Apr 08 '19 at 18:17

1 Answer

Try this; I hope it gives you the result you were originally after.

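library(dplyr)
library(tidyr)

# Stack every measurement column into long format (assumes all non-ID columns are measurements)
stk_dta <- se_fn %>%
    gather(variable, value, -ParticipantName)

# One wide data frame per condition code (F1..F4, B1..B6), named by code
codes <- c(paste0("F", 1:4), paste0("B", 1:6))
result <- lapply(setNames(codes, codes), function(x) {
    stk_dta %>% filter(grepl(x, variable, fixed = TRUE)) %>% spread(variable, value)
})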
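
Since the stated goal is the mean and standard deviation per condition, a grouped summary on the long-format data may be simpler than keeping a list of data frames. The following is a minimal sketch, not the asker's actual data: the toy `se_fn`, its values, and the part names `set`, `code`, and `measure` are assumptions made for illustration.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for se_fn; column names and values are made up for illustration
se_fn <- data.frame(
  ParticipantName = c("p1", "p2", "p3"),
  Single10_F1_FixationTime = c(1, 2, 3),
  Single10_B2_FixationTime = c(4, 6, 8),
  stringsAsFactors = FALSE
)

norms <- se_fn %>%
  # Long format: one row per participant per original column
  gather(variable, value, -ParticipantName) %>%
  # Split e.g. "Single10_F1_FixationTime" into its three underscore-separated parts
  separate(variable, into = c("set", "code", "measure"), sep = "_") %>%
  # One summary row per combination of the three parts
  group_by(set, code, measure) %>%
  summarise(mean = mean(value, na.rm = TRUE),
            sd = sd(value, na.rm = TRUE)) %>%
  ungroup()

norms
```

Each row of `norms` then corresponds to one condition, so all 27*6 combinations would be covered by the same few lines instead of one `select()` per code.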
TheRimalaya