
I would like to subset a data frame based on the results of a test. For instance, I ran `CheckUnsystematic(dat = long, deltaq = 0.025, bounce = 0.1, reversals = 0, ncons0 = 2)`.

It gave me this:

    > CheckUnsystematic(dat = long, deltaq = 0.025, bounce = 0.1, reversals = 0, ncons0 = 2)
       id TotalPass DeltaQ DeltaQPass Bounce BouncePass Reversals ReversalsPass NumPosValues
    1   2         3 0.9089       Pass 0.0000       Pass         0          Pass           15
    2   3         3 0.6977       Pass 0.0000       Pass         0          Pass           16
    3   4         2 0.0000       Fail 0.0000       Pass         0          Pass           18
    4   5         3 0.2107       Pass 0.0000       Pass         0          Pass           18
    5   6         3 0.2346       Pass 0.0000       Pass         0          Pass           18
    6   7         3 0.9089       Pass 0.0000       Pass         0          Pass           16
    7   8         3 0.9622       Pass 0.0000       Pass         0          Pass           15
    8   9         3 0.8620       Pass 0.0000       Pass         0          Pass           11
    9  10         3 0.9089       Pass 0.0000       Pass         0          Pass           12
    10 11         3 0.9089       Pass 0.0000       Pass         0          Pass           11

I want to keep only the observations that have a "3" in "TotalPass".

I tried this: `CleanAPT <- long[which(long$TotalPass == 3), ]`

Dave Gruenewald
J Bacon
  • Looks fine, good job. (Not sure how `CheckUnsystematic()` is relevant... is that how you decided on `3`? Is it related to what you need help with? What *do* you need help with? Your code looks fine.) – Gregor Thomas Mar 14 '19 at 18:39
  • I apologize. I should have added that the new dataset, `CleanAPT`, shows 0 obs. of 3 variables. It isn't registering the `== 3` part. The dataset doesn't originally have a `TotalPass` column and doesn't show one after the test either. How do I get the test to either make a data frame or use its result to subset my data to only that occurrence? – J Bacon Mar 14 '19 at 18:44
  • Possible duplicate of [Filter data.frame rows by a logical condition](https://stackoverflow.com/questions/1686569/filter-data-frame-rows-by-a-logical-condition) – divibisan Mar 14 '19 at 18:48
  • It seems that maybe you are showing the output of `CheckUnsystematic()`, not the data frame you are trying to subset. Could you show what `long` looks like? Does it have a column named `TotalPass`? Maybe also include the package information for where `CheckUnsystematic` comes from; I'm not familiar with it... – Gregor Thomas Mar 14 '19 at 18:48
  • This is correct! That is exactly what I am doing. `long` looks like this:
        ## view the first 20 rows
        > knitr::kable(long[1:20, ])
        | id|     x|  y|
        |--:|-----:|--:|
        |  1|  0.00| NA|
        |  1|  0.25| NA|
        |  1|  0.50| NA|
        |  1|  1.00| NA|
        |  1|  1.50| NA|
        |  1|  2.00| NA|
        |  1|  2.50| NA|
        |  1|  3.00| NA|
        |  1|  4.00| NA|
        |  1|  5.00| NA|
        |  1|  6.00| NA|
        |  1|  7.00| NA|
        |  1|  8.00| NA|
        |  1|  9.00| NA|
        |  1| 10.00| NA|
        |  1| 12.00| NA|
        |  1| 15.00| NA|
        |  1| 20.00| NA|
        |  2|  0.00| 10|
        |  2|  0.25| 10|
    – J Bacon Mar 14 '19 at 18:54
  • It is from the package beezdemand for running behavioral economic tasks. – J Bacon Mar 14 '19 at 18:58

2 Answers

0

Since you did tag this as a dplyr question, let's use it:

    library(dplyr)

    check_df <- CheckUnsystematic(dat = long, deltaq = 0.025,
                                  bounce = 0.1, reversals = 0, ncons0 = 2)

    CleanAPT <- check_df %>%
      filter(TotalPass == 3)

The reason `CleanAPT <- long[which(long$TotalPass == 3), ]` does not work is that it subsets `long`, which `CheckUnsystematic()` leaves unmodified. `long` has no `TotalPass` column, so `long$TotalPass` is `NULL`, `which()` returns no indices, and the result has zero rows. In the above, I save the function's results to `check_df`, so `CleanAPT <- check_df[which(check_df$TotalPass == 3), ]` should work.
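
As a minimal sketch of why the original attempt comes back empty, here is the same pattern on small made-up data frames. The `long` and `check_df` objects below are hypothetical stand-ins, not real beezdemand output:

```r
# Hypothetical stand-ins: `long` has no TotalPass column; `check_df` does.
long <- data.frame(id = c(2L, 2L, 3L, 3L, 4L, 4L),
                   x  = c(0, 1, 0, 1, 0, 1),
                   y  = c(10, 8, 5, 4, 3, 3))
check_df <- data.frame(id = c(2L, 3L, 4L), TotalPass = c(3, 3, 2))

# `$` on a missing column returns NULL, so the comparison is logical(0),
# which() gives integer(0), and the subset has zero rows.
is.null(long$TotalPass)
empty <- long[which(long$TotalPass == 3), ]
nrow(empty)

# Subsetting the object that actually holds TotalPass works as intended.
CleanAPT <- check_df[which(check_df$TotalPass == 3), ]
CleanAPT$id
```

No error or warning is raised along the way, which is why the empty result can be surprising.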

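To get back to the original data, keep only the rows of `long` whose `id` appears in `CleanAPT`. It is difficult to say exactly how to do this without knowing the column names of `long`, so I am assuming an `id` column is present and uniquely identifies each subject. This can be done with a `semi_join` from dplyr: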

    long_filtered <- long %>%
      mutate(id = as.character(id)) %>%
      semi_join(CleanAPT %>%
                  mutate(id = as.character(id)),
                by = "id")
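
If you would rather avoid the join, a base R sketch of the same filtering step is below. It uses made-up stand-ins for `long` and `CleanAPT` (hypothetical values, with `id` deliberately stored as different types in the two data frames). `%in%` coerces both sides to a common type before matching, so it should not need the explicit `as.character()` conversion:

```r
# Hypothetical stand-ins for `long` and the filtered check results.
long <- data.frame(id = c(2L, 2L, 3L, 3L, 4L, 4L),
                   x  = c(0, 1, 0, 1, 0, 1),
                   y  = c(10, 8, 5, 4, 3, 3))
CleanAPT <- data.frame(id = c("2", "3"), TotalPass = c(3, 3))

# Keep only the rows of `long` whose id appears in CleanAPT.
# `%in%` coerces both sides to a common type before matching.
long_filtered <- long[long$id %in% CleanAPT$id, ]
unique(long_filtered$id)
```

Unlike the `mutate()` version, this leaves the `id` column of `long_filtered` in its original type.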
Dave Gruenewald
  • This works! However, I want the data as it was before the test. Would the best way to do this be to match and merge the ids from the `CleanAPT` and `long` data frames? – J Bacon Mar 14 '19 at 18:49
  • I think this is really close now, but I am getting the error `Error in semi_join_impl(x, y, by$x, by$y, check_na_matches(na_matches)) : Can't join on 'id' x 'id' because of incompatible types (integer / character)` – J Bacon Mar 14 '19 at 19:02
  • Made changes, although you may need to modify depending on what variables are available in `long` – Dave Gruenewald Mar 14 '19 at 19:07
  • These are the only ones in `long`:
        ## view the first 20 rows
        > knitr::kable(long[1:20, ])
        | id|     x|  y|
        |--:|-----:|--:|
        |  1|  0.00| NA|
        |  1|  0.25| NA|
        |  1|  0.50| NA|
        |  1|  1.00| NA|
        |  1|  1.50| NA|
        |  1|  2.00| NA|
        |  1|  2.50| NA|
        |  1|  3.00| NA|
        |  1|  4.00| NA|
        |  1|  5.00| NA|
        |  1|  6.00| NA|
        |  1|  7.00| NA|
        |  1|  8.00| NA|
        |  1|  9.00| NA|
        |  1| 10.00| NA|
        |  1| 12.00| NA|
        |  1| 15.00| NA|
        |  1| 20.00| NA|
        |  2|  0.00| 10|
        |  2|  0.25| 10|
    However, the `long` data frame isn't changing its number of observations after the `semi_join`. – J Bacon Mar 14 '19 at 19:11
  • Well, in your example, the rows with `id == 4` should be filtered out. `long` should stay the same, but `long_filtered` should be the output you're looking for. If you want to overwrite `long` instead, change `long_filtered <- long %>% ...` to `long <- long %>% ...` – Dave Gruenewald Mar 14 '19 at 19:17
  • Right! I am sorry that I am being confusing. I am not even getting a long_filtered df – J Bacon Mar 14 '19 at 19:18
  • Yessir.
        check_df <- CheckUnsystematic(dat = long, deltaq = 0.025, bounce = 0.1, reversals = 0, ncons0 = 2)
        Error: unexpected symbol in:
        " by = "id")
        check_df"
        >
        > CleanAPT <- check_df %>%
        +   filter(TotalPass == 3)
        >
        > long_filtered <- long %>%
        +   mutate(id = as.numeric(id)) %>%
        +   semi_join(CleanAPT %>%
        +               mutate(id = as.numeric(id),
        +             by = "id")
    – J Bacon Mar 14 '19 at 19:22
  • whoops! Forgot a `)`. Guess I shouldn't be coding without opening up R :) – Dave Gruenewald Mar 14 '19 at 19:26
  • I thought this much but I don't have the experience to correct anyone. Thank you so much! You have helped tremendously. – J Bacon Mar 14 '19 at 19:29
0

Try this with your long dataset.

    CleanAPT <- subset(long, TotalPass == 3)

    CheckUnsystematic(dat = CleanAPT, deltaq = 0.025, bounce = 0.1, reversals = 0, ncons0 = 2)
caszboy
  • I thought it was outputting your data as a data frame for the subset? If not, after you run `CheckUnsystematic()`, fortify the results using the ggfortify package, like so: `fortify(<the name you gave the CheckUnsystematic() output>)`. Once the data frame is created, `subset()` the data. – caszboy Mar 14 '19 at 19:12